TOAST versus toast

Started by Peter Smith, 12 months ago, 18 messages
#1 Peter Smith
smithpb2250@gmail.com
1 attachment(s)

Hi,

During some recent reviews, I came across some comments mentioning "toast" ...

TOAST is a PostgreSQL acronym for "The Oversized-Attribute Storage
Technique" [1].

But, toast is just toast [2].

~

AFAIK it is usual practice to uppercase acronyms to distinguish them
from ordinary words, but PostgreSQL currently has a scattered mixture
of "TOAST" versus "toast". Usage seems about 50:50.

Now that I have seen the problem I can't unsee it, and it is
everywhere, so here is a patch to address all the lowercase toast in
the documentation.

Note, for the unusual cases I have used the same wording as per the
original TOAST page [1], so:
- "toasted" becomes "TOASTed".
- "toastable" becomes "TOAST-able"
- "untoasted" becomes "un-TOASTed"
- "detoasted" is unchanged (and so is "detoast")
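
For illustration only (this is not part of the attached patch), the
wording rules above could be applied mechanically with something like
the Python sketch below. The helper name apply_v1_rules and the regular
expression are assumptions made up for this example. Identifiers such as
check_toast or toast_tuple_target, option names such as
--exclude-toast-pointers, and the "detoast" forms are deliberately left
alone.

```python
import re

# Hypothetical helper, not taken from the patch.  It matches lowercase
# "toast" as a standalone word, optionally with an "un"/"un-" prefix and
# an "able"/"-able"/"ed"/"ing" suffix.  The lookarounds reject matches
# glued to letters, digits, "_" or "-", so check_toast,
# toast_tuple_target, --exclude-toast-pointers and "detoasted" are
# never touched.
_TOAST_WORD = re.compile(r"(?<![\w-])(un-?)?toast(-?able|ed|ing)?(?![\w-])")


def _upcase(match):
    """Build the replacement using the v1 wording rules."""
    prefix = "un-" if match.group(1) else ""
    suffix = match.group(2) or ""
    if suffix.endswith("able"):
        suffix = "-able"  # "toastable" and "toast-able" both become "TOAST-able"
    return f"{prefix}TOAST{suffix}"


def apply_v1_rules(text):
    """Return text with the standalone lowercase forms upcased."""
    return _TOAST_WORD.sub(_upcase, text)


if __name__ == "__main__":
    print(apply_v1_rules("untoasted and toastable values in a toast table"))
```

A purely mechanical pass like this cannot tell prose from markup content
in every case, so its output would still need review by eye.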

~~~

There are many more "toast" examples found in the source code
comments, but I'll first wait to see if this patch is accepted before
looking to address those.

======
[1]: TOAST -- https://www.postgresql.org/docs/current/storage-toast.html
[2]: toast -- https://en.wikipedia.org/wiki/Toast_(food)

Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments:

v1-0001-TOAST-not-toast.patch (application/octet-stream)
From 366fae80bcef4ba69725a6ddcd2004c565bf884e Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 16 Jan 2025 14:44:30 +1100
Subject: [PATCH v1] TOAST not toast.

---
 doc/src/sgml/amcheck.sgml             | 10 +++++-----
 doc/src/sgml/bki.sgml                 |  2 +-
 doc/src/sgml/catalogs.sgml            |  6 +++---
 doc/src/sgml/logical-replication.sgml |  2 +-
 doc/src/sgml/logicaldecoding.sgml     |  2 +-
 doc/src/sgml/ref/alter_table.sgml     |  2 +-
 doc/src/sgml/ref/create_table.sgml    |  2 +-
 doc/src/sgml/ref/create_type.sgml     |  4 ++--
 doc/src/sgml/ref/pg_amcheck.sgml      | 12 ++++++------
 doc/src/sgml/sepgsql.sgml             |  2 +-
 doc/src/sgml/xfunc.sgml               |  2 +-
 doc/src/sgml/xtypes.sgml              |  2 +-
 12 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af0656..4974e9c 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -254,12 +254,12 @@ SET client_min_messages = DEBUG1;
        <term><literal>check_toast</literal></term>
        <listitem>
         <para>
-         If true, toasted values are checked against the target relation's
+         If true, TOASTed values are checked against the target relation's
          TOAST table.
         </para>
         <para>
-         This option is known to be slow.  Also, if the toast table or its
-         index is corrupt, checking it against toast values could conceivably
+         This option is known to be slow.  Also, if the TOAST table or its
+         index is corrupt, checking it against TOAST values could conceivably
          crash the server, although in many cases this would just produce an
          error.
         </para>
@@ -514,8 +514,8 @@ SET client_min_messages = DEBUG1;
   Relation pages which are correctly formatted, internally consistent, and
   correct relative to their own internal checksums may still contain
   logical corruption.  As such, this kind of corruption cannot be detected
-  with <application>checksums</application>.  Examples include toasted
-  values in the main table which lack a corresponding entry in the toast
+  with <application>checksums</application>.  Examples include TOASTed
+  values in the main table which lack a corresponding entry in the TOAST
   table, and tuples in the main table with a Transaction ID that is older
   than the oldest valid Transaction ID in the database or cluster.
  </para>
diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 3cd5bee..53a982b 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -1042,7 +1042,7 @@ $ perl  rewrite_dat_with_prokind.pl  pg_proc.dat
     </listitem>
     <listitem>
      <para>
-      Define indexes and toast tables.
+      Define indexes and TOAST tables.
      </para>
     </listitem>
     <listitem>
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 238ed67..63e8aa2 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1954,7 +1954,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para>
       <para>
        The OID of the data type that corresponds to this table's row type,
-       if any; zero for indexes, sequences, and toast tables, which have
+       if any; zero for indexes, sequences, and TOAST tables, which have
        no <structname>pg_type</structname> entry
       </para></entry>
      </row>
@@ -9434,7 +9434,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       <para>
        <structfield>typstorage</structfield> tells for varlena
        types (those with <structfield>typlen</structfield> = -1) if
-       the type is prepared for toasting and what the default strategy
+       the type is prepared for TOASTing and what the default strategy
        for attributes of this type should be.
        Possible values are:
        <itemizedlist>
@@ -9464,7 +9464,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
          </para>
         </listitem>
        </itemizedlist>
-       <literal>x</literal> is the usual choice for toast-able types.
+       <literal>x</literal> is the usual choice for TOAST-able types.
        Note that <literal>m</literal> values can also be moved out to
        secondary storage, but only as a last resort (<literal>e</literal>
        and <literal>x</literal> values are moved first).
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 8290cd1..1811617 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1738,7 +1738,7 @@ DETAIL:  <replaceable class="parameter">detailed_explanation</replaceable>.
          The <literal>remote tuple</literal> section includes the new tuple from
          the remote insert or update operation that caused the conflict. Note that
          for an update operation, the column value of the new tuple will be null
-         if the value is unchanged and toasted.
+         if the value is unchanged and TOASTed.
         </para>
        </listitem>
        <listitem>
diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index 1c4ae38..ea65a4a 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -1368,7 +1368,7 @@ commit_prepared_cb(...);  &lt;-- commit of the prepared transaction
     currently used for decoded changes) is selected and streamed.  However, in
     some cases we still have to spill to disk even if streaming is enabled
     because we exceed the memory threshold but still have not decoded the
-    complete tuple e.g., only decoded toast table insert but not the main table
+    complete tuple e.g., only decoded TOAST table insert but not the main table
     insert.
    </para>
 
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 938450f..0c462d2 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -825,7 +825,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 
      <para>
       <literal>SHARE UPDATE EXCLUSIVE</literal> lock will be taken for
-      fillfactor, toast and autovacuum storage parameters, as well as the
+      fillfactor, TOAST and autovacuum storage parameters, as well as the
       planner parameter <varname>parallel_workers</varname>.
      </para>
     </listitem>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 2237321..73fc8e3 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1592,7 +1592,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
      <para>
       The toast_tuple_target specifies the minimum tuple length required before
       we try to compress and/or move long column values into TOAST tables, and
-      is also the target length we try to reduce the length below once toasting
+      is also the target length we try to reduce the length below once TOASTing
       begins. This affects columns marked as External (for move),
       Main (for compression), or Extended (for both) and applies only to new
       tuples. There is no effect on existing rows.
diff --git a/doc/src/sgml/ref/create_type.sgml b/doc/src/sgml/ref/create_type.sgml
index 994dfc6..a491914 100644
--- a/doc/src/sgml/ref/create_type.sgml
+++ b/doc/src/sgml/ref/create_type.sgml
@@ -411,10 +411,10 @@ CREATE TYPE <replaceable class="parameter">name</replaceable>
   <para>
    All <replaceable class="parameter">storage</replaceable> values other
    than <literal>plain</literal> imply that the functions of the data type
-   can handle values that have been <firstterm>toasted</firstterm>, as described
+   can handle values that have been <firstterm>TOASTed</firstterm>, as described
    in <xref linkend="storage-toast"/> and <xref linkend="xtypes-toast"/>.
    The specific other value given merely determines the default TOAST
-   storage strategy for columns of a toastable data type; users can pick
+   storage strategy for columns of a TOAST-able data type; users can pick
    other strategies for individual columns using <literal>ALTER TABLE
    SET STORAGE</literal>.
   </para>
diff --git a/doc/src/sgml/ref/pg_amcheck.sgml b/doc/src/sgml/ref/pg_amcheck.sgml
index 6bfe287..ef2bdfd 100644
--- a/doc/src/sgml/ref/pg_amcheck.sgml
+++ b/doc/src/sgml/ref/pg_amcheck.sgml
@@ -41,7 +41,7 @@ PostgreSQL documentation
   </para>
 
   <para>
-   Only ordinary and toast table relations, materialized views, sequences, and
+   Only ordinary and TOAST table relations, materialized views, sequences, and
    btree indexes are currently supported.  Other relation types are silently
    skipped.
   </para>
@@ -276,7 +276,7 @@ PostgreSQL documentation
      <term><option>--no-dependent-toast</option></term>
      <listitem>
       <para>
-       By default, if a table is checked, its toast table, if any, will also
+       By default, if a table is checked, its TOAST table, if any, will also
        be checked, even if it is not explicitly selected by an option
        such as <literal>--table</literal> or <literal>--relation</literal>.
        This option suppresses that behavior.
@@ -306,9 +306,9 @@ PostgreSQL documentation
      <term><option>--exclude-toast-pointers</option></term>
      <listitem>
       <para>
-       By default, whenever a toast pointer is encountered in a table,
+       By default, whenever a TOAST pointer is encountered in a table,
        a lookup is performed to ensure that it references apparently-valid
-       entries in the toast table. These checks can be quite slow, and this
+       entries in the TOAST table. These checks can be quite slow, and this
        option can be used to skip them.
       </para>
      </listitem>
@@ -368,9 +368,9 @@ PostgreSQL documentation
        End checking at the specified block number.  An error will occur if the
        table relation being checked has fewer than this number of blocks.
        This option does not apply to indexes, and is probably only useful when
-       checking a single table relation. If both a regular table and a toast
+       checking a single table relation. If both a regular table and a TOAST
        table are checked, this option will apply to both, but higher-numbered
-       toast blocks may still be accessed while validating toast pointers,
+       TOAST blocks may still be accessed while validating TOAST pointers,
        unless that is suppressed using
        <option>--exclude-toast-pointers</option>.
       </para>
diff --git a/doc/src/sgml/sepgsql.sgml b/doc/src/sgml/sepgsql.sgml
index ca038d7..eab5fde 100644
--- a/doc/src/sgml/sepgsql.sgml
+++ b/doc/src/sgml/sepgsql.sgml
@@ -433,7 +433,7 @@ UPDATE t1 SET x = 2, y = func1(y) WHERE z = 100;
    <para>
     The default database privilege system allows database superusers to
     modify system catalogs using DML commands, and reference or modify
-    toast tables.  These operations are prohibited when
+    TOAST tables.  These operations are prohibited when
     <filename>sepgsql</filename> is enabled.
    </para>
   </sect3>
diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index af7864a..69747d0 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -2555,7 +2555,7 @@ CREATE FUNCTION concat_text(text, text) RETURNS text
      to be just pointless obscurantism, compared to using
      plain <literal>C</literal> calling conventions.  They do however allow
      us to deal with <literal>NULL</literal>able arguments/return values,
-     and <quote>toasted</quote> (compressed or out-of-line) values.
+     and <quote>TOASTed</quote> (compressed or out-of-line) values.
     </para>
 
     <para>
diff --git a/doc/src/sgml/xtypes.sgml b/doc/src/sgml/xtypes.sgml
index e67e5bd..13af113 100644
--- a/doc/src/sgml/xtypes.sgml
+++ b/doc/src/sgml/xtypes.sgml
@@ -267,7 +267,7 @@ CREATE TYPE complex (
 
  <para>
   To support <acronym>TOAST</acronym> storage, the C functions operating on the data
-  type must always be careful to unpack any toasted values they are handed
+  type must always be careful to unpack any TOASTed values they are handed
   by using <function>PG_DETOAST_DATUM</function>.  (This detail is customarily hidden
   by defining type-specific <function>GETARG_DATATYPE_P</function> macros.)
   Then, when running the <command>CREATE TYPE</command> command, specify the
-- 
1.8.3.1

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Smith (#1)
Re: TOAST versus toast

Peter Smith <smithpb2250@gmail.com> writes:

> During some recent reviews, I came across some comments mentioning "toast" ...
> TOAST is a PostgreSQL acronym for "The Oversized-Attribute Storage
> Technique" [1].

It is indeed an acronym, but usages such as "toasting" are all over
our code and docs, as you see. I question whether changing that
to "TOASTing" improves readability. I agree that consistently
saying "TOAST table" not "toast table" is a good idea, but I'm
not quite convinced that removing every last lower-case occurrence
is a win, especially in these combined forms.

> - "toasted" becomes "TOASTed".
> - "toastable" becomes "TOAST-able"

Those two choices seem inconsistent...

> - "untoasted" becomes "un-TOASTed"
> - "detoasted" is unchanged (and so is "detoast")

Hm, there seems a risk of confusion between "not toasted" (a
statement of fact about the contents of a Datum) versus "detoasting"
(the act of expanding a toasted datum to full form). I'd prefer
to say "not toasted" than "untoasted" because the latter feels like
it could also mean "detoasted". (And as I write this para, I'm
having a hard time wanting to upcase the words, which reinforces
my doubts about s/toast/TOAST/g.)

regards, tom lane

#3 Peter Smith
smithpb2250@gmail.com
In reply to: Tom Lane (#2)
1 attachment(s)
Re: TOAST versus toast

On Thu, Jan 16, 2025 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Peter Smith <smithpb2250@gmail.com> writes:
>
>> During some recent reviews, I came across some comments mentioning "toast" ...
>> TOAST is a PostgreSQL acronym for "The Oversized-Attribute Storage
>> Technique" [1].
>
> It is indeed an acronym, but usages such as "toasting" are all over
> our code and docs, as you see. I question whether changing that
> to "TOASTing" improves readability. I agree that consistently
> saying "TOAST table" not "toast table" is a good idea, but I'm
> not quite convinced that removing every last lower-case occurrence
> is a win, especially in these combined forms.

Hi, thanks for the reply.

How about I reduce the scope by only tackling the uncontroversial
stuff, and leave all those "combined forms" for another day?

Attached is the reduced patch for changes to the documentation.

>> - "toasted" becomes "TOASTed".
>> - "toastable" becomes "TOAST-able"
>
> Those two choices seem inconsistent...
>
>> - "untoasted" becomes "un-TOASTed"
>> - "detoasted" is unchanged (and so is "detoast")
>
> Hm, there seems a risk of confusion between "not toasted" (a
> statement of fact about the contents of a Datum) versus "detoasting"
> (the act of expanding a toasted datum to full form). I'd prefer
> to say "not toasted" than "untoasted" because the latter feels like
> it could also mean "detoasted". (And as I write this para, I'm
> having a hard time wanting to upcase the words, which reinforces
> my doubts about s/toast/TOAST/g.)

======
Kind Regards,
Peter Smith.
Fujitsu Australia

Attachments:

v2-0001-TOAST-not-toast.patch (application/octet-stream)
From b41122b4f26359c429f2e320f0cdd0fd85314465 Mon Sep 17 00:00:00 2001
From: Peter Smith <peter.b.smith@fujitsu.com>
Date: Thu, 16 Jan 2025 16:29:47 +1100
Subject: [PATCH v2] TOAST not toast.

---
 doc/src/sgml/amcheck.sgml         |  4 ++--
 doc/src/sgml/bki.sgml             |  2 +-
 doc/src/sgml/catalogs.sgml        |  2 +-
 doc/src/sgml/logicaldecoding.sgml |  2 +-
 doc/src/sgml/ref/alter_table.sgml |  2 +-
 doc/src/sgml/ref/pg_amcheck.sgml  | 12 ++++++------
 doc/src/sgml/sepgsql.sgml         |  2 +-
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/doc/src/sgml/amcheck.sgml b/doc/src/sgml/amcheck.sgml
index 3af0656..da0d78c 100644
--- a/doc/src/sgml/amcheck.sgml
+++ b/doc/src/sgml/amcheck.sgml
@@ -258,8 +258,8 @@ SET client_min_messages = DEBUG1;
          TOAST table.
         </para>
         <para>
-         This option is known to be slow.  Also, if the toast table or its
-         index is corrupt, checking it against toast values could conceivably
+         This option is known to be slow.  Also, if the TOAST table or its
+         index is corrupt, checking it against TOAST values could conceivably
          crash the server, although in many cases this would just produce an
          error.
         </para>
diff --git a/doc/src/sgml/bki.sgml b/doc/src/sgml/bki.sgml
index 3cd5bee..53a982b 100644
--- a/doc/src/sgml/bki.sgml
+++ b/doc/src/sgml/bki.sgml
@@ -1042,7 +1042,7 @@ $ perl  rewrite_dat_with_prokind.pl  pg_proc.dat
     </listitem>
     <listitem>
      <para>
-      Define indexes and toast tables.
+      Define indexes and TOAST tables.
      </para>
     </listitem>
     <listitem>
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 238ed67..4c65686 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1954,7 +1954,7 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
       </para>
       <para>
        The OID of the data type that corresponds to this table's row type,
-       if any; zero for indexes, sequences, and toast tables, which have
+       if any; zero for indexes, sequences, and TOAST tables, which have
        no <structname>pg_type</structname> entry
       </para></entry>
      </row>
diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index 1c4ae38..ea65a4a 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -1368,7 +1368,7 @@ commit_prepared_cb(...);  &lt;-- commit of the prepared transaction
     currently used for decoded changes) is selected and streamed.  However, in
     some cases we still have to spill to disk even if streaming is enabled
     because we exceed the memory threshold but still have not decoded the
-    complete tuple e.g., only decoded toast table insert but not the main table
+    complete tuple e.g., only decoded TOAST table insert but not the main table
     insert.
    </para>
 
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 938450f..0c462d2 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -825,7 +825,7 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 
      <para>
       <literal>SHARE UPDATE EXCLUSIVE</literal> lock will be taken for
-      fillfactor, toast and autovacuum storage parameters, as well as the
+      fillfactor, TOAST and autovacuum storage parameters, as well as the
       planner parameter <varname>parallel_workers</varname>.
      </para>
     </listitem>
diff --git a/doc/src/sgml/ref/pg_amcheck.sgml b/doc/src/sgml/ref/pg_amcheck.sgml
index 6bfe287..ef2bdfd 100644
--- a/doc/src/sgml/ref/pg_amcheck.sgml
+++ b/doc/src/sgml/ref/pg_amcheck.sgml
@@ -41,7 +41,7 @@ PostgreSQL documentation
   </para>
 
   <para>
-   Only ordinary and toast table relations, materialized views, sequences, and
+   Only ordinary and TOAST table relations, materialized views, sequences, and
    btree indexes are currently supported.  Other relation types are silently
    skipped.
   </para>
@@ -276,7 +276,7 @@ PostgreSQL documentation
      <term><option>--no-dependent-toast</option></term>
      <listitem>
       <para>
-       By default, if a table is checked, its toast table, if any, will also
+       By default, if a table is checked, its TOAST table, if any, will also
        be checked, even if it is not explicitly selected by an option
        such as <literal>--table</literal> or <literal>--relation</literal>.
        This option suppresses that behavior.
@@ -306,9 +306,9 @@ PostgreSQL documentation
      <term><option>--exclude-toast-pointers</option></term>
      <listitem>
       <para>
-       By default, whenever a toast pointer is encountered in a table,
+       By default, whenever a TOAST pointer is encountered in a table,
        a lookup is performed to ensure that it references apparently-valid
-       entries in the toast table. These checks can be quite slow, and this
+       entries in the TOAST table. These checks can be quite slow, and this
        option can be used to skip them.
       </para>
      </listitem>
@@ -368,9 +368,9 @@ PostgreSQL documentation
        End checking at the specified block number.  An error will occur if the
        table relation being checked has fewer than this number of blocks.
        This option does not apply to indexes, and is probably only useful when
-       checking a single table relation. If both a regular table and a toast
+       checking a single table relation. If both a regular table and a TOAST
        table are checked, this option will apply to both, but higher-numbered
-       toast blocks may still be accessed while validating toast pointers,
+       TOAST blocks may still be accessed while validating TOAST pointers,
        unless that is suppressed using
        <option>--exclude-toast-pointers</option>.
       </para>
diff --git a/doc/src/sgml/sepgsql.sgml b/doc/src/sgml/sepgsql.sgml
index ca038d7..eab5fde 100644
--- a/doc/src/sgml/sepgsql.sgml
+++ b/doc/src/sgml/sepgsql.sgml
@@ -433,7 +433,7 @@ UPDATE t1 SET x = 2, y = func1(y) WHERE z = 100;
    <para>
     The default database privilege system allows database superusers to
     modify system catalogs using DML commands, and reference or modify
-    toast tables.  These operations are prohibited when
+    TOAST tables.  These operations are prohibited when
     <filename>sepgsql</filename> is enabled.
    </para>
   </sect3>
-- 
1.8.3.1

#4 David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Smith (#3)
Re: TOAST versus toast

On Wed, Jan 15, 2025 at 10:38 PM Peter Smith <smithpb2250@gmail.com> wrote:

> On Thu, Jan 16, 2025 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Peter Smith <smithpb2250@gmail.com> writes:
>>
>>> During some recent reviews, I came across some comments mentioning "toast" ...
>>> TOAST is a PostgreSQL acronym for "The Oversized-Attribute Storage
>>> Technique" [1].
>>
>> It is indeed an acronym, but usages such as "toasting" are all over
>> our code and docs, as you see. I question whether changing that
>> to "TOASTing" improves readability. I agree that consistently
>> saying "TOAST table" not "toast table" is a good idea, but I'm
>> not quite convinced that removing every last lower-case occurrence
>> is a win, especially in these combined forms.

I'm not particularly convinced that "TOAST table" is a good idea; but I
don't hate it either.

TOAST is a "technique", design feature, algorithm, process. When referring
to that concept, using TOAST makes sense. The implementation artifacts are
conveniently labelled e.g., "toast tables", and can be used in the same
capitalization that one would write "foreign table" or "temporary table".
Sure, we can define our made-up label as "TOAST tables" but it just makes
it stand out unnecessarily in comparison to "temporary tables" and the like.

I'd be more interested in making sure all TOAST references are in regards
to the technique and lower-case the ones that aren't.

David J.

#5 Robert Treat
rob@xzilla.net
In reply to: David G. Johnston (#4)
Re: TOAST versus toast

On Mon, Feb 17, 2025 at 6:27 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:

> On Wed, Jan 15, 2025 at 10:38 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
>> On Thu, Jan 16, 2025 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>>> Peter Smith <smithpb2250@gmail.com> writes:
>>>
>>>> During some recent reviews, I came across some comments mentioning "toast" ...
>>>> TOAST is a PostgreSQL acronym for "The Oversized-Attribute Storage
>>>> Technique" [1].
>>>
>>> It is indeed an acronym, but usages such as "toasting" are all over
>>> our code and docs, as you see. I question whether changing that
>>> to "TOASTing" improves readability. I agree that consistently
>>> saying "TOAST table" not "toast table" is a good idea, but I'm
>>> not quite convinced that removing every last lower-case occurrence
>>> is a win, especially in these combined forms.

I took a look at this a few weeks ago and couldn't get excited about
it. It does seem to me that the cases where we use TOAST as a verb are
more readable when done in lower case, and this is pretty common in
everyday English/grammar; as an example, people would generally write
"the dr. lasered the tumor" not "the dr. LASERed the tumor". So I am
+1 on the idea of not uppercasing these instances, but the flip side
"should we ensure we are lower casing them" is interesting... we
usually do, but there are a few cases where we don't (typically where
they have been labeled as acronyms). The documentation on
pg_column_toast_chunk_id is a good example:

Shows the <structfield>chunk_id</structfield> of an on-disk
<acronym>TOAST</acronym>ed value. Returns <literal>NULL</literal>
if the value is un-<acronym>TOAST</acronym>ed or not on-disk. See
<xref linkend="storage-toast"/> for more information about
<acronym>TOAST</acronym>.

> I'm not particularly convinced that "TOAST table" is a good idea; but I don't hate it either.
>
> TOAST is a "technique", design feature, algorithm, process. When referring to that concept, using TOAST makes sense. The implementation artifacts are conveniently labelled e.g., "toast tables", and can be used in the same capitalization that one would write "foreign table" or "temporary table". Sure, we can define our made-up label as "TOAST tables" but it just makes it stand out unnecessarily in comparison to "temporary tables" and the like.
>
> I'd be more interested in making sure all TOAST references are in regards to the technique and lower-case the ones that aren't.

I kind of wondered about this, because I felt pretty used to seeing
the term "TOAST table", so I did some quick searches, and it looks
like we have about 20 cases where we use TOAST table vs about 10 where
we use toast table, specifically focusing on cases where we don't add
any markup to the word "toast", and about 20 more where we use
"<acronym>TOAST</acronym> table". So ISTM that folks are probably used
to seeing the term with upper case, but not universally so... so I
could probably get onboard with David's suggestion, although tbh I
probably would lean the other way.
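
For anyone wanting to reproduce this kind of tally, a rough Python sketch
follows. It is an assumption-laden illustration, not the search actually
used for the numbers above: the function name count_toast_table_forms,
the category labels, and the regular expressions are all made up for this
example, and the counts it produces will depend on the tree being
scanned.

```python
import re
from collections import Counter

# Hypothetical patterns for the three spellings discussed in this thread.
# \s+ lets a match span a line break, since the docs wrap mid-phrase.
# The lookbehind skips identifiers such as pg_toast or storage-toast.
PATTERNS = {
    "acronym markup": re.compile(r"<acronym>TOAST</acronym>\s+tables?\b"),
    "plain upper": re.compile(r"(?<![\w-])TOAST\s+tables?\b"),
    "plain lower": re.compile(r"(?<![\w-])toast\s+tables?\b"),
}


def count_toast_table_forms(text):
    """Count each spelling of "TOAST table" found in one document's text."""
    return Counter(
        {label: len(pattern.findall(text)) for label, pattern in PATTERNS.items()}
    )
```

Feeding it the text of each file under doc/src/sgml and summing the
resulting Counter objects would give a per-spelling total for the docs.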

Robert Treat
https://xzilla.net

#6 Robert Haas
robertmhaas@gmail.com
In reply to: Robert Treat (#5)
Re: TOAST versus toast

On Fri, Mar 7, 2025 at 11:24 AM Robert Treat <rob@xzilla.net> wrote:

> everyday English/grammar; as an example, people would generally write
> "the dr. lasered the tumor" not "the dr. LASERed the tumor".

For the record, I wouldn't write either of those things if I wanted to
be certain of being understood. Using acronyms as verbs is inherently
fraught: it supposes that the reader both understands the acronym in
general and is able to pick up on what you're doing with it. If I say
that somebody got swatted, for example, you could either fail to know
what a SWAT team is (which I imagine is quite plausible in a
non-American context) or you could think that I just meant that they
were struck lightly with a rolled-up newspaper. Writing SWATted
instead of swatted makes it clear that an acronym was intended, but
you still have to know what the acronym means in order to understand
the sentence.

And, to me, that's the root of the issue here. Some of the
documentation references to toasting, detoasting, etc. are in sections
that specifically define that mechanism, but some are not. In
particular I see that a reference to "detoasted" has crept into the
ALTER TABLE documentation, a state of affairs that is very possibly my
fault. That kind of thing is probably always going to be a mess no
matter how you capitalize it, because the reader may not know the
term. You could link to the definition, but rewording the sentence is
often going to be even better. For example, in the specific context
where this is used in the ALTER TABLE documentation, "decompressed"
would be just as accurate as "detoasted" and easier to understand.

--
Robert Haas
EDB: http://www.enterprisedb.com

#7 Peter Smith
smithpb2250@gmail.com
In reply to: Peter Smith (#3)
Re: TOAST versus toast

Hi,

If I understand correctly, the summary is:
- Tom: +1 for "TOAST table", but changing all the combined forms is
maybe not worth the effort.
- DavidJ: Wants to uppercase TOAST only when it refers to 'technique';
lowercase otherwise.
- RobertT: The verbs should be lowercase (e.g. laser). Each-way bet re
David's technique idea.
- RobertH: Don't lowercase verbs, but instead try to rewrite these
differently where possible.

~~

This thread seems to have exposed a lot of different opinions. I guess
that's the reason why the docs/comments got to be how they are now --
e.g. everybody wrote what they believed was correct, but each person's
idea of correct differs from the next person's.

BTW, this thread was not created because of any particular confusion
it was causing (although I am sure there are some confusing examples
to be found). It was more just that during reviews I kept seeing there
was no consistent use of toast v TOAST even in the same file/function.
It was this inconsistency that was annoying and prompted this thread.

But, because of all the differing views expressed here I'm not sure
now how to proceed. Any ideas?

I think everyone would agree that inconsistency is bad, so it becomes
a question then what if anything should be done about it. My plan was
to just come up with some fixed rules for mechanical changes (e.g.
"Always say TOAST table" or whatever). I know that may not always
result in the perfect choice, but IMO having some simple/fixed rules
for a code monkey to apply might be more prudent than rules requiring
subjective interpretations (e.g. will two people ever agree what is a
'technique' and what is not?) which would end up not addressing the
consistency issue. Also, I agree that just rewriting text would be the
best choice in some cases but probably there are many dozens of
candidates so getting consensus on all of those rewrites will be like
herding cats.
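
A minimal sketch of what one such fixed rule could look like when applied
mechanically, assuming the rule is "always say TOAST table" and that
combined forms and identifiers are left alone. The pattern, the function
name apply_rule, and the sample sentence are all hypothetical; this is
not the patch itself.

```python
import re

# Hypothetical rule: uppercase "toast" only in the standalone phrase
# "toast table(s)". Combined forms (toasted, detoast) and identifiers
# (pg_toast) are deliberately left untouched.
TOAST_TABLE = re.compile(r"(?<![\w-])toast(?=\s+tables?\b)")


def apply_rule(text: str) -> str:
    """Return text with 'toast table' rewritten as 'TOAST table'."""
    return TOAST_TABLE.sub("TOAST", text)


sample = "The toast table holds toasted values; see pg_toast and detoast."
print(apply_rule(sample))
```

The lookbehind keeps the rule from touching "pg_toast" or "detoast", and
the lookahead restricts it to the one phrase the rule names, so nothing
here requires a judgment about what counts as a 'technique'. The match is
case-sensitive, so a sentence-initial "Toast table" would need its own rule.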

Meanwhile, I've moved this CF entry into the next commitfest, because
I don't see how this thread can get resolved by the end of the month.

======
Kind Regards,
Peter Smith.
Fujitsu Australia

#8Robert Haas
robertmhaas@gmail.com
In reply to: Peter Smith (#7)
Re: TOAST versus toast

On Sun, Mar 16, 2025 at 7:38 PM Peter Smith <smithpb2250@gmail.com> wrote:

If I understand correctly, the summary is:
- Tom: +1 for "TOAST table", but changing all the combined forms is
maybe not worth the effort.
- DavidJ: Wants to uppercase TOAST only when it refers to 'technique';
lowercase otherwise.
- RobertT: The verbs should be lowercase (e.g. laser). Each-way bet re
David's technique idea.
- RobertH: Don't lowercase verbs, but instead try to rewrite these
differently where possible.

I'm not sure I agree with this summary of my position. I'm against
TOASTed, TOAST-able, and un-TOASTed, and in fact it seems to me that
nobody else who has commented on this proposal likes those either. It
seems to me that the idea of upper-casing TOAST where it stands alone
as a separate word may have some support, although not everyone who
has commented wants to do it in every situation and nobody seems to
think it is super-important. But as far as I can see, nobody other
than you is a fan of doing it when a prefix or suffix has been added.
I don't mean to suggest that your opinion is unimportant, just that,
in this case, it doesn't seem to have attracted any support from
others.

So I would suggest that you either:

(1) drop this patch, or perhaps
(2) cut it down to something that just changes some or all usages of
TOAST without prefix or suffix and leaves everything else alone, or
perhaps
(3) do (2) but also add some rewording to (3a) avoid needing to use
prefixed or suffixed forms or (3b) to avoid using TOAST altogether.

I really don't think you're going to get consensus on capitalizing the
letters TOAST someplace in the middle of a word. I mean, there's
probably precedent both ways. You get tasered by the police, not
TASERed by the police; but I think you would write that you were
SMSing with a colleague rather than smsing with a colleague. But as
you say, "everybody wrote what they believe is correct," so there is
probably not going to be support for radically upending our existing
conventions, and deTOASTing is definitely a minority position. If you
really want to change something, getting rid of the few instances of
minority positions like that might be palatable, but something that
involves replacing a lot of the forms people chose with other forms
seems less likely to achieve consensus.

The alternative of just not worrying about it too much also seems to
have some merit. As you say, you weren't actually confused, just
irritated by the inconsistency; and spending effort on things that are
more irritating than serious is not always the right thing to do.

--
Robert Haas
EDB: http://www.enterprisedb.com

#9Isaac Morland
isaac.morland@gmail.com
In reply to: Peter Smith (#7)
Re: TOAST versus toast

On Sun, 16 Mar 2025 at 19:38, Peter Smith <smithpb2250@gmail.com> wrote:

But, because of all the differing views expressed here I'm not sure
now how to proceed. Any ideas?

May I suggest that you start with a patch to Appendix J, section 6 to
codify whatever is decided?

https://www.postgresql.org/docs/current/docguide-style.html

This is made a bit awkward because right now the style guide only has one
subsection, relating to reference page organization. So essentially I'm
suggesting an entirely new subsection which could eventually cover things
like capitalization and which grammatical forms to prefer, and that you
start with the toast/TOAST rules. Once you have at least one rule agreed
and added to the style guide, then a patch to revise existing examples of
contrary usage would be in my opinion a more clear win than it is now.

The above makes more sense to me if there are other questions of this
general nature that could benefit from an explicit mention in a style
guide, even if this patch wouldn't do that. If this is the only question
like this, then it looks a bit weird to add a whole section just for it.
But I lean towards the idea that over time there might be a number of
decisions of this nature that ought to be made and followed consistently.

#10Peter Smith
smithpb2250@gmail.com
In reply to: Robert Haas (#8)
Re: TOAST versus toast

On Mon, Mar 17, 2025 at 1:50 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Mar 16, 2025 at 7:38 PM Peter Smith <smithpb2250@gmail.com> wrote:

If I understand correctly, the summary is:
- Tom: +1 for "TOAST table", but changing all the combined forms is
maybe not worth the effort.
- DavidJ: Wants to uppercase TOAST only when it refers to 'technique';
lowercase otherwise.
- RobertT: The verbs should be lowercase (e.g. laser). Each-way bet re
David's technique idea.
- RobertH: Don't lowercase verbs, but instead try to rewrite these
differently where possible.

I'm not sure I agree with this summary of my position. I'm against
TOASTed, TOAST-able, and un-TOASTed, and in fact it seems to me that
nobody else who has commented on this proposal likes those either. It
seems to me that the idea of upper-casing TOAST where it stands alone
as a separate word may have some support, although not everyone who
has commented wants to do it in every situation and nobody seems to
think it is super-important. But as far as I can see, nobody other
than you is a fan of doing it when a prefix or suffix has been added.
I don't mean to suggest that your opinion is unimportant, just that,
in this case, it doesn't seem to have attracted any support from
others.

Sorry if I've misrepresented your position. And, just for the record,
I'm not "a fan of doing it [capitalizing] when a prefix or suffix has
been added". I know in earlier posts I may have suggested doing that,
but that was me trying to be consistent with usage on the docs page
[1]: https://www.postgresql.org/docs/17/storage-toast.html
related words.

So I would suggest that you either:

(1) drop this patch, or perhaps
(2) cut it down to something that just changes some or all usages of
TOAST without prefix or suffix and leaves everything else alone, or
perhaps
(3) do (2) but also add some rewording to (3a) avoid needing to use
prefixed or suffixed forms or (3b) to avoid using TOAST altogether.

I really don't think you're going to get consensus on capitalizing the
letters TOAST someplace in the middle of a word. I mean, there's
probably precedent both ways. You get tasered by the police, not
TASERed by the police; but I think you would write that you were
SMSing with a colleague rather than smsing with a colleague. But as
you say, "everybody wrote what they believe is correct," so there is
probably not going to be support for radically upending our existing
conventions, and deTOASTing is definitely a minority position. If you
really want to change something, getting rid of the few instances of
minority positions like that might be palatable, but something that
involves replacing a lot of the forms people chose with other forms
seems less likely to achieve consensus.

Thanks for your suggestions. At this point option (1) is looking most
attractive. Probably, I will just withdraw the CF entry soon unless
there is some new interest. Just chipping away fixing a few places
isn't going to achieve the consistency this thread was aiming for.

The alternative of just not worrying about it too much also seems to
have some merit. As you say, you weren't actually confused, just
irritated by the inconsistency; and spending effort on things that are
more irritating than serious is not always the right thing to do.

Yes, as I am learning.

======
[1]: https://www.postgresql.org/docs/17/storage-toast.html

Kind Regards,
Peter Smith.
Fujitsu Australia

#11Jan Wieck
jan@wi3ck.info
In reply to: Robert Haas (#8)
Re: TOAST versus toast

As the original author of TOAST, I vote for TOAST being used as the
name/acronym of the feature, but toast in all other cases, such as the verb.

Best Regards, Jan

On 3/16/25 22:49, Robert Haas wrote:

On Sun, Mar 16, 2025 at 7:38 PM Peter Smith <smithpb2250@gmail.com> wrote:

If I understand correctly, the summary is:
- Tom: +1 for "TOAST table", but changing all the combined forms is
maybe not worth the effort.
- DavidJ: Wants to uppercase TOAST only when it refers to 'technique';
lowercase otherwise.
- RobertT: The verbs should be lowercase (e.g. laser). Each-way bet re
David's technique idea.
- RobertH: Don't lowercase verbs, but instead try to rewrite these
differently where possible.

I'm not sure I agree with this summary of my position. I'm against
TOASTed, TOAST-able, and un-TOASTed, and in fact it seems to me that
nobody else who has commented on this proposal likes those either. It
seems to me that the idea of upper-casing TOAST where it stands alone
as a separate word may have some support, although not everyone who
has commented wants to do it in every situation and nobody seems to
think it is super-important. But as far as I can see, nobody other
than you is a fan of doing it when a prefix or suffix has been added.
I don't mean to suggest that your opinion is unimportant, just that,
in this case, it doesn't seem to have attracted any support from
others.

So I would suggest that you either:

(1) drop this patch, or perhaps
(2) cut it down to something that just changes some or all usages of
TOAST without prefix or suffix and leaves everything else alone, or
perhaps
(3) do (2) but also add some rewording to (3a) avoid needing to use
prefixed or suffixed forms or (3b) to avoid using TOAST altogether.

I really don't think you're going to get consensus on capitalizing the
letters TOAST someplace in the middle of a word. I mean, there's
probably precedent both ways. You get tasered by the police, not
TASERed by the police; but I think you would write that you were
SMSing with a colleague rather than smsing with a colleague. But as
you say, "everybody wrote what they believe is correct," so there is
probably not going to be support for radically upending our existing
conventions, and deTOASTing is definitely a minority position. If you
really want to change something, getting rid of the few instances of
minority positions like that might be palatable, but something that
involves replacing a lot of the forms people chose with other forms
seems less likely to achieve consensus.

The alternative of just not worrying about it too much also seems to
have some merit. As you say, you weren't actually confused, just
irritated by the inconsistency; and spending effort on things that are
more irritating than serious is not always the right thing to do.

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Wieck (#11)
Re: TOAST versus toast

Jan Wieck <jan@wi3ck.info> writes:

As the original author of TOAST, I vote for TOAST being used as the
name/acronym of the feature, but toast in all other cases, such as the verb.

Well, if we're appealing to history ... I dug in the archives
and found that you seem to have invented the name here [1]:

Since we decided not to create a separate LONG datatype, and
not doing LONG attributes alone (compression at some point
too), I looked for some unique name for it - and found one.
The characters 'toast' did not show up on a case insensitive
grep over the entire CVS tree. Thus, I'll call it

tuple toaster

subsequently. I think there are enough similarities to a
toaster in this case. If you take a bread (tuple) and toast
some of the slices (attributes), anything can work as you
want and it will smell and taste delicious. In some cases,
slices might get burned (occasionally hitting an indexed
value), taste bitter and it will stink.

BTW: The idea itself was stolen from toast/untoast, a GSM
voice data compression/decompression tool.

Note the lack of any upper case. Shortly later we reverse-engineered
an acronym for it [2], with the winner being Tom Lockhart's

The Oversized-Attribute Storage Technique

So I'd say that the basis for upper-casing it at all is mighty
thin; it was not conceived as an acronym to begin with. We should
probably adjust our glossary entry for it to nod in the direction
of that GSM tool, if anyone can find a modern reference for that.

regards, tom lane

[1]: /messages/by-id/m120C3U-0003kHC@orion.SAPserv.Hamburg.dsh.de
[2]: /messages/by-id/m120DHd-0003kLC@orion.SAPserv.Hamburg.dsh.de

#13Jan Wieck
jan@wi3ck.info
In reply to: Tom Lane (#12)
Re: TOAST versus toast

On 3/17/25 00:24, Tom Lane wrote:

Note the lack of any upper case. Shortly later we reverse-engineered
an acronym for it [2], with the winner being Tom Lockhart's

The Oversized-Attribute Storage Technique

Which made it into an acronym. Acronyms are typically capitalized to
distinguish them from ordinary words.

Best Regards, Jan

#14Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Jan Wieck (#13)
Re: TOAST versus toast

On 3/17/25 00:24, Tom Lane wrote:

Note the lack of any upper case. Shortly later we reverse-engineered
an acronym for it [2], with the winner being Tom Lockhart's

The Oversized-Attribute Storage Technique

I (very easily) found a reference to the GSM tool:
https://linux.die.net/man/1/toast

At the bottom, you're directed to write to Jutta at TU Berlin in case of
bugs. Searching for that you'll eventually arrive at
http://quut.com/berlin/toast.html
which points out that this is Jutta Degener, currently of Sunnyvale, CA:
https://quut.com/credits.p3

On 2025-Mar-17, Jan Wieck wrote:

Which made it into an acronym. Acronyms are typically capitalized to
distinguish them from ordinary words.

However, we do stop capitalizing acronyms once they become common
enough. The example of LASER (originally an acronym for "light
amplification by stimulated emission of radiation") was already
mentioned, but there's also RADAR ("radio detection and ranging"), which
is particularly useful in this discussion because its Wikipedia page
says

The term radar has since entered English and other languages as
an anacronym, a common noun, losing all capitalization.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"I apologize for the confusion in my previous responses.
There appears to be an error." (ChatGPT)

#15David G. Johnston
david.g.johnston@gmail.com
In reply to: Peter Smith (#10)
Re: TOAST versus toast

On Sun, Mar 16, 2025 at 8:33 PM Peter Smith <smithpb2250@gmail.com> wrote:

Thanks for your suggestions. At this point option (1) is looking most
attractive. Probably, I will just withdraw the CF entry soon unless
there is some new interest. Just chipping away fixing a few places
isn't going to achieve the consistency this thread was aiming for.

I've moved this back to waiting on author pending a final decision.
Interested parties might still chime in but it doesn't seem like it is
actively looking for reviewers at this point.

David J.

#16wenhui qiu
qiuwenhuifx@gmail.com
In reply to: David G. Johnston (#15)
Re: TOAST versus toast

Hi,
I think this point is of no significance at all. Besides, this is a
document that has been around for over ten years. Everyone has become
accustomed to this kind of expression. This is just a case of having
nothing better to do.

On Sat, 12 Apr 2025 at 10:31, David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Sun, Mar 16, 2025 at 8:33 PM Peter Smith <smithpb2250@gmail.com> wrote:

Thanks for your suggestions. At this point option (1) is looking most
attractive. Probably, I will just withdraw the CF entry soon unless
there is some new interest. Just chipping away fixing a few places
isn't going to achieve the consistency this thread was aiming for.

I've moved this back to waiting on author pending a final decision.
Interested parties might still chime in but it doesn't seem like it is
actively looking for reviewers at this point.

David J.

#17Peter Eisentraut
peter@eisentraut.org
In reply to: Peter Smith (#3)
Re: TOAST versus toast

On 16.01.25 06:38, Peter Smith wrote:

On Thu, Jan 16, 2025 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Smith <smithpb2250@gmail.com> writes:

During some recent reviews, I came across some comments mentioning "toast" ...
TOAST is a PostgreSQL acronym for "The Oversized-Attribute Storage
Technique" [1].

It is indeed an acronym, but usages such as "toasting" are all over
our code and docs, as you see. I question whether changing that
to "TOASTing" improves readability. I agree that consistently
saying "TOAST table" not "toast table" is a good idea, but I'm
not quite convinced that removing every last lower-case occurrence
is a win, especially in these combined forms.

Hi, thanks for the reply.

How about I reduce the scope by only tackling the uncontroversial
stuff, and leave all those "combined forms" for another day?

Attached is the reduced patch for changes to the documentation.

committed

#18Peter Smith
smithpb2250@gmail.com
In reply to: Peter Eisentraut (#17)
Re: TOAST versus toast

On Tue, Jul 1, 2025 at 6:29 PM Peter Eisentraut <peter@eisentraut.org> wrote:

On 16.01.25 06:38, Peter Smith wrote:

On Thu, Jan 16, 2025 at 3:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Smith <smithpb2250@gmail.com> writes:

During some recent reviews, I came across some comments mentioning "toast" ...
TOAST is a PostgreSQL acronym for "The Oversized-Attribute Storage
Technique" [1].

It is indeed an acronym, but usages such as "toasting" are all over
our code and docs, as you see. I question whether changing that
to "TOASTing" improves readability. I agree that consistently
saying "TOAST table" not "toast table" is a good idea, but I'm
not quite convinced that removing every last lower-case occurrence
is a win, especially in these combined forms.

Hi, thanks for the reply.

How about I reduce the scope by only tackling the uncontroversial
stuff, and leave all those "combined forms" for another day?

Attached is the reduced patch for changes to the documentation.

committed

Thanks for pushing!

Those were all (supposedly) uncontroversial changes for just the SGML docs.

Originally, I had planned to see if this 1st patch would be pushed,
and if so, then look at making all the same kinds of changes to the
code comments. But, given the debate/time to get this far, I'm
thinking it's not worth opening Pandora's box a 2nd time. Please let
me know if you think otherwise.

======
Kind Regards,
Peter Smith.
Fujitsu Australia