doc: BRIN indexes and autosummarize
Here's a patch to clarify the BRIN indexes documentation, particularly with
regards to autosummarize, vacuum and autovacuum. It basically breaks down a big
blob of a paragraph into multiple paragraphs for clarity, plus explicitly tells
how summarization happens manually or automatically.
I also added cross-references to various relevant sections, including the
create index page.
On this topic... I'm not familiar with the internals of BRIN indexes and in
backend/access/common/reloptions.c I see:
{
"autosummarize",
"Enables automatic summarization on this BRIN index",
RELOPT_KIND_BRIN,
AccessExclusiveLock
},
Is the exclusive lock on the index why autosummarize is off by default?
What would be the downside (if any) of having autosummarize=on by default?
Roberto
--
Crunchy Data - passion for open source PostgreSQL
Attachments:
brin-autosummarize-docs.patch (application/octet-stream)
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index caf1ea4cef..5af53dee0f 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -16,9 +16,13 @@
<acronym>BRIN</acronym> is designed for handling very large tables
in which certain columns have some natural correlation with their
physical location within the table.
+ </para>
+ <para>
A <firstterm>block range</firstterm> is a group of pages that are physically
adjacent in the table; for each block range, some summary info is stored
by the index.
+ </para>
+ <para>
For example, a table storing a store's sale orders might have
a date column on which each order was placed, and most of the time
the entries for earlier orders will appear earlier in the table as well;
@@ -31,6 +35,8 @@
index scans, and will return all tuples in all pages within each range if
the summary info stored by the index is <firstterm>consistent</firstterm> with the
query conditions.
+ </para>
+ <para>
The query executor is in charge of rechecking these tuples and discarding
those that do not match the query conditions — in other words, these
indexes are lossy.
@@ -69,35 +75,88 @@
As new pages are filled with data, page ranges that are already
summarized will cause the summary information to be updated with data
from the new tuples.
+ </para>
+ <para>
When a new page is created that does not fall within the last
summarized range, that range does not automatically acquire a summary
tuple; those tuples remain unsummarized until a summarization run is
invoked later, creating initial summaries.
- This process can be invoked manually using the
- <function>brin_summarize_range(regclass, bigint)</function> or
- <function>brin_summarize_new_values(regclass)</function> functions;
- automatically when <command>VACUUM</command> processes the table;
- or by automatic summarization executed by autovacuum, as insertions
- occur. (This last trigger is disabled by default and can be enabled
- with the <literal>autosummarize</literal> parameter.)
- Conversely, a range can be de-summarized using the
- <function>brin_desummarize_range(regclass, bigint)</function> function,
- which is useful when the index tuple is no longer a very good
- representation because the existing values have changed.
</para>
+ <para>
+ This summarization process can happen automatically or be invoked
+ manually. Manually it is done by calling the following
+ functions (see <xref linkend="functions-admin-index"/> for more):
+
+ <itemizedlist>
+ <listitem>
+ <simpara>
+ <function>brin_summarize_range(regclass, bigint)</function>
+ </simpara>
+ </listitem>
+ <listitem>
+ <simpara>
+ <function>brin_summarize_new_values(regclass)</function>
+ </simpara>
+ </listitem>
+ </itemizedlist>
+ </para>
+ <para>
+ Automatically, the process can happen under two circumstances:
+ </para>
+
+ <variablelist>
+ <varlistentry>
+ <term>VACUUM</term>
+ <listitem>
+ <para>
+ When <xref linkend="sql-vacuum"/> is run manually, it processes the table
+ and summarization is performed.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+ <term>autovacuum</term>
+ <listitem>
+ <para>
+ The <literal>autovacuum</literal> process can perform automatic
+ summarization but that is NOT the default behavior, which is controlled by
+ the <xref linkend="index-reloption-autosummarize"/> parameter, by
+ default set to <literal>off</literal>.
+ </para>
+ <para>
+ In order for the <literal>autovacuum</literal> process to execute
+ automatic summarization as insertions occur, the index must have
+ been created with (or altered to have) the
+ <xref linkend="index-reloption-autosummarize"/> parameter set to
+ <literal>on</literal>. See <xref linkend="sql-createindex"/>,
+ <xref linkend="sql-alterindex"/> and <xref linkend="autovacuum"/>
+ for details.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
<para>
When autosummarization is enabled, each time a page range is filled a
- request is sent to autovacuum for it to execute a targeted summarization
- for that range, to be fulfilled at the end of the next worker run on the
- same database. If the request queue is full, the request is not recorded
- and a message is sent to the server log:
+ request is sent to <literal>autovacuum</literal> for it to execute a targeted
+ summarization for that range, to be fulfilled at the end of the next
+ autovacuum worker run on the same database. If the request queue is full, the
+ request is not recorded and a message is sent to the server log:
<screen>
LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was not recorded
</screen>
When this happens, the range will be summarized normally during the next
regular vacuum of the table.
</para>
+
+ <para>
+ Conversely, a range can be de-summarized using the
+ <function>brin_desummarize_range(regclass, bigint)</function> function,
+ which is useful when the index tuple is no longer a very good
+ representation because the existing values have changed.
+ See <xref linkend="functions-admin-index"/> for details.
+ </para>
+
</sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 9ffcdc629e..83b5e82dfb 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -579,7 +579,10 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
<listitem>
<para>
Defines whether a summarization run is invoked for the previous page
- range whenever an insertion is detected on the next one.
+ range whenever an insertion is detected on the next one. This also
+ controls whether the <literal>autovacuum</literal> process will execute
+ summarizations. The default is <literal>off</literal> (see
+ <xref linkend="brin-operation"/>).
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/release-15.sgml b/doc/src/sgml/release-15.sgml
index 47ac329e79..6da3f89d08 100644
--- a/doc/src/sgml/release-15.sgml
+++ b/doc/src/sgml/release-15.sgml
@@ -63,11 +63,13 @@ Author: Noah Misch <noah@leadboat.com>
permissions on the <literal>public</literal> schema has not
been changed. Databases restored from previous Postgres releases
will be restored with their current permissions. Users wishing
- to have the former permissions will need to grant
+ to have the old permissions on new objects will need to grant
<literal>CREATE</literal> permission for <literal>PUBLIC</literal>
on the <literal>public</literal> schema; this change can be made
on <literal>template1</literal> to cause all new databases
- to have these permissions.
+ to have these permissions. <literal>template1</literal>
+ permissions for <application>pg_dumpall</application> and
+ <application>pg_upgrade</application>?
</para>
</listitem>
@@ -83,7 +85,7 @@ Author: Noah Misch <noah@leadboat.com>
</para>
<para>
- Previously it was the literal user name of the bootstrap superuser.
+ Previously it was the literal user name of the database owner.
Databases restored from previous Postgres releases will be restored
with their current owner specification.
</para>
On Tue, Jun 28, 2022 at 05:22:34PM -0600, Roberto Mello wrote:
Here's a patch to clarify the BRIN indexes documentation, particularly with
regards to autosummarize, vacuum and autovacuum. It basically breaks down a
big blob of a paragraph into multiple paragraphs for clarity, plus explicitly
tells how summarization happens manually or automatically.
See also this older thread
/messages/by-id/20220224193520.GY9008@telsasoft.com
--
Justin
On 2022-Jun-28, Roberto Mello wrote:
Here's a patch to clarify the BRIN indexes documentation, particularly with
regards to autosummarize, vacuum and autovacuum. It basically breaks
down a big blob of a paragraph into multiple paragraphs for clarity,
plus explicitly tells how summarization happens manually or
automatically.
[Some of] these additions are wrong actually. It says that autovacuum
will not summarize new entries; but it does. If you just let the table
sit idle, any autovacuum run that cleans the table will also summarize
any ranges that need summarization.
What 'autosummarize=off' means is that the behavior to trigger an
immediate summarization of a range once it becomes full is not default.
This is very different.
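A minimal sketch of that distinction in SQL. The table and index names (orders, orders_placed_brin, orders_placed_brin_auto) are hypothetical and chosen only for illustration; autosummarize is the existing BRIN storage parameter discussed in this thread.

```sql
-- Hypothetical table; all names here are illustrative only.
CREATE TABLE orders (
    id        bigint GENERATED ALWAYS AS IDENTITY,
    placed_on date NOT NULL
);

-- Default (autosummarize = off): new page ranges are still summarized
-- whenever the table is vacuumed, manually or by autovacuum, or when one
-- of the brin_summarize_* functions is called.
CREATE INDEX orders_placed_brin ON orders USING brin (placed_on);

-- autosummarize = on: in addition to the above, a summarization request
-- is queued for autovacuum once a page range has been filled.
CREATE INDEX orders_placed_brin_auto ON orders
    USING brin (placed_on) WITH (autosummarize = on);
```

In both cases a vacuum of the table summarizes pending ranges; the option only adds the earlier, targeted request.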
As for the new <para></para>s that you added, I'd say they're
stylistically wrong. Each paragraph is supposed to be one fully
contained idea; what these tags do is split each idea across several
smaller paragraphs. This is likely subjective though.
On this topic... I'm not familiar with the internals of BRIN
indexes and in backend/access/common/reloptions.c I see:
{
"autosummarize",
"Enables automatic summarization on this BRIN index",
RELOPT_KIND_BRIN,
AccessExclusiveLock
},
Is the exclusive lock on the index why autosummarize is off by default?
No. The lock level mentioned here is what needs to be taken in order to
change the value of this option.
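As a sketch of the operation that lock level applies to, assuming a hypothetical existing BRIN index named orders_placed_brin:

```sql
-- Changing the option is the step that takes the lock listed in
-- reloptions.c; the index name is hypothetical.
ALTER INDEX orders_placed_brin SET (autosummarize = on);

-- Revert to the default (off).
ALTER INDEX orders_placed_brin RESET (autosummarize);

-- Inspect the storage parameters currently set on the index.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'orders_placed_brin';
```

The lock is held only while the option is being changed; it says nothing about how summarization itself behaves afterwards.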
What would be the downside (if any) of having autosummarize=on by default?
I'm not aware of any. Maybe we should turn it on by default.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
What about this?
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming" (A. Stepanov)
Attachments:
brin-autosummarize-docs-2.patch (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index caf1ea4cef..0a715d41c7 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -73,31 +73,55 @@
summarized range, that range does not automatically acquire a summary
tuple; those tuples remain unsummarized until a summarization run is
invoked later, creating initial summaries.
- This process can be invoked manually using the
- <function>brin_summarize_range(regclass, bigint)</function> or
- <function>brin_summarize_new_values(regclass)</function> functions;
- automatically when <command>VACUUM</command> processes the table;
- or by automatic summarization executed by autovacuum, as insertions
- occur. (This last trigger is disabled by default and can be enabled
- with the <literal>autosummarize</literal> parameter.)
- Conversely, a range can be de-summarized using the
- <function>brin_desummarize_range(regclass, bigint)</function> function,
- which is useful when the index tuple is no longer a very good
- representation because the existing values have changed.
+ </para>
+
+ <para>
+ There are several triggers for initial summarization of a page range
+ to occur. If the table is vacuumed, either because
+ <xref linkend="sql-vacuum" /> has been manually invoked or because
+ autovacuum causes it,
+ all existing unsummarized page ranges are summarized.
+ Also, if the index has the
+ <xref linkend="index-reloption-autosummarize"/> parameter set to on,
+ then any run of autovacuum in the database will summarize all
+ unsummarized page ranges that have been completely filled recently,
+ regardless of whether the table is processed by autovacuum for other
+ reasons; see below.
+ Lastly, the following functions can be used:
+
+ <simplelist>
+ <member>
+ <function>brin_summarize_range(regclass, bigint)</function>
+ summarizes one specific range, if it is unsummarized
+ </member>
+ <member>
+ <function>brin_summarize_new_values(regclass)</function>
+ summarizes all unsummarized ranges
+ </member>
+ </simplelist>
</para>
<para>
When autosummarization is enabled, each time a page range is filled a
- request is sent to autovacuum for it to execute a targeted summarization
- for that range, to be fulfilled at the end of the next worker run on the
- same database. If the request queue is full, the request is not recorded
- and a message is sent to the server log:
+ request is sent to <literal>autovacuum</literal> for it to execute a targeted
+ summarization for that range, to be fulfilled at the end of the next
+ autovacuum worker run on the same database. If the request queue is full, the
+ request is not recorded and a message is sent to the server log:
<screen>
LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was not recorded
</screen>
When this happens, the range will be summarized normally during the next
regular vacuum of the table.
</para>
+
+ <para>
+ Conversely, a range can be de-summarized using the
+ <function>brin_desummarize_range(regclass, bigint)</function> function,
+ which is useful when the index tuple is no longer a very good
+ representation because the existing values have changed.
+ See <xref linkend="functions-admin-index"/> for details.
+ </para>
+
</sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 9ffcdc629e..d3db03278d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -580,6 +580,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
<para>
Defines whether a summarization run is invoked for the previous page
range whenever an insertion is detected on the next one.
+ See <xref linkend="brin-operation"/> for more details.
+ The default is <literal>off</literal>.
</para>
</listitem>
</varlistentry>
On Mon, Jul 04, 2022 at 09:38:42PM +0200, Alvaro Herrera wrote:
What about this?
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Java is clearly an example of money oriented programming" (A. Stepanov)
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index caf1ea4cef..0a715d41c7 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -73,31 +73,55 @@
summarized range, that range does not automatically acquire a summary
tuple; those tuples remain unsummarized until a summarization run is
invoked later, creating initial summaries.
- This process can be invoked manually using the
- <function>brin_summarize_range(regclass, bigint)</function> or
- <function>brin_summarize_new_values(regclass)</function> functions;
- automatically when <command>VACUUM</command> processes the table;
- or by automatic summarization executed by autovacuum, as insertions
- occur. (This last trigger is disabled by default and can be enabled
- with the <literal>autosummarize</literal> parameter.)
- Conversely, a range can be de-summarized using the
- <function>brin_desummarize_range(regclass, bigint)</function> function,
- which is useful when the index tuple is no longer a very good
- representation because the existing values have changed.
+ </para>
+
I feel that somewhere in this paragraph it should be mentioned that it is
off by default.
otherwise, +1
--
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL
On Mon, Jul 04, 2022 at 09:38:42PM +0200, Alvaro Herrera wrote:
+ There are several triggers for initial summarization of a page range
+ to occur. If the table is vacuumed, either because
+ <xref linkend="sql-vacuum" /> has been manually invoked or because
+ autovacuum causes it,
+ all existing unsummarized page ranges are summarized.
I'd say "If the table is vacuumed manually or by autovacuum, ..."
(Or "either manually or by autovacuum, ...")
+ Also, if the index has the
+ <xref linkend="index-reloption-autosummarize"/> parameter set to on,
Maybe say "If the autosummarize parameter is enabled" (this may avoid needing to
revise it later if we change the default).
+ then any run of autovacuum in the database will summarize all
I'd avoid saying "run" and instead say "then anytime autovacuum runs in that
database, all ..."
+ unsummarized page ranges that have been completely filled recently,
+ regardless of whether the table is processed by autovacuum for other
+ reasons; see below.
say "whether the table itself" and remove "for other reasons" ?
<para>
When autosummarization is enabled, each time a page range is filled a
Maybe: filled comma
- request is sent to autovacuum for it to execute a targeted summarization
- for that range, to be fulfilled at the end of the next worker run on the
- same database. If the request queue is full, the request is not recorded
- and a message is sent to the server log:
+ request is sent to <literal>autovacuum</literal> for it to execute a targeted
+ summarization for that range, to be fulfilled at the end of the next
+ autovacuum worker run on the same database. If the request queue is full, the
"to be fulfilled the next time an autovacuum worker finishes running in that
database."
or
"to be fulfilled by an autovacuum worker the next time it finishes running in that
database."
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -580,6 +580,8 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
<para>
Defines whether a summarization run is invoked for the previous page
range whenever an insertion is detected on the next one.
+ See <xref linkend="brin-operation"/> for more details.
+ The default is <literal>off</literal>.
Maybe "invoked" should say "queued" ?
Also, a reminder that this was never addressed (I wish the project had a way to
keep track of known issues).
/messages/by-id/20201113160007.GQ30691@telsasoft.com
|error_severity of brin work item
|left | could not open relation with OID 292103095
|left | processing work entry for relation "ts.child.alarms_202010_alarm_clear_time_idx"
|Those happen following a REINDEX job on that index.
This inline patch includes my changes as well as yours.
And the attached patch is my changes only.
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index caf1ea4cef1..90897a4af07 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -73,31 +73,55 @@
summarized range, that range does not automatically acquire a summary
tuple; those tuples remain unsummarized until a summarization run is
invoked later, creating initial summaries.
- This process can be invoked manually using the
- <function>brin_summarize_range(regclass, bigint)</function> or
- <function>brin_summarize_new_values(regclass)</function> functions;
- automatically when <command>VACUUM</command> processes the table;
- or by automatic summarization executed by autovacuum, as insertions
- occur. (This last trigger is disabled by default and can be enabled
- with the <literal>autosummarize</literal> parameter.)
- Conversely, a range can be de-summarized using the
- <function>brin_desummarize_range(regclass, bigint)</function> function,
- which is useful when the index tuple is no longer a very good
- representation because the existing values have changed.
</para>
<para>
- When autosummarization is enabled, each time a page range is filled a
- request is sent to autovacuum for it to execute a targeted summarization
- for that range, to be fulfilled at the end of the next worker run on the
- same database. If the request queue is full, the request is not recorded
- and a message is sent to the server log:
+ There are several ways to trigger the initial summarization of a page range.
+ If the table is vacuumed, either manually or by
+ <link linkend="autovacuum">autovacuum</link>,
+ all existing unsummarized page ranges are summarized.
+ Also, if the index's
+ <xref linkend="index-reloption-autosummarize"/> parameter is enabled,
+ whenever autovacuum runs in that database, summarization will
+ occur for all
+ unsummarized page ranges that have been filled,
+ regardless of whether the table itself is processed by autovacuum; see below.
+
+ Lastly, the following functions can be used:
+
+ <simplelist>
+ <member>
+ <function>brin_summarize_range(regclass, bigint)</function>
+ summarizes one specific range, if it is unsummarized
+ </member>
+ <member>
+ <function>brin_summarize_new_values(regclass)</function>
+ summarizes all unsummarized ranges
+ </member>
+ </simplelist>
+ </para>
+
+ <para>
+ When autosummarization is enabled, each time a page range is filled, a
+ request is sent to <literal>autovacuum</literal> to execute a targeted
+ summarization for that range, to be fulfilled the next time an autovacuum
+ worker finishes running in that database. If the request queue is full, the
+ request is not recorded and a message is sent to the server log:
<screen>
LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was not recorded
</screen>
When this happens, the range will be summarized normally during the next
regular vacuum of the table.
</para>
+
+ <para>
+ Conversely, a range can be de-summarized using the
+ <function>brin_desummarize_range(regclass, bigint)</function> function,
+ which is useful when the index tuple is no longer a very good
+ representation because the existing values have changed.
+ See <xref linkend="functions-admin-index"/> for details.
+ </para>
+
</sect2>
</sect1>
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 9ffcdc629e6..a5bac9f7373 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -578,8 +578,10 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
</term>
<listitem>
<para>
- Defines whether a summarization run is invoked for the previous page
+ Defines whether a summarization run is queued for the previous page
range whenever an insertion is detected on the next one.
+ See <xref linkend="brin-operation"/> for more details.
+ The default is <literal>off</literal>.
</para>
</listitem>
</varlistentry>
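For reference, a short sketch of how the three functions named in the brin.sgml text are called. The index name orders_placed_brin and the block number 0 are hypothetical; the behavior noted in the comments follows the PostgreSQL index maintenance function reference.

```sql
-- Summarize every page range that is currently unsummarized; returns the
-- number of new page range summaries that were inserted.
SELECT brin_summarize_new_values('orders_placed_brin'::regclass);

-- Summarize only the page range covering block 0, if it is unsummarized.
SELECT brin_summarize_range('orders_placed_brin'::regclass, 0);

-- Remove the summary tuple for the page range covering block 0.
SELECT brin_desummarize_range('orders_placed_brin'::regclass, 0);
```

The second argument is a heap block number, not a range number; the function acts on whichever range contains that block.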
Attachments:
0001-f.txt (text/x-diff; charset=us-ascii)
From 28e8e2106983fc05e9415b078bcf6b158b49337e Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Mon, 4 Jul 2022 16:11:40 -0500
Subject: [PATCH] f
---
doc/src/sgml/brin.sgml | 28 ++++++++++++++--------------
doc/src/sgml/ref/create_index.sgml | 2 +-
2 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index 0a715d41c71..90897a4af07 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -76,17 +76,17 @@
</para>
<para>
- There are several triggers for initial summarization of a page range
- to occur. If the table is vacuumed, either because
- <xref linkend="sql-vacuum" /> has been manually invoked or because
- autovacuum causes it,
+ There are several ways to trigger the initial summarization of a page range.
+ If the table is vacuumed, either manually or by
+ <link linkend="autovacuum">autovacuum</link>,
all existing unsummarized page ranges are summarized.
- Also, if the index has the
- <xref linkend="index-reloption-autosummarize"/> parameter set to on,
- then any run of autovacuum in the database will summarize all
- unsummarized page ranges that have been completely filled recently,
- regardless of whether the table is processed by autovacuum for other
- reasons; see below.
+ Also, if the index's
+ <xref linkend="index-reloption-autosummarize"/> parameter is enabled,
+ whenever autovacuum runs in that database, summarization will
+ occur for all
+ unsummarized page ranges that have been filled,
+ regardless of whether the table itself is processed by autovacuum; see below.
+
Lastly, the following functions can be used:
<simplelist>
@@ -102,10 +102,10 @@
</para>
<para>
- When autosummarization is enabled, each time a page range is filled a
- request is sent to <literal>autovacuum</literal> for it to execute a targeted
- summarization for that range, to be fulfilled at the end of the next
- autovacuum worker run on the same database. If the request queue is full, the
+ When autosummarization is enabled, each time a page range is filled, a
+ request is sent to <literal>autovacuum</literal> to execute a targeted
+ summarization for that range, to be fulfilled the next time an autovacuum
+ worker finishes running in that database. If the request queue is full, the
request is not recorded and a message is sent to the server log:
<screen>
LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was not recorded
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index d3db03278d6..a5bac9f7373 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -578,7 +578,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
</term>
<listitem>
<para>
- Defines whether a summarization run is invoked for the previous page
+ Defines whether a summarization run is queued for the previous page
range whenever an insertion is detected on the next one.
See <xref linkend="brin-operation"/> for more details.
The default is <literal>off</literal>.
--
2.17.1
On 2022-Jul-04, Jaime Casanova wrote:
I feel that somewhere in this paragraph it should be mentioned that it is
off by default.
OK, I added it.
On 2022-Jul-04, Justin Pryzby wrote:
[ lots of comments ]
OK, I have adopted all your proposed changes, thanks for submitting in
both forms. I did some more wordsmithing and pushed, to branches 12 and
up. 11 fails 'make check', I think for lack of Docbook id tags, and I
didn't want to waste more time. Kindly re-read the result and let me
know if I left something unaddressed, or made something worse. The
updated text is already visible in the website:
https://www.postgresql.org/docs/devel/brin-intro.html
(Having almost-immediate doc refreshes is an enormous improvement.
Thanks Magnus.)
Also, a reminder that this was never addressed (I wish the project had a way to
keep track of known issues).
/messages/by-id/20201113160007.GQ30691@telsasoft.com
|error_severity of brin work item
Yeah, I've not forgotten that item. I can't promise I'll get it fixed
soon, but it's on my list.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)
On Mon, Jul 4, 2022 at 9:20 AM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:
[Some of] these additions are wrong actually. It says that autovacuum
will not summarize new entries; but it does. If you just let the table
sit idle, any autovacuum run that cleans the table will also summarize
any ranges that need summarization.
What 'autosummarize=off' means is that the behavior to trigger an
immediate summarization of a range once it becomes full is not default.
This is very different.
Without having read through the code, I'll take your word for it. I simply
went with what was written on this phrase of the docs:
"or by automatic summarization executed by autovacuum, as insertions occur.
(This last trigger is disabled by default and can be enabled with the
autosummarize parameter.)"
To me this did not indicate a third behavior, which is what you are
describing, so I'm glad we're having this discussion to clarify it.
As for the new <para></para>s that you added, I'd say they're
stylistically wrong. Each paragraph is supposed to be one fully
contained idea; what these tags do is split each idea across several
smaller paragraphs. This is likely subjective though.
While I don't disagree with you, readability is more important. We have
lots of places (such as that one on the docs) where we have a big blob of
text, reducing readability, IMHO. In the source they are broken by new
lines, but in the rendered HTML, which is what the vast majority of people
read, they get rendered into a big blob-looking-thing.
What would be the downside (if any) of having autosummarize=on by default?
I'm not aware of any. Maybe we should turn it on by default.
+1
Thanks for looking at this Alvaro.
Roberto
--
Crunchy Data -- passion for open source PostgreSQL
On Tue, Jul 5, 2022 at 5:47 AM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:
OK, I have adopted all your proposed changes, thanks for submitting in
both forms. I did some more wordsmithing and pushed, to branches 12 and
up. 11 fails 'make check', I think for lack of Docbook id tags, and I
didn't want to waste more time. Kindly re-read the result and let me
know if I left something unaddressed, or made something worse. The
updated text is already visible in the website:
https://www.postgresql.org/docs/devel/brin-intro.html
You removed the reference to the functions' documentation at
functions-admin-index choosing instead to duplicate a summarized
version of the docs, and to boot getting the next block to be blobbed
together with it.
Keeping with the reduced-readability theme, you made the paragraphs
even bigger. While I do appreciate the time to clarify things a bit, as was
my original intent with the patch,
We should be writing documentation with the user in mind, not for our
developer eyes. Different target audiences. It is less helpful to have
awesome features that don't get used because users can't really
grasp the docs.
Paragraphs such as this feel like we're playing "summary bingo":
When a new page is created that does not fall within the last
summarized range, the range that the new page belongs into
does not automatically acquire a summary tuple;
those tuples remain unsummarized until a summarization run is
invoked later, creating the initial summary for that range
Roberto
--
Crunchy Data -- passion for open source PostgreSQL
On 2022-Jul-05, Roberto Mello wrote:
You removed the reference to the functions' documentation at
functions-admin-index choosing instead to duplicate a summarized
version of the docs, and to boot getting the next block to be blobbed
together with it.
Actually, my first instinct was to move the interesting parts to the
functions docs, then reference those, removing the duplicate bits. But
I was discouraged when I read it, because it is just a table in a place
not really appropriate for a larger discussion on it. Also, a reference
to it is not direct, but rather it goes to a table that contains a lot
of other stuff.
Keeping with the reduced-readability theme, you made the paragraphs
even bigger. While I do appreciate the time to clarify things a bit, as was
my original intent with the patch, [...]
Hmm, which paragraph are you referring to? I'm not aware of having made
any paragraph bigger, quite the opposite. In the original text, the
paragraph "At the time of creation," is 13 lines on a browser window
that is half the screen; in the patched text, that has been replaced by
three paragraphs that are 7, 6, and 4 lines long, plus a separate one
for the de-summarization bits at the end of the page, which is 3 lines
long.
We should be writing documentation with the user in mind, not for our
developer eyes. Different target audiences. It is less helpful to have
awesome features that don't get used because users can't really
grasp the docs.
I try to do that. I guess I fail more frequently than I should.
Paragraphs such as this feel like we're playing "summary bingo":
When a new page is created that does not fall within the last
summarized range, the range that the new page belongs into
does not automatically acquire a summary tuple;
those tuples remain unsummarized until a summarization run is
invoked later, creating the initial summary for that range
Yeah, I am aware that the word "summary" and variations occur way too
many times. Maybe it is possible to replace "summary tuple" with "BRIN
tuple" for example; can you propose some synonym for "summarized" and
"unsummarized"? Perhaps something like this:
When a new page is created that does not fall within the last
summarized range, the range that the new page belongs into
does not automatically acquire a BRIN tuple;
those [pages] remain uncovered by the BRIN index until a summarization run is
invoked later, creating the initial BRIN tuple for that range
(I also replaced the word "tuples" with "pages" in one spot.)
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede" (Mark Twain)
On Tue, Jul 05, 2022 at 01:47:27PM +0200, Alvaro Herrera wrote:
OK, I have adopted all your proposed changes, thanks for submitting in
both forms. I did some more wordsmithing and pushed, to branches 12 and
up. 11 fails 'make check', I think for lack of Docbook id tags, and I
didn't want to waste more time. Kindly re-read the result and let me
know if I left something unaddressed, or made something worse. The
updated text is already visible in the website:
https://www.postgresql.org/docs/devel/brin-intro.html
One issue:
+ summarized range, the range that the new page belongs into
+ does not automatically acquire a summary tuple;
"belongs into" sounds wrong - "belongs to" is better.
I'll put that change into my "typos" branch to fix later if it's not addressed
in this thread.
--
Justin
On 2022-Jul-05, Justin Pryzby wrote:
One issue:
+ summarized range, the range that the new page belongs into
+ does not automatically acquire a summary tuple;
"belongs into" sounds wrong - "belongs to" is better.
Hah, and I was wondering if "belongs in" was any better.
I'll put that change into my "typos" branch to fix later if it's not addressed
in this thread.
Roberto has some more substantive comments on the new text, so let's try
and fix everything together. This time, I'll let you guys come up with
a new patch.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/