clarify "rewritten" in pg_checksums docs

Started by Michael Banck over 5 years ago, 6 messages
#1Michael Banck
michael.banck@credativ.de
1 attachment(s)

Hi,

the pg_checksums docs mention that "When enabling checksums, every file
in the cluster is rewritten".

From IRC discussions, "rewritten" seems ambiguous: it could mean that a
second copy of the file is written and then switched over, implying
increased storage demand during the operation.

So maybe "rewritten in-place" is better, as per the attached?

Michael

--
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Managing Directors: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Our handling of personal data is subject to
the following provisions: https://www.credativ.de/datenschutz

Attachments:

pg_checksums_docs.patch (text/x-patch; charset=UTF-8)
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index 8e7807f86b..1dd4e54ff1 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -47,8 +47,8 @@ PostgreSQL documentation
 
   <para>
    When verifying checksums, every file in the cluster is scanned. When
-   enabling checksums, every file in the cluster is rewritten. Disabling
-   checksums only updates the file <filename>pg_control</filename>.
+   enabling checksums, every file in the cluster is rewritten in-place.
+   Disabling checksums only updates the file <filename>pg_control</filename>.
   </para>
  </refsect1>
 
#2Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Banck (#1)
Re: clarify "rewritten" in pg_checksums docs

On 1 Sep 2020, at 15:13, Michael Banck <michael.banck@credativ.de> wrote:

the pg_checksums docs mention that "When enabling checksums, every file
in the cluster is rewritten".

From IRC discussions, "rewritten" seems ambiguous: it could mean that a
second copy of the file is written and then switched over, implying
increased storage demand during the operation.

Makes sense, I can see that confusion.

So maybe "rewritten in-place" is better, as per the attached?

Isn't "modified in-place" a more accurate description of the process?

cheers ./daniel

#3Michael Banck
michael.banck@credativ.de
In reply to: Daniel Gustafsson (#2)
Re: clarify "rewritten" in pg_checksums docs

Hi,

On Tuesday, 01.09.2020 at 15:29 +0200, Daniel Gustafsson wrote:

On 1 Sep 2020, at 15:13, Michael Banck <michael.banck@credativ.de> wrote:
the pg_checksums docs mention that "When enabling checksums, every file
in the cluster is rewritten".

From IRC discussions, "rewritten" seems ambiguous: it could mean that a
second copy of the file is written and then switched over, implying
increased storage demand during the operation.

Makes sense, I can see that confusion.

So maybe "rewritten in-place" is better, as per the attached?

Isn't "modified in-place" a more accurate description of the process?

AIUI we do rewrite the whole file (block by block, after updating the
page header with the checksum), so yeah, I thought about using "modified"
instead but then decided "rewritten" is just as accurate, if not more so.
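
To make the block-by-block description concrete, here is a minimal Python sketch of an in-place rewrite. It is an illustration under stated assumptions, not the pg_checksums source: `toy_checksum` and `enable_checksums_in_place` are hypothetical names, the CRC-based checksum is a stand-in for PostgreSQL's real page checksum algorithm, and the little-endian encoding and the default 8192-byte block size are assumed. The one layout fact it relies on is that the 16-bit pd_checksum field follows the 8-byte pd_lsn at the start of the page header.

```python
import struct
import zlib

BLCKSZ = 8192        # assumed: PostgreSQL's default block size
CHECKSUM_OFFSET = 8  # pd_checksum follows the 8-byte pd_lsn in the page header


def toy_checksum(page: bytes, blkno: int) -> int:
    """Stand-in 16-bit checksum; NOT PostgreSQL's real algorithm."""
    # Zero the checksum field first so the result ignores its old value.
    body = page[:CHECKSUM_OFFSET] + b"\x00\x00" + page[CHECKSUM_OFFSET + 2:]
    return (zlib.crc32(body) ^ blkno) & 0xFFFF


def enable_checksums_in_place(path: str) -> int:
    """Rewrite every block of `path` in place; return the number of blocks."""
    blocks = 0
    # "r+b" opens for reading and writing without truncating and without
    # creating a second file, so no extra storage is needed.
    with open(path, "r+b") as f:
        while True:
            start = f.tell()
            page = f.read(BLCKSZ)
            if not page:
                break
            if len(page) != BLCKSZ:
                raise ValueError("file ends with a partial block")
            buf = bytearray(page)
            # Update the header field, then write the whole block back.
            struct.pack_into("<H", buf, CHECKSUM_OFFSET,
                             toy_checksum(page, blocks))
            f.seek(start)   # return to where this block began
            f.write(buf)    # overwrite the same BLCKSZ bytes
            blocks += 1
    return blocks
```

Every block is written back in full, which is why "rewritten" fits, yet the file never grows and no temporary copy exists, which is what "in-place" adds.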

Michael

#4Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Banck (#3)
Re: clarify "rewritten" in pg_checksums docs

On 1 Sep 2020, at 15:34, Michael Banck <michael.banck@credativ.de> wrote:
On Tuesday, 01.09.2020 at 15:29 +0200, Daniel Gustafsson wrote:

Isn't "modified in-place" a more accurate description of the process?

AIUI we do rewrite the whole file (block by block, after updating the
page header with the checksum), so yeah, I thought about using "modified"
instead but then decided "rewritten" is just as accurate, if not more so.

Well, I was thinking less technically accurate and more descriptive for end
users, hiding the implementation details. "Rewrite" sounds to me more like
changing data rather than amending pages with a checksum keeping data intact.
Either way, adding "in-place" is an improvement IMO.
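
The "keeping data intact" point can be checked with a small Python sketch that stamps a checksum into an in-memory page and lists which byte offsets changed. As before, this is an assumption-laden illustration: `stamp_checksum` and `changed_offsets` are hypothetical helpers, the CRC is a stand-in for the real page checksum, and the little-endian field at offset 8 and the 8192-byte block size are assumed.

```python
import struct
import zlib

BLCKSZ = 8192        # assumed: PostgreSQL's default block size
CHECKSUM_OFFSET = 8  # pd_checksum follows the 8-byte pd_lsn in the page header


def stamp_checksum(page: bytes) -> bytes:
    """Return a copy of `page` with a stand-in checksum in its header."""
    buf = bytearray(page)
    # Zero the field, checksum the page, then store the result in the field.
    struct.pack_into("<H", buf, CHECKSUM_OFFSET, 0)
    csum = zlib.crc32(bytes(buf)) & 0xFFFF  # NOT PostgreSQL's real algorithm
    struct.pack_into("<H", buf, CHECKSUM_OFFSET, csum)
    return bytes(buf)


def changed_offsets(before: bytes, after: bytes) -> list[int]:
    """List the byte offsets at which two equal-length pages differ."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]
```

For any page, `changed_offsets(page, stamp_checksum(page))` can only contain offsets 8 and 9: the whole block is written back, but only the two checksum bytes can differ from what was there before.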

cheers ./daniel

#5Michael Paquier
michael@paquier.xyz
In reply to: Daniel Gustafsson (#4)
Re: clarify "rewritten" in pg_checksums docs

On Tue, Sep 01, 2020 at 03:44:06PM +0200, Daniel Gustafsson wrote:

Well, I was thinking less technically accurate and more descriptive for end
users, hiding the implementation details. "Rewrite" sounds to me more like
changing data rather than amending pages with a checksum keeping data intact.
Either way, adding "in-place" is an improvement IMO.

Using "rewritten" still sounds more appropriate to me, as we still write the
thing in chunks of size BLCKSZ. No objections to the addition of
"in-place" for that sentence. Any extra opinions?
--
Michael

#6Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#5)
Re: clarify "rewritten" in pg_checksums docs

On Wed, Sep 02, 2020 at 05:26:16PM +0900, Michael Paquier wrote:

Using "rewritten" still sounds more appropriate to me, as we still write the
thing in chunks of size BLCKSZ. No objections to the addition of
"in-place" for that sentence. Any extra opinions?

Seeing no objections, I have applied the original patch of this thread
down to 12.
--
Michael