Minor typo
Greetings,
While reviewing a bit of code around full page images, I came across a
typo and fixed it in the attach.
Sending it here in case anyone feels that we should do more than just
correct the word..? Perhaps for non-native English speakers seeing
"whose" used here is confusing?
If I don't hear any concerns then I'll go ahead and push it some time
tomorrow; it's a pretty minor change, of course, and we can always
adjust it further if someone raises an issue later too.
Thanks!
Stephen
Attachments:
fixtypo_20181127.patchtext/x-diff; charset=us-asciiDownload
From fb9b0f00396179af895c13e44bff98eee8302894 Mon Sep 17 00:00:00 2001
From: Stephen Frost <sfrost@snowman.net>
Date: Tue, 27 Nov 2018 18:37:59 -0500
Subject: [PATCH] Fix typo
Change:
When wal_compression is enabled, a full page image which "hole" was
removed is additionally compressed using PGLZ compression algorithm.
to:
When wal_compression is enabled, a full page image whose "hole" was
removed is additionally compressed using PGLZ compression algorithm.
As the correct word to use here is "whose", not "which".
---
src/include/access/xlogrecord.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 863781937e..ccf92def93 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -114,7 +114,7 @@ typedef struct XLogRecordBlockHeader
* XLOG record's CRC, either). Hence, the amount of block data actually
* present is BLCKSZ - the length of "hole" bytes.
*
- * When wal_compression is enabled, a full page image which "hole" was
+ * When wal_compression is enabled, a full page image whose "hole" was
* removed is additionally compressed using PGLZ compression algorithm.
* This can reduce the WAL volume, but at some extra cost of CPU spent
* on the compression during WAL logging. In this case, since the "hole"
--
2.17.1
On 28 Nov 2018, at 00:43, Stephen Frost <sfrost@snowman.net> wrote:
Sending it here in case anyone feels that we should do more than just
correct the word..? Perhaps for non-native English speakers seeing
"whose" used here is confusing?
Being a non-native English speaker I think it’s fine and, in my own bias,
commonly understood.
+1 on raising the “does this make sense for a non-native speaker” question
though, making the documentation (regardless of if it’s docs, code comments or
READMEs) accessible is very important.
cheers ./daniel
Daniel Gustafsson <daniel@yesql.se> writes:
On 28 Nov 2018, at 00:43, Stephen Frost <sfrost@snowman.net> wrote:
Sending it here in case anyone feels that we should do more than just
correct the word..? Perhaps for non-native English speakers seeing
"whose" used here is confusing?
Being a non-native English speaker I think it’s fine and, in my own bias,
commonly understood.
I agree that "which" is just wrong here, but "whose" seems a bit malaprop
as well, and it's not the only poor English in that sentence. Maybe
rewrite the whole sentence to avoid the problem?
* When wal_compression is enabled and a "hole" is removed from a full
* page image, the page image is compressed using PGLZ compression.
(BTW, is this trying to say that we don't apply compression if the page
contains no hole? If so, why not?)
regards, tom lane
On Tue, Nov 27, 2018 at 07:18:39PM -0500, Tom Lane wrote:
* When wal_compression is enabled and a "hole" is removed from a full
* page image, the page image is compressed using PGLZ compression.(BTW, is this trying to say that we don't apply compression if the page
contains no hole? If so, why not?)
Compression can be applied on a page with or without a hole. What this
sentence means is that the equivalent of a double level of compression
can be applied on top of the hole being removed, if the hole can be
removed.
--
Michael
Greetings,
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 28 Nov 2018, at 00:43, Stephen Frost <sfrost@snowman.net> wrote:
Sending it here in case anyone feels that we should do more than just
correct the word..? Perhaps for non-native English speakers seeing
"whose" used here is confusing?Being a non-native English speaker I think it’s fine and, in my own bias,
commonly understood.I agree that "which" is just wrong here, but "whose" seems a bit malaprop
as well, and it's not the only poor English in that sentence. Maybe
rewrite the whole sentence to avoid the problem?* When wal_compression is enabled and a "hole" is removed from a full
* page image, the page image is compressed using PGLZ compression.
Yeah, that seems a bit cleaner.
(BTW, is this trying to say that we don't apply compression if the page
contains no hole? If so, why not?)
Sure seems to be saying that, and later on..
* If BKPIMAGE_HAS_HOLE and BKPIMAGE_IS_COMPRESSED, an
* XLogRecordBlockCompressHeader struct follows.
Yet XLogCompressBackupBlock() sure looks like it'll happily compress a
page even if it doesn't have a hole. The replay logic certainly seems
to only check if BKPIMAGE_IS_COMPRESSED is set.
I'm thinking there's quite a bit of room for improvement here... I
wonder if the idea originally was "the page is 'compressed', in some
way, if it either had a hole or was actually compressed"..
Thanks!
Stephen
Greetings,
* Michael Paquier (michael@paquier.xyz) wrote:
On Tue, Nov 27, 2018 at 07:18:39PM -0500, Tom Lane wrote:
* When wal_compression is enabled and a "hole" is removed from a full
* page image, the page image is compressed using PGLZ compression.(BTW, is this trying to say that we don't apply compression if the page
contains no hole? If so, why not?)Compression can be applied on a page with or without a hole. What this
sentence means is that the equivalent of a double level of compression
can be applied on top of the hole being removed, if the hole can be
removed.
That isn't at all what I got from that.
A rewrite of this really should avoid talking about removing the hole as
if it's 'compression' because, first of all, it isn't, and second, now
that we have *actual* compression happening, it's terribly confusing to
talk about removing the hole using that terminology.
Thanks!
Stephen
On Tue, Nov 27, 2018 at 07:51:03PM -0500, Stephen Frost wrote:
That isn't at all what I got from that.
A rewrite of this really should avoid talking about removing the hole as
if it's 'compression' because, first of all, it isn't, and second, now
that we have *actual* compression happening, it's terribly confusing to
talk about removing the hole using that terminology.
You may want to be careful about other comments in xloginsert.c or such
then. Removing a page hole is mentioned in a couple of places as a form
of compression if I recall correctly.
--
Michael
This is a noise from a Japanese having poor English skill..
At Wed, 28 Nov 2018 10:01:36 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20181128010136.GU1716@paquier.xyz>
On Tue, Nov 27, 2018 at 07:51:03PM -0500, Stephen Frost wrote:
That isn't at all what I got from that.
A rewrite of this really should avoid talking about removing the hole as
if it's 'compression' because, first of all, it isn't, and second, now
that we have *actual* compression happening, it's terribly confusing to
talk about removing the hole using that terminology.You may want to be careful about other comments in xloginsert.c or such
then. Removing a page hole is mentioned in a couple of places as a form
of compression if I recall correctly.
org> When wal_compression is enabled, a full page image which "hole" was
org> removed is additionally compressed using PGLZ compression algorithm.
Even though it has an obvious mistake, I could read this as
full_page_image is always compressed after removing a hole in it
if any. (I'm not sure it makes correct sense, though.) I feel
hopelessly that the sentence structure model in my mind is
different from that of natives. I need to study harder, but..
Sorry for the noise in advance.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Greetings,
* Kyotaro HORIGUCHI (horiguchi.kyotaro@lab.ntt.co.jp) wrote:
At Wed, 28 Nov 2018 10:01:36 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20181128010136.GU1716@paquier.xyz>
On Tue, Nov 27, 2018 at 07:51:03PM -0500, Stephen Frost wrote:
That isn't at all what I got from that.
A rewrite of this really should avoid talking about removing the hole as
if it's 'compression' because, first of all, it isn't, and second, now
that we have *actual* compression happening, it's terribly confusing to
talk about removing the hole using that terminology.You may want to be careful about other comments in xloginsert.c or such
then. Removing a page hole is mentioned in a couple of places as a form
of compression if I recall correctly.org> When wal_compression is enabled, a full page image which "hole" was
org> removed is additionally compressed using PGLZ compression algorithm.Even though it has an obvious mistake, I could read this as
full_page_image is always compressed after removing a hole in it
if any. (I'm not sure it makes correct sense, though.) I feel
hopelessly that the sentence structure model in my mind is
different from that of natives. I need to study harder, but..
Thanks to everyone for sharing their thoughts here, the goal is simply
to try and have the comments as clear as we can for everyone.
Please find attached a larger rewording of the comment in xlogrecord.h,
and a minor change to xloginsert.c to clarify that we're removing a hole
and not doing compression at that point.
Other thoughts on this..?
Thanks!
Stephen
Attachments:
fix-xlog-typo-reword.patchtext/x-diff; charset=us-asciiDownload
From febd0feac0a7e2cb216c7cae5af7a5c458d3e394 Mon Sep 17 00:00:00 2001
From: Stephen Frost <sfrost@snowman.net>
Date: Tue, 4 Dec 2018 11:30:37 -0500
Subject: [PATCH] Cleanup comments in xlog compression
Skipping over the "hole" in full page images in the XLOG code was
described as being a form of compression, but this got a bit confusing
since we now have PGLZ-based compression happening, so adjust the
wording to discuss "removing" the "hole" and keeping the talk about
compression to where we're talking about using PGLZ-based compression of
the full page images.
Discussion: https://postgr.es/m/20181127234341.GM3415@tamriel.snowman.net
---
src/backend/access/transam/xloginsert.c | 2 +-
src/include/access/xlogrecord.h | 13 ++++++-------
2 files changed, 7 insertions(+), 8 deletions(-)
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 34d4db4297..034d5b3b62 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -605,7 +605,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
else
{
- /* No "hole" to compress out */
+ /* No "hole" to remove */
bimg.hole_offset = 0;
cbimg.hole_length = 0;
}
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 863781937e..070d6d7cf1 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -107,15 +107,14 @@ typedef struct XLogRecordBlockHeader
* Additional header information when a full-page image is included
* (i.e. when BKPBLOCK_HAS_IMAGE is set).
*
- * As a trivial form of data compression, the XLOG code is aware that
- * PG data pages usually contain an unused "hole" in the middle, which
- * contains only zero bytes. If the length of "hole" > 0 then we have removed
- * such a "hole" from the stored data (and it's not counted in the
- * XLOG record's CRC, either). Hence, the amount of block data actually
+ * The XLOG code is aware that PG data pages usually contain an unused "hole"
+ * in the middle, which contains only zero bytes. Since we know that the
+ * "hole" is all zeros, we remove it from the stored data (and it's not counted
+ * in the XLOG record's CRC, either). Hence, the amount of block data actually
* present is BLCKSZ - the length of "hole" bytes.
*
- * When wal_compression is enabled, a full page image which "hole" was
- * removed is additionally compressed using PGLZ compression algorithm.
+ * Additionally, when wal_compression is enabled, we will try to compress full
+ * page images using the PGLZ compression algorithm, after removing the "hole".
* This can reduce the WAL volume, but at some extra cost of CPU spent
* on the compression during WAL logging. In this case, since the "hole"
* length cannot be calculated by subtracting the number of page image bytes
--
2.17.1
At Tue, 4 Dec 2018 11:37:16 -0500, Stephen Frost <sfrost@snowman.net> wrote in <20181204163716.GR3415@tamriel.snowman.net>
Thanks to everyone for sharing their thoughts here, the goal is simply
to try and have the comments as clear as we can for everyone.Please find attached a larger rewording of the comment in xlogrecord.h,
and a minor change to xloginsert.c to clarify that we're removing a hole
and not doing compression at that point.Other thoughts on this..?
Thank you for the complete rewriting. It makes the description
clearer at least for me, execpt the following:
* present is BLCKSZ - the length of "hole" bytes.
Maybe it's because I'm not so accustomed to punctuation marks but
I was confused for a second because the '-' didn't look to me a
minus sign, but a dash. If it is not specific to me, a word
'minus' seems better or, (BLCKSZ - <the length of "hole">) bytes
is clearer .... for me.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Greetings,
* Kyotaro HORIGUCHI (horiguchi.kyotaro@lab.ntt.co.jp) wrote:
At Tue, 4 Dec 2018 11:37:16 -0500, Stephen Frost <sfrost@snowman.net> wrote in <20181204163716.GR3415@tamriel.snowman.net>
Thanks to everyone for sharing their thoughts here, the goal is simply
to try and have the comments as clear as we can for everyone.Please find attached a larger rewording of the comment in xlogrecord.h,
and a minor change to xloginsert.c to clarify that we're removing a hole
and not doing compression at that point.Other thoughts on this..?
Thank you for the complete rewriting. It makes the description
clearer at least for me, execpt the following:* present is BLCKSZ - the length of "hole" bytes.
Maybe it's because I'm not so accustomed to punctuation marks but
I was confused for a second because the '-' didn't look to me a
minus sign, but a dash. If it is not specific to me, a word
'minus' seems better or, (BLCKSZ - <the length of "hole">) bytes
is clearer .... for me.
I agree- that threw me off a bit also when I was first reading it. I've
updated that part a bit and pushed it.
Thanks!
Stephen