pgstattuple: fix free space calculation

Started by Frédéric Yhuelover 1 year ago14 messages
#1Frédéric Yhuel
frederic.yhuel@dalibo.com
1 attachment(s)

Hello,

I think that pgstattuple should use PageGetExactFreeSpace() instead of
PageGetHeapFreeSpace() or PageGetFreeSpace(). The latter two compute the
free space minus the space of a line pointer. They are used like this in
the rest of the code (heapam.c):

pagefree = PageGetHeapFreeSpace(page);

if (newtupsize > pagefree) { we need a another page for the tuple }

... so it makes sense to take the line pointer into account in this context.

But it in the pgstattuple context, I think we want the exact free space.

I have attached a patch.

Best regards,
Frédéric

Attachments:

0001-pgstattuple-use-PageGetExactFreeSpace.patchtext/x-patch; charset=UTF-8; name=0001-pgstattuple-use-PageGetExactFreeSpace.patchDownload
From 350abec004fe472922800135d5d94ca3d8212da4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Wed, 21 Aug 2024 15:05:11 +0200
Subject: [PATCH] pgstattuple: use PageGetExactFreeSpace()

instead of PageGetHeapFreeSpace() and PageGetFreeSpace()

We want the exact free space, we don't care if there's enough room for
another line pointer.
---
 contrib/pgstattuple/pgstatapprox.c | 2 +-
 contrib/pgstattuple/pgstatindex.c  | 2 +-
 contrib/pgstattuple/pgstattuple.c  | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index c84c642355..a27dea621e 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -111,7 +111,7 @@ statapprox_heap(Relation rel, output_type *stat)
 		 * treat them as being free space for our purposes.
 		 */
 		if (!PageIsNew(page))
-			stat->free_space += PageGetHeapFreeSpace(page);
+			stat->free_space += PageGetExactFreeSpace(page);
 		else
 			stat->free_space += BLCKSZ - SizeOfPageHeaderData;
 
diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index 5c06ba6db4..1b6b768cf8 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -311,7 +311,7 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 
 			max_avail = BLCKSZ - (BLCKSZ - ((PageHeader) page)->pd_special + SizeOfPageHeaderData);
 			indexStat.max_avail += max_avail;
-			indexStat.free_space += PageGetFreeSpace(page);
+			indexStat.free_space += PageGetExactFreeSpace(page);
 
 			indexStat.leaf_pages++;
 
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 3bd8b96197..7e2a7262a3 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -372,7 +372,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 			buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 										RBM_NORMAL, hscan->rs_strategy);
 			LockBuffer(buffer, BUFFER_LOCK_SHARE);
-			stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+			stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 			UnlockReleaseBuffer(buffer);
 			block++;
 		}
@@ -385,7 +385,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 									RBM_NORMAL, hscan->rs_strategy);
 		LockBuffer(buffer, BUFFER_LOCK_SHARE);
-		stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+		stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 		UnlockReleaseBuffer(buffer);
 		block++;
 	}
@@ -565,7 +565,7 @@ pgstat_index_page(pgstattuple_type *stat, Page page,
 {
 	OffsetNumber i;
 
-	stat->free_space += PageGetFreeSpace(page);
+	stat->free_space += PageGetExactFreeSpace(page);
 
 	for (i = minoff; i <= maxoff; i = OffsetNumberNext(i))
 	{
-- 
2.39.2

#2Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Frédéric Yhuel (#1)
Re: pgstattuple: fix free space calculation

On Thu, 22 Aug 2024 at 10:11, Frédéric Yhuel <frederic.yhuel@dalibo.com>
wrote:

Hello,

I think that pgstattuple should use PageGetExactFreeSpace() instead of
PageGetHeapFreeSpace() or PageGetFreeSpace(). The latter two compute the
free space minus the space of a line pointer. They are used like this in
the rest of the code (heapam.c):

pagefree = PageGetHeapFreeSpace(page);

if (newtupsize > pagefree) { we need a another page for the tuple }

... so it makes sense to take the line pointer into account in this
context.

But it in the pgstattuple context, I think we want the exact free space.

I have attached a patch.

Best regards,
Frédéric

I agree with the approach here.
A minor comment here is to change the comments in code referring to the
PageGetHeapFreeSpace.

--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -111,7 +111,7 @@ statapprox_heap(Relation rel, output_type *stat)
                 * treat them as being free space for our purposes.
                 */
                if (!PageIsNew(page))
-                       stat->free_space += PageGetHeapFreeSpace(page);
+                       stat->free_space += PageGetExactFreeSpace(page);
-- 
Regards,
Rafia Sabih
#3Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Rafia Sabih (#2)
1 attachment(s)
Re: pgstattuple: fix free space calculation

On 8/22/24 21:56, Rafia Sabih wrote:

I agree with the approach here.
A minor comment here is to change the comments in code referring to the
PageGetHeapFreeSpace.

Thank you Rafia. Here is a v2 patch.

I've also added this to the commit message:

Also, PageGetHeapFreeSpace() will return zero if there are already
MaxHeapTuplesPerPage line pointers in the page and none are free. We
don't want that either, because here we want to keep track of the free
space after a page pruning operation even in the (very unlikely) case
that there are MaxHeapTuplesPerPage line pointers in the page.

Attachments:

v2-0001-pgstattuple-use-PageGetExactFreeSpace.patchtext/x-patch; charset=UTF-8; name=v2-0001-pgstattuple-use-PageGetExactFreeSpace.patchDownload
From c4d0569cd3dbcec7f84a794e927cfc81e137ff6a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Wed, 21 Aug 2024 15:05:11 +0200
Subject: [PATCH v2] pgstattuple: use PageGetExactFreeSpace()

instead of PageGetHeapFreeSpace() or PageGetFreeSpace()

We want the exact free space, we don't care if there's enough room for
another line pointer.

Also, PageGetHeapFreeSpace() will return zero if there are already
MaxHeapTuplesPerPage line pointers in the page and none are free. We
don't want that either, because here we want to keep track of the free
space after a page pruning operation even in the (very unlikely) case
that there are MaxHeapTuplesPerPage line pointers in the page.
---
 contrib/pgstattuple/pgstatapprox.c | 4 ++--
 contrib/pgstattuple/pgstatindex.c  | 2 +-
 contrib/pgstattuple/pgstattuple.c  | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index c84c642355..b63a9932d7 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -107,11 +107,11 @@ statapprox_heap(Relation rel, output_type *stat)
 		page = BufferGetPage(buf);
 
 		/*
-		 * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
+		 * It's not safe to call PageGetExactFreeSpace() on new pages, so we
 		 * treat them as being free space for our purposes.
 		 */
 		if (!PageIsNew(page))
-			stat->free_space += PageGetHeapFreeSpace(page);
+			stat->free_space += PageGetExactFreeSpace(page);
 		else
 			stat->free_space += BLCKSZ - SizeOfPageHeaderData;
 
diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index 5c06ba6db4..1b6b768cf8 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -311,7 +311,7 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 
 			max_avail = BLCKSZ - (BLCKSZ - ((PageHeader) page)->pd_special + SizeOfPageHeaderData);
 			indexStat.max_avail += max_avail;
-			indexStat.free_space += PageGetFreeSpace(page);
+			indexStat.free_space += PageGetExactFreeSpace(page);
 
 			indexStat.leaf_pages++;
 
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 3bd8b96197..7e2a7262a3 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -372,7 +372,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 			buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 										RBM_NORMAL, hscan->rs_strategy);
 			LockBuffer(buffer, BUFFER_LOCK_SHARE);
-			stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+			stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 			UnlockReleaseBuffer(buffer);
 			block++;
 		}
@@ -385,7 +385,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 									RBM_NORMAL, hscan->rs_strategy);
 		LockBuffer(buffer, BUFFER_LOCK_SHARE);
-		stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+		stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 		UnlockReleaseBuffer(buffer);
 		block++;
 	}
@@ -565,7 +565,7 @@ pgstat_index_page(pgstattuple_type *stat, Page page,
 {
 	OffsetNumber i;
 
-	stat->free_space += PageGetFreeSpace(page);
+	stat->free_space += PageGetExactFreeSpace(page);
 
 	for (i = minoff; i <= maxoff; i = OffsetNumberNext(i))
 	{
-- 
2.39.2

#4Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Frédéric Yhuel (#3)
Re: pgstattuple: fix free space calculation

On Fri, 23 Aug 2024 at 11:01, Frédéric Yhuel <frederic.yhuel@dalibo.com>
wrote:

On 8/22/24 21:56, Rafia Sabih wrote:

I agree with the approach here.
A minor comment here is to change the comments in code referring to the
PageGetHeapFreeSpace.

Thank you Rafia. Here is a v2 patch.

I've also added this to the commit message:

Also, PageGetHeapFreeSpace() will return zero if there are already
MaxHeapTuplesPerPage line pointers in the page and none are free. We
don't want that either, because here we want to keep track of the free
space after a page pruning operation even in the (very unlikely) case
that there are MaxHeapTuplesPerPage line pointers in the page.

On the other hand, this got me thinking about the purpose of this space
information.
If we want to understand that there's still some space for the tuples in a
page, then using PageGetExactFreeSpace is not doing justice in case of heap
page, because we will not be able to add any more tuples there if there are
already MaxHeapTuplesPerPage tuples there.

--
Regards,
Rafia Sabih

#5Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Rafia Sabih (#4)
Re: pgstattuple: fix free space calculation

On 8/23/24 12:02, Rafia Sabih wrote:

On the other hand, this got me thinking about the purpose of this space
information.
If we want to understand that there's still some space for the tuples in
a page, then using PageGetExactFreeSpace is not doing justice in case of
heap page, because we will not be able to add any more tuples there if
there are already MaxHeapTuplesPerPage tuples there.

We won't be able to add, but we will be able to update a tuple in this
page. It's hard to test, because I can't fit more than 226 tuples on a
single page, while MaxHeapTuplesPerPage = 291 on my machine :-)

In any case, IMVHO, pgstattuple shouldn't answer to the question "can I
add more tuples?". The goal is for educational, introspection or
debugging purposes, and we want the exact amount of free space.

Best regards,
Frédéric

#6Andreas Karlsson
andreas@proxel.se
In reply to: Rafia Sabih (#4)
Re: pgstattuple: fix free space calculation

On 8/23/24 12:02 PM, Rafia Sabih wrote:> On the other hand, this got me
thinking about the purpose of this space > information.

If we want to understand that there's still some space for the tuples in
a page, then using PageGetExactFreeSpace is not doing justice in case of
heap page, because we will not be able to add any more tuples there if
there are already MaxHeapTuplesPerPage tuples there.

I think the new behavior is the more useful one since what if someone
wants to know the free space since they want to insert two tuples and
not just one? I do not think the function should assume that the only
reason someone would want to know the size is because they want to
insert exactly one new tuple.

I am less certain about what the right behavior on pages where we are
out of line pointers should be but I am leaning towards that the new
behvior is better than the old but can see a case for either.

Tested the patch and it works as advertised.

Andreas

#7Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Frédéric Yhuel (#5)
1 attachment(s)
Re: pgstattuple: fix free space calculation

On 8/23/24 12:51, Frédéric Yhuel wrote:

On 8/23/24 12:02, Rafia Sabih wrote:

On the other hand, this got me thinking about the purpose of this
space information.
If we want to understand that there's still some space for the tuples
in a page, then using PageGetExactFreeSpace is not doing justice in
case of heap page, because we will not be able to add any more tuples
there if there are already MaxHeapTuplesPerPage tuples there.

We won't be able to add, but we will be able to update a tuple in this
page.

Sorry, that's not true.

So in this marginal case we have free space that's unusable in practice.
No INSERT or UPDATE (HOT or not) is possible inside the page.

I don't know what pgstattuple should do in this case.

However, we should never encounter this case in practice (maybe on some
exotic architectures with strange alignment behavior?). As I said, I
can't fit more than 226 tuples per page on my machine, while
MaxHeapTuplesPerPage is 291. Am I missing something?

Besides, pgstattuple isn't mission critical, is it?

So I think we should just use PageGetExactFreeSpace().

Here is a v3 patch. It's the same as v2, I only removed the last
paragraph in the commit message.

Thank you Rafia and Andreas for your review and test.

Best regards,
Frédéric

Attachments:

v3-0001-pgstattuple-use-PageGetExactFreeSpace.patchtext/x-patch; charset=UTF-8; name=v3-0001-pgstattuple-use-PageGetExactFreeSpace.patchDownload
From f749d4ca2d6881a916c6ca2a3b882bb2740188d4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Wed, 21 Aug 2024 15:05:11 +0200
Subject: [PATCH v3] pgstattuple: use PageGetExactFreeSpace()

instead of PageGetHeapFreeSpace() or PageGetFreeSpace()

We want the exact free space, we don't care if there's enough room for
another line pointer.
---
 contrib/pgstattuple/pgstatapprox.c | 4 ++--
 contrib/pgstattuple/pgstatindex.c  | 2 +-
 contrib/pgstattuple/pgstattuple.c  | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index c84c642355..b63a9932d7 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -107,11 +107,11 @@ statapprox_heap(Relation rel, output_type *stat)
 		page = BufferGetPage(buf);
 
 		/*
-		 * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
+		 * It's not safe to call PageGetExactFreeSpace() on new pages, so we
 		 * treat them as being free space for our purposes.
 		 */
 		if (!PageIsNew(page))
-			stat->free_space += PageGetHeapFreeSpace(page);
+			stat->free_space += PageGetExactFreeSpace(page);
 		else
 			stat->free_space += BLCKSZ - SizeOfPageHeaderData;
 
diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index 5c06ba6db4..1b6b768cf8 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -311,7 +311,7 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 
 			max_avail = BLCKSZ - (BLCKSZ - ((PageHeader) page)->pd_special + SizeOfPageHeaderData);
 			indexStat.max_avail += max_avail;
-			indexStat.free_space += PageGetFreeSpace(page);
+			indexStat.free_space += PageGetExactFreeSpace(page);
 
 			indexStat.leaf_pages++;
 
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 3bd8b96197..7e2a7262a3 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -372,7 +372,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 			buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 										RBM_NORMAL, hscan->rs_strategy);
 			LockBuffer(buffer, BUFFER_LOCK_SHARE);
-			stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+			stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 			UnlockReleaseBuffer(buffer);
 			block++;
 		}
@@ -385,7 +385,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 									RBM_NORMAL, hscan->rs_strategy);
 		LockBuffer(buffer, BUFFER_LOCK_SHARE);
-		stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+		stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 		UnlockReleaseBuffer(buffer);
 		block++;
 	}
@@ -565,7 +565,7 @@ pgstat_index_page(pgstattuple_type *stat, Page page,
 {
 	OffsetNumber i;
 
-	stat->free_space += PageGetFreeSpace(page);
+	stat->free_space += PageGetExactFreeSpace(page);
 
 	for (i = minoff; i <= maxoff; i = OffsetNumberNext(i))
 	{
-- 
2.39.2

#8Andreas Karlsson
andreas@proxel.se
In reply to: Frédéric Yhuel (#7)
Re: pgstattuple: fix free space calculation

On 8/29/24 4:53 PM, Frédéric Yhuel wrote:

So I think we should just use PageGetExactFreeSpace().

I agree, I feel that is the least surprising behavior because we
currently sum tiny amounts of free space that is unusable anyway. E.g.
imagine one million pages with 10 free bytes each, that looks like 10
free MB so I do not see why we should treat the max tuples per page case
with any special logic.

Andreas

#9Rafia Sabih
rafia.pghackers@gmail.com
In reply to: Frédéric Yhuel (#7)
Re: pgstattuple: fix free space calculation

On Thu, 29 Aug 2024 at 16:53, Frédéric Yhuel <frederic.yhuel@dalibo.com>
wrote:

On 8/23/24 12:51, Frédéric Yhuel wrote:

On 8/23/24 12:02, Rafia Sabih wrote:

On the other hand, this got me thinking about the purpose of this
space information.
If we want to understand that there's still some space for the tuples
in a page, then using PageGetExactFreeSpace is not doing justice in
case of heap page, because we will not be able to add any more tuples
there if there are already MaxHeapTuplesPerPage tuples there.

We won't be able to add, but we will be able to update a tuple in this
page.

Sorry, that's not true.

So in this marginal case we have free space that's unusable in practice.
No INSERT or UPDATE (HOT or not) is possible inside the page.

I don't know what pgstattuple should do in this case.

However, we should never encounter this case in practice (maybe on some
exotic architectures with strange alignment behavior?). As I said, I
can't fit more than 226 tuples per page on my machine, while
MaxHeapTuplesPerPage is 291. Am I missing something?

Besides, pgstattuple isn't mission critical, is it?

Yes, also as stated before I am not sure of the utility of this field in
real-world scenarios.
So, I can not comment more on that. That was just one thought that popped
into my head.
Otherwise, the idea seems fine to me.

So I think we should just use PageGetExactFreeSpace().

Here is a v3 patch. It's the same as v2, I only removed the last
paragraph in the commit message.

Thanks for the new patch. LGTM.

Thank you Rafia and Andreas for your review and test.

Thanks to you too.

Best regards,
Frédéric

--
Regards,
Rafia Sabih

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rafia Sabih (#9)
Re: pgstattuple: fix free space calculation

Rafia Sabih <rafia.pghackers@gmail.com> writes:

On Thu, 29 Aug 2024 at 16:53, Frédéric Yhuel <frederic.yhuel@dalibo.com>
wrote:

So I think we should just use PageGetExactFreeSpace().

Here is a v3 patch. It's the same as v2, I only removed the last
paragraph in the commit message.

Thanks for the new patch. LGTM.

I looked at this patch. I agree with making the change. However,
I don't agree with the CF entry's marking of "target version: stable"
(i.e., requesting back-patch). I think this falls somewhere in the
gray area between a bug fix and a definitional change. Also, people
are unlikely to be happy if they suddenly get new, not-comparable
numbers after a minor version update. So I think we should just fix
it in HEAD.

As far as the patch itself goes, the one thing that is bothering me
is this comment change

         /*
-         * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
+         * It's not safe to call PageGetExactFreeSpace() on new pages, so we
          * treat them as being free space for our purposes.
          */

which looks like it wasn't made with a great deal of thought.
Now it seems to me that the comment was already bogus when written:
there isn't anything uncertain about what will happen if you call
either of these functions on a "new" page. PageIsNew checks for

return ((PageHeader) page)->pd_upper == 0;

If pd_upper is 0, PageGet[Exact]FreeSpace is absolutely guaranteed
to return zero, even if pd_lower contains garbage. And then
PageGetHeapFreeSpace will likewise return zero. Perhaps there
could be trouble if we got into the line-pointer-checking part
of PageGetHeapFreeSpace, but we can't. So this comment is wrong,
and is even more obviously wrong after the above change. I thought
for a moment about removing the PageIsNew test altogether, but
then I decided that it probably *is* what we want and is just
mis-explained. I think the comment should read more like

/*
* PageGetExactFreeSpace() will return zero for a "new" page,
* but it's actually usable free space, so count it that way.
*/

Now alternatively you could argue that a "new" page isn't usable free
space yet and so we should count it as zero, just as we don't count
dead tuples as usable free space. You need VACUUM to turn either of
those things into real free space. But that'd be a bigger definitional
change, and I'm not sure we want it. Thoughts?

Also, do we need any documentation change for this? I looked through
https://www.postgresql.org/docs/devel/pgstattuple.html
and didn't see anything that was being very specific about what
"free space" means, so maybe it's fine as-is.

regards, tom lane

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#10)
Re: pgstattuple: fix free space calculation

I wrote:

Now alternatively you could argue that a "new" page isn't usable free
space yet and so we should count it as zero, just as we don't count
dead tuples as usable free space. You need VACUUM to turn either of
those things into real free space. But that'd be a bigger definitional
change, and I'm not sure we want it. Thoughts?

On the third hand: the code in question is in statapprox_heap, which
is presumably meant to deliver numbers comparable to pgstat_heap.
And pgstat_heap takes no special care for "new" pages, it just applies
PageGetHeapFreeSpace (or PageGetExactFreeSpace after this patch).
So that leaves me feeling pretty strongly that this whole stanza
is wrong and we should just do PageGetExactFreeSpace here.

regards, tom lane

#12Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Tom Lane (#10)
Re: pgstattuple: fix free space calculation

Hi Tom, thanks for your review.

On 9/7/24 22:10, Tom Lane wrote:

I looked at this patch. I agree with making the change. However,
I don't agree with the CF entry's marking of "target version: stable"
(i.e., requesting back-patch). I think this falls somewhere in the
gray area between a bug fix and a definitional change. Also, people
are unlikely to be happy if they suddenly get new, not-comparable
numbers after a minor version update. So I think we should just fix
it in HEAD.

OK, I did the change.

As far as the patch itself goes, the one thing that is bothering me
is this comment change

/*
-         * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
+         * It's not safe to call PageGetExactFreeSpace() on new pages, so we
* treat them as being free space for our purposes.
*/

which looks like it wasn't made with a great deal of thought.
Now it seems to me that the comment was already bogus when written:
there isn't anything uncertain about what will happen if you call
either of these functions on a "new" page. PageIsNew checks for

return ((PageHeader) page)->pd_upper == 0;

If pd_upper is 0, PageGet[Exact]FreeSpace is absolutely guaranteed
to return zero, even if pd_lower contains garbage. And then

Indeed. I failed to notice that LocationIndex was an unsigned int, so I
thought that pg_upper - pd_upper could be positive with garbage in pg_upper.

PageGetHeapFreeSpace will likewise return zero. Perhaps there
could be trouble if we got into the line-pointer-checking part
of PageGetHeapFreeSpace, but we can't. So this comment is wrong,
and is even more obviously wrong after the above change. I thought
for a moment about removing the PageIsNew test altogether, but
then I decided that it probably*is* what we want and is just
mis-explained. I think the comment should read more like

/*
* PageGetExactFreeSpace() will return zero for a "new" page,
* but it's actually usable free space, so count it that way.
*/

Now alternatively you could argue that a "new" page isn't usable free
space yet and so we should count it as zero, just as we don't count
dead tuples as usable free space. You need VACUUM to turn either of
those things into real free space. But that'd be a bigger definitional
change, and I'm not sure we want it. Thoughts?

Also, do we need any documentation change for this? I looked through
https://www.postgresql.org/docs/devel/pgstattuple.html
and didn't see anything that was being very specific about what
"free space" means, so maybe it's fine as-is.

It's not easy. Maybe something like this?

"For any initialized page, free space refers to anything that isn't page
metadata (header and special), a line pointer or a tuple pointed to by a
valid line pointer. In particular, a dead tuple is not free space
because there's still a valid line pointer pointer pointing to it, until
VACUUM or some other maintenance mechanism (e.g. page pruning) cleans up
the page. A dead line pointer is not free space either, but the tuple it
points to has become free space. An unused line pointer could be
considered free space, but pgstattuple doesn't take it into account."

#13Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Tom Lane (#11)
1 attachment(s)
Re: pgstattuple: fix free space calculation

On 9/7/24 22:45, Tom Lane wrote:

I wrote:

Now alternatively you could argue that a "new" page isn't usable free
space yet and so we should count it as zero, just as we don't count
dead tuples as usable free space. You need VACUUM to turn either of
those things into real free space. But that'd be a bigger definitional
change, and I'm not sure we want it. Thoughts?

On the third hand: the code in question is in statapprox_heap, which
is presumably meant to deliver numbers comparable to pgstat_heap.
And pgstat_heap takes no special care for "new" pages, it just applies
PageGetHeapFreeSpace (or PageGetExactFreeSpace after this patch).
So that leaves me feeling pretty strongly that this whole stanza
is wrong and we should just do PageGetExactFreeSpace here.

+1

v4 patch attached.

Best regards,
Frédéric

Attachments:

v4-0001-pgstattuple-use-PageGetExactFreeSpace.patchtext/x-patch; charset=UTF-8; name=v4-0001-pgstattuple-use-PageGetExactFreeSpace.patchDownload
From a59c16d33ff37ce5c9d0e663809be190c608c75b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= <frederic.yhuel@dalibo.com>
Date: Wed, 21 Aug 2024 15:05:11 +0200
Subject: [PATCH v4] pgstattuple: use PageGetExactFreeSpace()

instead of PageGetHeapFreeSpace() or PageGetFreeSpace()

We want the exact free space, we don't care if there's enough room for
another line pointer.

Also, statapprox_heap() shouldn't have to take any special care with new
pages.
---
 contrib/pgstattuple/pgstatapprox.c | 9 +--------
 contrib/pgstattuple/pgstatindex.c  | 2 +-
 contrib/pgstattuple/pgstattuple.c  | 6 +++---
 3 files changed, 5 insertions(+), 12 deletions(-)

diff --git a/contrib/pgstattuple/pgstatapprox.c b/contrib/pgstattuple/pgstatapprox.c
index c84c642355..04457f4b79 100644
--- a/contrib/pgstattuple/pgstatapprox.c
+++ b/contrib/pgstattuple/pgstatapprox.c
@@ -106,14 +106,7 @@ statapprox_heap(Relation rel, output_type *stat)
 
 		page = BufferGetPage(buf);
 
-		/*
-		 * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
-		 * treat them as being free space for our purposes.
-		 */
-		if (!PageIsNew(page))
-			stat->free_space += PageGetHeapFreeSpace(page);
-		else
-			stat->free_space += BLCKSZ - SizeOfPageHeaderData;
+		stat->free_space += PageGetExactFreeSpace(page);
 
 		/* We may count the page as scanned even if it's new/empty */
 		scanned++;
diff --git a/contrib/pgstattuple/pgstatindex.c b/contrib/pgstattuple/pgstatindex.c
index 5c06ba6db4..1b6b768cf8 100644
--- a/contrib/pgstattuple/pgstatindex.c
+++ b/contrib/pgstattuple/pgstatindex.c
@@ -311,7 +311,7 @@ pgstatindex_impl(Relation rel, FunctionCallInfo fcinfo)
 
 			max_avail = BLCKSZ - (BLCKSZ - ((PageHeader) page)->pd_special + SizeOfPageHeaderData);
 			indexStat.max_avail += max_avail;
-			indexStat.free_space += PageGetFreeSpace(page);
+			indexStat.free_space += PageGetExactFreeSpace(page);
 
 			indexStat.leaf_pages++;
 
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 3bd8b96197..7e2a7262a3 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -372,7 +372,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 			buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 										RBM_NORMAL, hscan->rs_strategy);
 			LockBuffer(buffer, BUFFER_LOCK_SHARE);
-			stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+			stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 			UnlockReleaseBuffer(buffer);
 			block++;
 		}
@@ -385,7 +385,7 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 		buffer = ReadBufferExtended(rel, MAIN_FORKNUM, block,
 									RBM_NORMAL, hscan->rs_strategy);
 		LockBuffer(buffer, BUFFER_LOCK_SHARE);
-		stat.free_space += PageGetHeapFreeSpace((Page) BufferGetPage(buffer));
+		stat.free_space += PageGetExactFreeSpace((Page) BufferGetPage(buffer));
 		UnlockReleaseBuffer(buffer);
 		block++;
 	}
@@ -565,7 +565,7 @@ pgstat_index_page(pgstattuple_type *stat, Page page,
 {
 	OffsetNumber i;
 
-	stat->free_space += PageGetFreeSpace(page);
+	stat->free_space += PageGetExactFreeSpace(page);
 
 	for (i = minoff; i <= maxoff; i = OffsetNumberNext(i))
 	{
-- 
2.39.2

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Frédéric Yhuel (#13)
Re: pgstattuple: fix free space calculation

=?UTF-8?Q?Fr=C3=A9d=C3=A9ric_Yhuel?= <frederic.yhuel@dalibo.com> writes:

v4 patch attached.

LGTM, pushed.

regards, tom lane