PG17beta2: SMGR: inconsistent type for nblocks

Started by Matthias van de Meentover 1 year ago5 messages
#1Matthias van de Meent
boekewurm@gmail.com

Hi,

While working on rebasing the patches of Neon's fork onto the
REL_17_STABLE branch, I noticed that the nblocks arguments of various
smgr functions have inconsistent types: smgrzeroextend accepts
`nblocks` as signed integer, as does the new signature for
smgrprefetch, but the new vectorized operations of *readv and *writev,
and the older *writeback all use an unsigned BlockNumber as indicator
for number of blocks.

Can we update the definition to be consistent across this (new, or
also older) API? As far as I can see, in none of these cases are
negative numbers allowed or expected, so updating this all to be
consistently BlockNumber across the API seems like a straigthforward
patch.

cc-ed Thomas as committer of the PG17 smgr API changes.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#2Thomas Munro
thomas.munro@gmail.com
In reply to: Matthias van de Meent (#1)
Re: PG17beta2: SMGR: inconsistent type for nblocks

On Tue, Jul 30, 2024 at 11:24 PM Matthias van de Meent
<boekewurm@gmail.com> wrote:

While working on rebasing the patches of Neon's fork onto the
REL_17_STABLE branch, I noticed that the nblocks arguments of various
smgr functions have inconsistent types: smgrzeroextend accepts
`nblocks` as signed integer, as does the new signature for
smgrprefetch, but the new vectorized operations of *readv and *writev,
and the older *writeback all use an unsigned BlockNumber as indicator
for number of blocks.

Can we update the definition to be consistent across this (new, or
also older) API? As far as I can see, in none of these cases are
negative numbers allowed or expected, so updating this all to be
consistently BlockNumber across the API seems like a straigthforward
patch.

cc-ed Thomas as committer of the PG17 smgr API changes.

Hi Matthias,

Yeah, right, I noticed that once myself[1]/messages/by-id/CA+hUKGLx5bLwezZKAYB2O_qHj=ov10RpgRVY7e8TSJVE74oVjg@mail.gmail.com. For the cases from my
keyboard, I guess I was trying to be consistent with nearby existing
stuff in each case, which was already inconsistent... Do you have a
patch?

[1]: /messages/by-id/CA+hUKGLx5bLwezZKAYB2O_qHj=ov10RpgRVY7e8TSJVE74oVjg@mail.gmail.com

#3Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Thomas Munro (#2)
1 attachment(s)
Re: PG17beta2: SMGR: inconsistent type for nblocks

On Tue, 30 Jul 2024 at 14:32, Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Jul 30, 2024 at 11:24 PM Matthias van de Meent
<boekewurm@gmail.com> wrote:

While working on rebasing the patches of Neon's fork onto the
REL_17_STABLE branch, I noticed that the nblocks arguments of various
smgr functions have inconsistent types: smgrzeroextend accepts
`nblocks` as signed integer, as does the new signature for
smgrprefetch, but the new vectorized operations of *readv and *writev,
and the older *writeback all use an unsigned BlockNumber as indicator
for number of blocks.

Can we update the definition to be consistent across this (new, or
also older) API? As far as I can see, in none of these cases are
negative numbers allowed or expected, so updating this all to be
consistently BlockNumber across the API seems like a straigthforward
patch.

cc-ed Thomas as committer of the PG17 smgr API changes.

Yeah, right, I noticed that once myself[1]. For the cases from my
keyboard, I guess I was trying to be consistent with nearby existing
stuff in each case, which was already inconsistent... Do you have a
patch?

Here's one that covers both master and the v17 backbranch.

Kind regards,

Matthias van de Meent

Attachments:

0001-Update-SMGR-API-to-use-consistent-types-for-nblocks-.patchapplication/octet-stream; name=0001-Update-SMGR-API-to-use-consistent-types-for-nblocks-.patchDownload
From ee6285d563f9fda914c22b0922c64275ab204209 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Thu, 1 Aug 2024 12:05:36 +0200
Subject: [PATCH] Update SMGR API to use consistent types for `nblocks`
 parameters

Rather than signed integers, we now consistently use BlockNumber to
indicate nblocks in the smgr API.
---
 src/backend/storage/smgr/md.c   | 6 +++---
 src/backend/storage/smgr/smgr.c | 9 +++++----
 src/include/storage/md.h        | 5 +++--
 src/include/storage/smgr.h      | 5 +++--
 4 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 6796756358..1d02766978 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -523,11 +523,11 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
  */
 void
 mdzeroextend(SMgrRelation reln, ForkNumber forknum,
-			 BlockNumber blocknum, int nblocks, bool skipFsync)
+			 BlockNumber blocknum, BlockNumber nblocks, bool skipFsync)
 {
 	MdfdVec    *v;
 	BlockNumber curblocknum = blocknum;
-	int			remblocks = nblocks;
+	int64		remblocks = nblocks;
 
 	Assert(nblocks > 0);
 
@@ -712,7 +712,7 @@ mdclose(SMgrRelation reln, ForkNumber forknum)
  */
 bool
 mdprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
-		   int nblocks)
+		   BlockNumber nblocks)
 {
 #ifdef USE_PREFETCH
 
diff --git a/src/backend/storage/smgr/smgr.c b/src/backend/storage/smgr/smgr.c
index 7b9fa103ef..90af425973 100644
--- a/src/backend/storage/smgr/smgr.c
+++ b/src/backend/storage/smgr/smgr.c
@@ -85,9 +85,10 @@ typedef struct f_smgr
 	void		(*smgr_extend) (SMgrRelation reln, ForkNumber forknum,
 								BlockNumber blocknum, const void *buffer, bool skipFsync);
 	void		(*smgr_zeroextend) (SMgrRelation reln, ForkNumber forknum,
-									BlockNumber blocknum, int nblocks, bool skipFsync);
+									BlockNumber blocknum, BlockNumber nblocks,
+									bool skipFsync);
 	bool		(*smgr_prefetch) (SMgrRelation reln, ForkNumber forknum,
-								  BlockNumber blocknum, int nblocks);
+								  BlockNumber blocknum, BlockNumber nblocks);
 	void		(*smgr_readv) (SMgrRelation reln, ForkNumber forknum,
 							   BlockNumber blocknum,
 							   void **buffers, BlockNumber nblocks);
@@ -558,7 +559,7 @@ smgrextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
  */
 void
 smgrzeroextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
-			   int nblocks, bool skipFsync)
+			   BlockNumber nblocks, bool skipFsync)
 {
 	smgrsw[reln->smgr_which].smgr_zeroextend(reln, forknum, blocknum,
 											 nblocks, skipFsync);
@@ -583,7 +584,7 @@ smgrzeroextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
  */
 bool
 smgrprefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
-			 int nblocks)
+			 BlockNumber nblocks)
 {
 	return smgrsw[reln->smgr_which].smgr_prefetch(reln, forknum, blocknum, nblocks);
 }
diff --git a/src/include/storage/md.h b/src/include/storage/md.h
index 620f10abde..12467669bb 100644
--- a/src/include/storage/md.h
+++ b/src/include/storage/md.h
@@ -29,9 +29,10 @@ extern void mdunlink(RelFileLocatorBackend rlocator, ForkNumber forknum, bool is
 extern void mdextend(SMgrRelation reln, ForkNumber forknum,
 					 BlockNumber blocknum, const void *buffer, bool skipFsync);
 extern void mdzeroextend(SMgrRelation reln, ForkNumber forknum,
-						 BlockNumber blocknum, int nblocks, bool skipFsync);
+						 BlockNumber blocknum, BlockNumber nblocks,
+						 bool skipFsync);
 extern bool mdprefetch(SMgrRelation reln, ForkNumber forknum,
-					   BlockNumber blocknum, int nblocks);
+					   BlockNumber blocknum, BlockNumber nblocks);
 extern void mdreadv(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
 					void **buffers, BlockNumber nblocks);
 extern void mdwritev(SMgrRelation reln, ForkNumber forknum,
diff --git a/src/include/storage/smgr.h b/src/include/storage/smgr.h
index e15b20a566..4abd3a17b6 100644
--- a/src/include/storage/smgr.h
+++ b/src/include/storage/smgr.h
@@ -89,9 +89,10 @@ extern void smgrdounlinkall(SMgrRelation *rels, int nrels, bool isRedo);
 extern void smgrextend(SMgrRelation reln, ForkNumber forknum,
 					   BlockNumber blocknum, const void *buffer, bool skipFsync);
 extern void smgrzeroextend(SMgrRelation reln, ForkNumber forknum,
-						   BlockNumber blocknum, int nblocks, bool skipFsync);
+						   BlockNumber blocknum, BlockNumber nblocks,
+						   bool skipFsync);
 extern bool smgrprefetch(SMgrRelation reln, ForkNumber forknum,
-						 BlockNumber blocknum, int nblocks);
+						 BlockNumber blocknum, BlockNumber nblocks);
 extern void smgrreadv(SMgrRelation reln, ForkNumber forknum,
 					  BlockNumber blocknum,
 					  void **buffers, BlockNumber nblocks);
-- 
2.40.1

#4Andres Freund
andres@anarazel.de
In reply to: Matthias van de Meent (#3)
Re: PG17beta2: SMGR: inconsistent type for nblocks

Hi,

On 2024-08-01 12:45:16 +0200, Matthias van de Meent wrote:

On Tue, 30 Jul 2024 at 14:32, Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Jul 30, 2024 at 11:24 PM Matthias van de Meent
<boekewurm@gmail.com> wrote:

While working on rebasing the patches of Neon's fork onto the
REL_17_STABLE branch, I noticed that the nblocks arguments of various
smgr functions have inconsistent types: smgrzeroextend accepts
`nblocks` as signed integer, as does the new signature for
smgrprefetch, but the new vectorized operations of *readv and *writev,
and the older *writeback all use an unsigned BlockNumber as indicator
for number of blocks.

Can we update the definition to be consistent across this (new, or
also older) API? As far as I can see, in none of these cases are
negative numbers allowed or expected, so updating this all to be
consistently BlockNumber across the API seems like a straigthforward
patch.

cc-ed Thomas as committer of the PG17 smgr API changes.

Yeah, right, I noticed that once myself[1]. For the cases from my
keyboard, I guess I was trying to be consistent with nearby existing
stuff in each case, which was already inconsistent... Do you have a
patch?

Here's one that covers both master and the v17 backbranch.

FWIW, I find it quite ugly to use BlockNumber to indicate the number of blocks
to be written. It's just further increasing the type confusion by conflating
"the first block to be targeted" and "number of blocks".

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 6796756358..1d02766978 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -523,11 +523,11 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
*/
void
mdzeroextend(SMgrRelation reln, ForkNumber forknum,
-			 BlockNumber blocknum, int nblocks, bool skipFsync)
+			 BlockNumber blocknum, BlockNumber nblocks, bool skipFsync)
{
MdfdVec    *v;
BlockNumber curblocknum = blocknum;
-	int			remblocks = nblocks;
+	int64		remblocks = nblocks;

Assert(nblocks > 0);

Isn't this particularly bogus? What's the point of using a 64bit remblocks
here?

Greetings,

Andres Freund

#5Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Andres Freund (#4)
Re: PG17beta2: SMGR: inconsistent type for nblocks

Hi,

On Thu, 1 Aug 2024 at 18:44, Andres Freund <andres@anarazel.de> wrote:

On 2024-08-01 12:45:16 +0200, Matthias van de Meent wrote:

Here's one that covers both master and the v17 backbranch.

FWIW, I find it quite ugly to use BlockNumber to indicate the number of blocks
to be written. It's just further increasing the type confusion by conflating
"the first block to be targeted" and "number of blocks".

IIf BlockNumber doesn't do it for you, then between plain uint32 and
int64, which would you prefer? int itself doesn't allow syncing of all
blocks of a relation's fork, so that's out for me.

diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 6796756358..1d02766978 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -523,11 +523,11 @@ mdextend(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
*/
void
mdzeroextend(SMgrRelation reln, ForkNumber forknum,
-                      BlockNumber blocknum, int nblocks, bool skipFsync)
+                      BlockNumber blocknum, BlockNumber nblocks, bool skipFsync)
{
MdfdVec    *v;
BlockNumber curblocknum = blocknum;
-     int                     remblocks = nblocks;
+     int64           remblocks = nblocks;

Assert(nblocks > 0);

Isn't this particularly bogus? What's the point of using a 64bit remblocks
here?

To prevent underflows in the loop below, if any would happen to exist.
Could've been BlockNumber too, but I went with a slightly more
defensive approach.

Kind regards,

Matthias van de Meent