Issues with blocksize smaller than 8KB

Started by Casey Shobe · 6 months ago · 9 messages · bugs
#1 Casey Shobe
casey.allen.shobe@icloud.com

I have been comparing performance of postgresql (18.0) compiled for various block sizes etc., and found that while 8kb and 16kb builds work fine, a 4kb build does not, with identical configuration. When I try to initialize pgbench, --scale=10 works fine, but the --scale=100 I was trying and even just --scale=20 result in a long delay on the vacuum analyze step followed by Postgres crashing due to a segmentation fault.

I also found that initdb fails when I compile for a blocksize of either 1KB or 2KB build:

2025-10-17 15:39:13.182 UTC [97433] DETAIL: The database cluster was initialized with RELSEG_SIZE 1895825408, but the server was compiled with RELSEG_SIZE 1895825408.
2025-10-17 15:39:13.182 UTC [97433] HINT: It looks like you need to recompile or initdb.

- Casey

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Casey Shobe (#1)
Re: Issues with blocksize smaller than 8KB

Casey Shobe <casey.allen.shobe@icloud.com> writes:

I have been comparing performance of postgresql (18.0) compiled for various block sizes etc., and found that while 8kb and 16kb builds work fine, a 4kb build does not, with identical configuration. When I try to initialize pgbench, --scale=10 works fine, but the --scale=100 I was trying and even just --scale=20 result in a long delay on the vacuum analyze step followed by Postgres crashing due to a segmentation fault.
I also found that initdb fails when I compile for a blocksize of either 1KB or 2KB build:

2025-10-17 15:39:13.182 UTC [97433] DETAIL: The database cluster was initialized with RELSEG_SIZE 1895825408, but the server was compiled with RELSEG_SIZE 1895825408.
2025-10-17 15:39:13.182 UTC [97433] HINT: It looks like you need to recompile or initdb.

I could not reproduce any of these misbehaviors here. I suspect you
have a build process problem, ie failure to clean out all compilation
products when reconfiguring. You need at least "make clean", and
personally I usually use "make distclean" or "git clean -dfxq".

(In theory, using --enable-depend would let you be sloppy about this,
but I've never particularly trusted that option. It definitely will
not work to change configure settings without that.)

regards, tom lane

#3 Casey & Gina
cg@osss.net
In reply to: Tom Lane (#2)
Re: Issues with blocksize smaller than 8KB

Hi Tom,

I was also setting the relation segment size to something high like 1000 or 10000 - I found that when I changed that back to the default, the smaller blocksizes work fine without segfaulting. So the problem is apparently with that option being set too high, or maybe an incompatibility with smaller block sizes...

#4 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: Issues with blocksize smaller than 8KB

On 2025-Oct-18, Tom Lane wrote:

2025-10-17 15:39:13.182 UTC [97433] DETAIL: The database cluster was initialized with RELSEG_SIZE 1895825408, but the server was compiled with RELSEG_SIZE 1895825408.
2025-10-17 15:39:13.182 UTC [97433] HINT: It looks like you need to recompile or initdb.

I could not reproduce any of these misbehaviors here.

Hmm, but how come two values that print identical result in failing the
test that they are not equal? The involved code is this:

    if (ControlFile->relseg_size != RELSEG_SIZE)
        ereport(FATAL,
                (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                 errmsg("database files are incompatible with server"),
        /* translator: %s is a variable name and %d is its value */
                 errdetail("The database cluster was initialized with %s %d,"
                           " but the server was compiled with %s %d.",
                           "RELSEG_SIZE", ControlFile->relseg_size,
                           "RELSEG_SIZE", RELSEG_SIZE),
                 errhint("It looks like you need to recompile or initdb.")));

where ControlFile->relseg_size is uint32 and RELSEG_SIZE is defined in
my pg_config.h as
    #define RELSEG_SIZE 131072
which would be a signed quantity ... of what size again? The value 1895825408
should fit in a signed int32 (if snugly). I wonder whether something is
going awry with that.
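
A minimal sketch of how two values could print identically and still compare unequal, assuming RELSEG_SIZE expands to a constant wider than 32 bits (10485760000, the figure that comes up later in the thread). All `demo_` names are hypothetical stand-ins, not PostgreSQL code. Storing the wide constant in a uint32 field keeps only the low 32 bits, the comparison is then carried out at 64 bits, and the explicit casts imitate what `%d` would typically show for each side on common platforms:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical oversized RELSEG_SIZE: (1024 / 1) * 10000 * 1024 */
#define DEMO_RELSEG_SIZE INT64_C(10485760000)

/* What a uint32 control-file field holds after storing the wide constant. */
uint32_t
demo_stored_value(void)
{
    return (uint32_t) DEMO_RELSEG_SIZE;     /* keeps only the low 32 bits */
}

/* Same shape as the check above: uint32 field versus the wide constant. */
int
demo_check_fails(void)
{
    uint32_t    stored = demo_stored_value();

    /* stored is widened to 64 bits, so the truncated value never matches */
    return stored != DEMO_RELSEG_SIZE;
}

/* Render both sides as a 32-bit int, which is roughly what "%d" would show. */
void
demo_report(void)
{
    uint32_t    stored = demo_stored_value();

    /* the truncated value happens to fit in a signed int, so the cast is safe */
    assert(stored <= INT32_MAX);
    printf("stored %d, compiled %d, mismatch=%d\n",
           (int) stored, (int) (uint32_t) DEMO_RELSEG_SIZE,
           demo_check_fails());
}
```

Calling `demo_report()` from a `main` prints `stored 1895825408, compiled 1895825408, mismatch=1`, which matches the number in the log message.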

What sort of platform is this on, anyway?

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#4)
Re: Issues with blocksize smaller than 8KB

Álvaro Herrera <alvherre@kurilemu.de> writes:

On 2025-Oct-18, Tom Lane wrote:

2025-10-17 15:39:13.182 UTC [97433] DETAIL: The database cluster was initialized with RELSEG_SIZE 1895825408, but the server was compiled with RELSEG_SIZE 1895825408.

Hmm, but how come two values that print identical result in failing the
test that they are not equal?

Incautious use of --with-segsize could perhaps result in RELSEG_SIZE
being too big for an int ...

regards, tom lane

#6 Jeff Janes
jeff.janes@gmail.com
In reply to: Casey Shobe (#1)
Re: Issues with blocksize smaller than 8KB

On Sat, Oct 18, 2025 at 10:02 AM Casey Shobe <casey.allen.shobe@icloud.com>
wrote:

I have been comparing performance of postgresql (18.0) compiled for
various block sizes etc., and found that while 8kb and 16kb builds work
fine, a 4kb build does not, with identical configuration. When I try to
initialize pgbench, --scale=10 works fine, but the --scale=100 I was trying
and even just --scale=20 result in a long delay on the vacuum analyze step
followed by Postgres crashing due to a segmentation fault.

I also found that initdb fails when I compile for a blocksize of either
1KB or 2KB build:

2025-10-17 15:39:13.182 UTC [97433] DETAIL: The database cluster was
initialized with RELSEG_SIZE 1895825408, but the server was compiled with
RELSEG_SIZE 1895825408.
2025-10-17 15:39:13.182 UTC [97433] HINT: It looks like you need to
recompile or initdb.

Like Tom, I can't reproduce any of these problems. If the problem exists
after cleaning the build tree and repeating, then please let us know your
hardware, OS and version, and the command lines you used to configure and
to build.

Cheers,

Jeff

#7 Casey & Gina
cg@osss.net
In reply to: Tom Lane (#5)
Re: Issues with blocksize smaller than 8KB

On Oct 19, 2025, at 2:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Incautious use of --with-segsize could perhaps result in RELSEG_SIZE
being too big for an int ...

This must have been it, though the same --with-segsize value worked for larger blocksizes.

Perhaps configure could throw an error if too large of a value were to be used, but happy for the explanation...

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Casey & Gina (#7)
Re: Issues with blocksize smaller than 8KB

Casey & Gina <cg@osss.net> writes:

On Oct 19, 2025, at 2:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Incautious use of --with-segsize could perhaps result in RELSEG_SIZE
being too big for an int ...

This must have been it, though the same --with-segsize value worked for larger blocksizes.
Perhaps configure could throw an error if too large of a value were to be used, but happy for the explanation...

I was able to reproduce this error precisely by doing

./configure ... --with-blocksize=1 --with-segsize=10000

so I guess that's what you did (and you must have ignored the boatload
of compiler warnings that ensued). configure does try to notice
an overflow, but it seems to expect that expr(1) will complain about
a larger-than-int result:

    AC_MSG_CHECKING([for segment size])
    if test $segsize_blocks -eq 0; then
      # this expression is set up to avoid unnecessary integer overflow
      # blocksize is already guaranteed to be a factor of 1024
      RELSEG_SIZE=`expr '(' 1024 / ${blocksize} ')' '*' ${segsize} '*' 1024`
      test $? -eq 0 || exit 1
      AC_MSG_RESULT([${segsize}GB])
    else
      RELSEG_SIZE=$segsize_blocks
      AC_MSG_RESULT([${RELSEG_SIZE} blocks])
    fi

At least on my Linux box, there's no complaint, you just get back
the correct value of 10485760000. So that's not great, and it's
even less great that --with-segsize-blocks isn't checked at all.
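
The arithmetic in that configure fragment can be replayed in 64-bit C to see which block-size and segment-size combinations still fit in an int. This is only a sketch: the `demo_` helpers are hypothetical, and the segment size of 10000 is the value guessed above, not one confirmed from the original configure command line.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: segment size in blocks for a block size in kB and a
 * segment size in GB, using the same formula as the configure fragment but
 * evaluated in 64-bit arithmetic so that nothing overflows.
 */
int64_t
demo_relseg_blocks(int blocksize_kb, int segsize_gb)
{
    /* configure guarantees that the block size is a factor of 1024 */
    assert(blocksize_kb > 0 && 1024 % blocksize_kb == 0);
    return (int64_t) (1024 / blocksize_kb) * segsize_gb * 1024;
}

/* Returns 1 if the block count is positive and representable as an int. */
int
demo_relseg_fits_int(int blocksize_kb, int segsize_gb)
{
    int64_t     blocks = demo_relseg_blocks(blocksize_kb, segsize_gb);

    return blocks > 0 && blocks <= INT_MAX;
}
```

With a segment size of 10000, `demo_relseg_fits_int` returns 1 for block sizes of 8 and 16 (1310720000 and 655360000 blocks) and 0 for 4, 2 and 1 (2621440000, 5242880000 and 10485760000 blocks). That is consistent with the same --with-segsize value working for the larger block sizes only.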

The natural way to deal with this would be to add a test like

    if test $RELSEG_SIZE -le 0 -o $RELSEG_SIZE -gt 2147483647; then
      ... fail ...

but this assumes a fact not in evidence, that test(1) will do
wider-than-int arithmetic sanely. (Just because expr(1) does
doesn't prove a lot about test(1), IMO.) I'm even less sure
that I want to rely on meson to do it right. So I think we'd
better leave it to the C compiler, as attached.
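
The attached patch is not reproduced in the thread, so the following is not its contents. It is only a standalone sketch of what leaving the check to the C compiler can look like, using the standard C11 `static_assert` from `<assert.h>` rather than PostgreSQL's own assertion macros. The fallback definition of RELSEG_SIZE and the `demo_relseg_size` accessor are assumptions made so that the sketch compiles on its own; 131072 is the default value quoted earlier in the thread.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in for the definition that pg_config.h would normally provide. */
#ifndef RELSEG_SIZE
#define RELSEG_SIZE 131072
#endif

/*
 * Compile-time check: the build stops here if RELSEG_SIZE is not a
 * positive value that can be represented as an int.
 */
static_assert(RELSEG_SIZE > 0 && RELSEG_SIZE <= INT_MAX,
              "RELSEG_SIZE must be positive and fit in an int");

/* Trivial accessor so that the sketch has something callable. */
int
demo_relseg_size(void)
{
    return RELSEG_SIZE;
}
```

Compiling this file with `-DRELSEG_SIZE=10485760000` should fail at the assertion instead of producing a binary, because the comparison is evaluated by the compiler at the constant's full width.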

regards, tom lane

Attachments:

assert-that-RELSEG_SIZE-is-sane.patch (text/x-diff; charset=us-ascii; +10 -0)

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#8)
Re: Issues with blocksize smaller than 8KB

I wrote:

... So I think we'd
better leave it to the C compiler, as attached.

After further study I decided that md.c was a more natural home
for this assertion than xlog.c. Pushed that way.

regards, tom lane