wal segment size

Started by Colin 't Hart4 months ago12 messagesgeneral
Jump to latest
#1Colin 't Hart
colinthart@gmail.com

Hi,

I see very little advice on tuning WAL segment size.

One of my clients has a few datawarehouses at around 8 - 16 TB

On one of the nodes there are approx 15000 WAL segments of 16MB each,
totalling approx 230GB. The archiver is archiving approx one per second, so
approx 4 hours to clear.

Would we gain anything by bumping the WAL segment size?

Thanks,

Colin

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Colin 't Hart (#1)
Re: wal segment size

On Wed, 2025-12-17 at 16:13 +0100, Colin 't Hart wrote:

I see very little advice on tuning WAL segment size.

One of my clients has a few datawarehouses at around 8 - 16 TB

On one of the nodes there are approx 15000 WAL segments of 16MB each, totalling
approx 230GB. The archiver is archiving approx one per second, so approx 4 hours to clear.

Would we gain anything by bumping the WAL segment size?

Very likely yes, if the problem is the overhead of starting the archive_command.

Another thing that can slow down archiving is if you compress these segments
too aggressively.

Yours,
Laurenz Albe

#3Ron
ronljohnsonjr@gmail.com
In reply to: Colin 't Hart (#1)
Re: wal segment size

On Wed, Dec 17, 2025 at 10:13 AM Colin 't Hart <colinthart@gmail.com> wrote:

Hi,

I see very little advice on tuning WAL segment size.

One of my clients has a few datawarehouses at around 8 - 16 TB

On one of the nodes there are approx 15000 WAL segments of 16MB each,
totalling approx 230GB. The archiver is archiving approx one per second, so
approx 4 hours to clear.

One second to archive 16 measly MB seems really darned slow. Another vote
to *that* problem.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#4Colin 't Hart
colinthart@gmail.com
In reply to: Laurenz Albe (#2)
Re: wal segment size

Thanks Laurenz, that confirms what I was assuming. Archiving is via
pgbackrest to a backup server, over SSH. Approx 750ms to archive each
segment is crazy -- I'll check compression parameters too.

Any reason not to bump it up to 1GB? Or is that overkill?

/Colin

On Wed, 17 Dec 2025 at 16:25, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Show quoted text

On Wed, 2025-12-17 at 16:13 +0100, Colin 't Hart wrote:

I see very little advice on tuning WAL segment size.

One of my clients has a few datawarehouses at around 8 - 16 TB

On one of the nodes there are approx 15000 WAL segments of 16MB each,

totalling

approx 230GB. The archiver is archiving approx one per second, so approx

4 hours to clear.

Would we gain anything by bumping the WAL segment size?

Very likely yes, if the problem is the overhead of starting the
archive_command.

Another thing that can slow down archiving is if you compress these
segments
too aggressively.

Yours,
Laurenz Albe

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Colin 't Hart (#4)
Re: wal segment size

On 12/17/25 08:10, Colin 't Hart wrote:

Thanks Laurenz, that confirms what I was assuming. Archiving is via
pgbackrest to a backup server, over SSH. Approx 750ms to archive each
segment is crazy -- I'll check compression parameters too.

How much of that time is network travel?

What are the configuration settings for the archiving portion of pgBackRest?

Any reason not to bump it up to 1GB? Or is that overkill?

/Colin

On Wed, 17 Dec 2025 at 16:25, Laurenz Albe <laurenz.albe@cybertec.at
<mailto:laurenz.albe@cybertec.at>> wrote:

On Wed, 2025-12-17 at 16:13 +0100, Colin 't Hart wrote:

I see very little advice on tuning WAL segment size.

One of my clients has a few datawarehouses at around 8 - 16 TB

On one of the nodes there are approx 15000 WAL segments of 16MB

each, totalling

approx 230GB. The archiver is archiving approx one per second, so

approx 4 hours to clear.

Would we gain anything by bumping the WAL segment size?

Very likely yes, if the problem is the overhead of starting the
archive_command.

Another thing that can slow down archiving is if you compress these
segments
too aggressively.

Yours,
Laurenz Albe

--
Adrian Klaver
adrian.klaver@aklaver.com

#6Greg Sabino Mullane
greg@turnstep.com
In reply to: Colin 't Hart (#4)
Re: wal segment size

On Wed, Dec 17, 2025 at 11:10 AM Colin 't Hart <colinthart@gmail.com> wrote:

Thanks Laurenz, that confirms what I was assuming. Archiving is via
pgbackrest to a backup server, over SSH. Approx 750ms to archive each
segment is crazy -- I'll check compression parameters too.

Switch to archive-async = on. When doing that, the typical time drops to
10ms or less. Also use a compress-type of lz4 or zst, which perform way
better than the default gz. If you are encrypting, that's a bottleneck you
just have to deal with, no shortcuts there. :)

tl;dr try other things before messing with the WAL size. The current size
can work very well even on very large and very, very busy systems.

Cheers,
Greg

--
Crunchy Data - https://www.crunchydata.com
Enterprise Postgres Software Products & Tech Support

#7Ron
ronljohnsonjr@gmail.com
In reply to: Colin 't Hart (#4)
Re: wal segment size

Adding this to ~/.ssh/config will almost totally eliminate the cost of ssh
authentication:

Host *
ControlMaster auto
#ControlPath /run/user/%i/%L_%r_at_%n:%p
ControlPath ~/.ssh/%L_%r_at_%n:%p
ControlPersist 5m

Pointing ControlPath to /run/user/%i is even faster, but it doesn't always
exist if you sudo into the postgres service account.

On Wed, Dec 17, 2025 at 11:10 AM Colin 't Hart <colinthart@gmail.com> wrote:

Thanks Laurenz, that confirms what I was assuming. Archiving is via
pgbackrest to a backup server, over SSH. Approx 750ms to archive each
segment is crazy -- I'll check compression parameters too.

Any reason not to bump it up to 1GB? Or is that overkill?

/Colin

On Wed, 17 Dec 2025 at 16:25, Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

On Wed, 2025-12-17 at 16:13 +0100, Colin 't Hart wrote:

I see very little advice on tuning WAL segment size.

One of my clients has a few datawarehouses at around 8 - 16 TB

On one of the nodes there are approx 15000 WAL segments of 16MB each,

totalling

approx 230GB. The archiver is archiving approx one per second, so

approx 4 hours to clear.

Would we gain anything by bumping the WAL segment size?

Very likely yes, if the problem is the overhead of starting the
archive_command.

Another thing that can slow down archiving is if you compress these
segments
too aggressively.

Yours,
Laurenz Albe

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!

#8Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Greg Sabino Mullane (#6)
Re: wal segment size

On Wed, 2025-12-17 at 12:21 -0500, Greg Sabino Mullane wrote:

On Wed, Dec 17, 2025 at 11:10 AM Colin 't Hart <colinthart@gmail.com> wrote:

Thanks Laurenz, that confirms what I was assuming. Archiving is via pgbackrest
to a backup server, over SSH. Approx 750ms to archive each segment is crazy --
I'll check compression parameters too.

Switch to archive-async = on. When doing that, the typical time drops to 10ms or less.
Also use a compress-type of lz4 or zst, which perform way better than the default gz.
If you are encrypting, that's a bottleneck you just have to deal with, no shortcuts there. :)

I second that. Asynchronous archiving in pgBackRest tends to work around the problem.

tl;dr try other things before messing with the WAL size. The current size can work very
well even on very large and very, very busy systems.

On the other hand, 16MB on a very busy system is somewhat ridiculous.
A somewhat bigger segment size may be appropriate.

Yours,
Laurenz Albe

#9Andrew
adhenry.9@gmail.com
In reply to: Laurenz Albe (#8)
Re: wal segment size

As an oracle dba new to Postgres, I’m used to the concept of context switches and latch issues with regards to transaction log switches. Does Postgres have a similar mechanism with latching etc when it switches to a new wal segment that is alleviated when increasing the size of the wal segments?

Regards
Andrew
Sent from my iPhone

Show quoted text

On 17 Dec 2025, at 18:58, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Wed, 2025-12-17 at 12:21 -0500, Greg Sabino Mullane wrote:

On Wed, Dec 17, 2025 at 11:10 AM Colin 't Hart <colinthart@gmail.com> wrote:
Thanks Laurenz, that confirms what I was assuming. Archiving is via pgbackrest
to a backup server, over SSH. Approx 750ms to archive each segment is crazy --
I'll check compression parameters too.

Switch to archive-async = on. When doing that, the typical time drops to 10ms or less.
Also use a compress-type of lz4 or zst, which perform way better than the default gz.
If you are encrypting, that's a bottleneck you just have to deal with, no shortcuts there. :)

I second that. Asynchronous archiving in pgBackRest tends to work around the problem.

tl;dr try other things before messing with the WAL size. The current size can work very
well even on very large and very, very busy systems.

On the other hand, 16MB on a very busy system is somewhat ridiculous.
A somewhat bigger segment size may be appropriate.

Yours,
Laurenz Albe

#10Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Andrew (#9)
Re: wal segment size

On Fri, 2025-12-19 at 09:26 +0100, Andrew wrote:

As an oracle dba new to Postgres, I’m used to the concept of context switches and latch issues
with regards to transaction log switches. Does Postgres have a similar mechanism with latching
etc when it switches to a new wal segment that is alleviated when increasing the size of the
wal segments?

Not really. PostgreSQL doesn't reuse WAL segments in a circular fashion like Oracle does.
At the end of a checkpoint, it creates new, empty WAL segments for future use, so if there
is a need to switch to a new segment, there is no need to wait for anything.

Yours,
Laurenz Albe

#11Colin 't Hart
colinthart@gmail.com
In reply to: Laurenz Albe (#10)
Re: wal segment size

What's the behaviour when pg_resetwal is used to change the WAL segment
size?

This note is worrying to me:
--
While pg_resetwal will set the WAL starting address beyond the latest
existing WAL segment file, some segment size changes can cause previous WAL
file names to be reused. It is recommended to use -l together with this
option to manually set the WAL starting address if WAL file name overlap
will cause problems with your archiving strategy.
--
Why can a segment size change cause previous WAL file names to be reused?

Do we need to take a new backup immediately after changing the WAL segment
size?

Thanks,

Colin

On Fri, 19 Dec 2025 at 14:09, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

Show quoted text

On Fri, 2025-12-19 at 09:26 +0100, Andrew wrote:

As an oracle dba new to Postgres, I’m used to the concept of context

switches and latch issues

with regards to transaction log switches. Does Postgres have a similar

mechanism with latching

etc when it switches to a new wal segment that is alleviated when

increasing the size of the

wal segments?

Not really. PostgreSQL doesn't reuse WAL segments in a circular fashion
like Oracle does.
At the end of a checkpoint, it creates new, empty WAL segments for future
use, so if there
is a need to switch to a new segment, there is no need to wait for
anything.

Yours,
Laurenz Albe

#12Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Colin 't Hart (#11)
Re: wal segment size

On Fri, 2025-12-19 at 14:25 +0100, Colin 't Hart wrote:

What's the behaviour when pg_resetwal is used to change the WAL segment size?

This note is worrying to me:
--
While pg_resetwal will set the WAL starting address beyond the latest existing WAL segment file, some segment size changes can cause previous WAL file names to be reused. It is recommended to use -l together with this option to manually set the WAL starting address if WAL file name overlap will cause problems with your archiving strategy.
--
Why can a segment size change cause previous WAL file names to be reused?

Do we need to take a new backup immediately after changing the WAL segment size?

I think that is supposed to mean that the new WAL numbering scheme might produce
the same WAL segment name as a WAL segment name had long ago, so you might overwrite
that earlier segment in the WAL archive, which could prevent you from recovering from
an old backup that needs the overwritten WAl segment to recover.

I'm not sure how likely that is to happen.

It never harms to run an extra backup.

The main thing is that you shut down PostgreSQL cleanly before running "pg_resetwal"
and that you only change the WAL segment size, nothing else.

Yours,
Laurenz Albe