pg_restore causing ENOSPACE on the WAL partition. Fundamental issue?

Started by Dimitrios Apostolou, 7 months ago, 4 messages

Hello list,

I have previously raised an issue on pgsql-general, where PostgreSQL keeps
writing to the WAL directory without any bound, even when the other writer
processes can't keep up. It ends up with the WAL partition becoming full and
a bad crash. Read more in the thread at:

/messages/by-id/076464ad-3d70-dd25-9e8f-e84f27decfba@gmx.net

If this is true, I consider it a critical and under-documented issue, which
is why I'm bringing it up here. Otherwise, please let me know which of my
assumptions are wrong.

In short, this is how it can easily happen, though in principle the problem
is more general:

+ The WAL is on a dedicated high-perf drive

+ pg_restore is writing to a slow tablespace on a network drive

+ data is logged to the WAL faster than it can be written to that
tablespace

+ the WAL partition eventually fills up, regardless of max_wal_size... (a
rough sketch of such a setup follows below)
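
As a purely illustrative sketch of such a setup (the paths, tablespace name
and job count below are made up, not taken from my actual system):

    # WAL on a dedicated fast partition, set at initdb time (path hypothetical)
    initdb -D /var/lib/postgresql/data --waldir=/mnt/fast_wal/pg_wal

    # tablespace on a slow network mount (name and path hypothetical)
    psql -d mydb -c "CREATE TABLESPACE slow_ts LOCATION '/mnt/nfs/pg_slow'"

    # parallel restore whose table data lands on the slow tablespace
    PGOPTIONS='-c default_tablespace=slow_ts' \
        pg_restore --no-tablespaces -j 4 -d mydb dump.pgdump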

In other words, I'm surprised that there is no mechanism for the backends
to block while the WAL is overflowing. Am I wrong here?

Thanks in advance,
Dimitris

#2 David G. Johnston
david.g.johnston@gmail.com
In reply to: Dimitrios Apostolou (#1)
Re: pg_restore causing ENOSPACE on the WAL partition. Fundamental issue?

On Tuesday, June 10, 2025, Dimitrios Apostolou <jimis@gmx.net> wrote:

In other words, I'm surprised that there is no mechanism for the backends
to block while the WAL is overflowing. Am I wrong here?

Probably not. max_wal_size is documented as being a soft limit that can be
exceeded for this very reason. If the performance of your data disk is so
much worse than that of your WAL disk that you cannot checkpoint fast enough
to keep the WAL from overflowing its disk during that period, then you do
indeed have an issue that PostgreSQL doesn't provide protections for - it's
being optimistic and helpful to the point of letting you hang yourself with
the rope it's given you.

David J.

#3 Dimitrios Apostolou
jimis@gmx.net
In reply to: David G. Johnston (#2)
Re: pg_restore causing ENOSPACE on the WAL partition. Fundamental issue?

On Tue, 10 Jun 2025, David G. Johnston wrote:

Probably not. max_wal_size is documented as being a soft limit that can be
exceeded for this very reason. If the performance of your data disk is so
much worse than that of your WAL disk that you cannot checkpoint fast enough
to keep the WAL from overflowing its disk during that period, then you do
indeed have an issue that PostgreSQL doesn't provide protections for - it's
being optimistic and helpful to the point of letting you hang yourself with
the rope it's given you.

I'm not talking about crossing the max_wal_size boundary, but about
overflowing a dedicated partition sized at 4x max_wal_size. I believe I
followed standard guidelines by putting the WAL on a well-performing
dedicated volume.

And the unrecoverable mess I got when the WAL partition overflowed was
ugly, to say the least. Even after adding space, I had to use pg_controldata
and pg_archivecleanup to get the database to start again. Postgres should
handle this better if it can happen so easily.
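
From memory, the recovery was roughly along these lines (the WAL segment
name below is just a placeholder, not the real one):

    # find the oldest segment still needed by the latest checkpoint
    pg_controldata "$PGDATA" | grep "REDO WAL file"

    # then delete everything older than that segment from the WAL directory
    pg_archivecleanup "$PGDATA/pg_wal" 000000010000000000000010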

Anyway, one of the things I tried while the WAL was overflowing was sending
SIGSTOP to the walwriter. IIRC I saw the COPY backends actually blocking,
waiting on some lock, which means that the mechanism already exists. I don't
recall whether the checkpointer continued or also blocked, but when I sent
SIGCONT to the walwriter the I/O pressure was actually lower.
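
For reference, the experiment was along these lines (the pgrep pattern is
approximate; the walwriter's process title varies across versions and
platforms):

    # pause the walwriter (its process title is usually "postgres: walwriter")
    kill -STOP "$(pgrep -f 'postgres: walwriter')"

    # ...observe the COPY backends blocking, then let it continue
    kill -CONT "$(pgrep -f 'postgres: walwriter')"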

Why can't walwriter just refuse to write anything above a certain limit,
thus freezing all other writers until the checkpointer frees a chunk of
WAL?

Thanks,
Dimitris

#4 David G. Johnston
david.g.johnston@gmail.com
In reply to: Dimitrios Apostolou (#3)
Re: pg_restore causing ENOSPACE on the WAL partition. Fundamental issue?

On Thursday, June 12, 2025, Dimitrios Apostolou <jimis@gmx.net> wrote:

Why can't walwriter just refuse to write anything above a certain limit,
thus freezing all other writers until the checkpointer frees a chunk of WAL?

There may be performance reasons discouraging this, but more likely it’s
just a lack of awareness, or maybe an insufficient benefit/cost ratio to
volunteer to address this corner-case scenario.

David J.