WAL archive is lost
Hi all
I found a situation in which a WAL archive file is lost even though no WAL segment file is lost.
This causes archive recovery to fail. Is this behavior a bug?
example:
WAL segment files
000000010000000000000001
000000010000000000000002
000000010000000000000003
Archive files
000000010000000000000001
000000010000000000000003
Archive file 000000010000000000000002 is lost, but the WAL segment files
are continuous. Recovery from the archive (i.e. PITR) stops at the end of
000000010000000000000001.
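A gap like this can be detected mechanically by comparing consecutive segment names. The following is a minimal sketch, assuming the default 16MB segment size (256 segments per log id), a sorted list of names, and gaps only within one timeline. The helper names parse_wal_name() and find_first_gap() are hypothetical and are not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 16MB segments, so 0x100000000 / 16MB = 256 segments per log id. */
#define SEGMENTS_PER_XLOGID 256

/*
 * Parse a 24-character WAL segment name (timeline, log id, segment id;
 * 8 uppercase hex digits each) into a timeline and a linear segment number.
 * Returns 1 on success, 0 if the name is malformed.
 */
static int
parse_wal_name(const char *name, uint32_t *tli, uint64_t *segno)
{
    unsigned int t, log, seg;

    if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
        return 0;
    if (sscanf(name, "%8X%8X%8X", &t, &log, &seg) != 3)
        return 0;
    *tli = t;
    *segno = (uint64_t) log * SEGMENTS_PER_XLOGID + seg;
    return 1;
}

/*
 * Scan a sorted list of segment names and report the first missing segment
 * number on the same timeline.
 * Returns 1 if a gap was found (*missing is set), 0 if the list is
 * continuous, -1 if a name could not be parsed.
 */
static int
find_first_gap(const char **names, size_t n, uint64_t *missing)
{
    uint32_t tli = 0, prev_tli = 0;
    uint64_t segno = 0, prev_segno = 0;

    assert(names != NULL && missing != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (!parse_wal_name(names[i], &tli, &segno))
            return -1;
        if (i > 0 && tli == prev_tli && segno > prev_segno + 1)
        {
            *missing = prev_segno + 1;
            return 1;
        }
        prev_tli = tli;
        prev_segno = segno;
    }
    return 0;
}
```

Run against the archive listing above, find_first_gap() would report segment number 2 as missing, while the pg_wal listing would be reported as continuous.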
How to reproduce:
- Set up replication (primary and standby).
- Set [archive_mode = always] on the standby.
- The WAL receiver exits (e.g. because the primary goes down)
after it writes the last record of a WAL segment file but
before it notifies the archiver of the segment file (i.e. creates the .ready file).
Even if the WAL receiver restarts, the archiver is never notified of
that WAL segment file.
Regards
Ryo Matsumura
On Fri, Nov 22, 2019 at 05:31:55AM +0000, matsumura.ryo@fujitsu.com wrote:
Hi all
I found a situation in which a WAL archive file is lost even though no WAL segment file is lost.
This causes archive recovery to fail. Is this behavior a bug?
example:
WAL segment files
000000010000000000000001
000000010000000000000002
000000010000000000000003
Archive files
000000010000000000000001
000000010000000000000003
Archive file 000000010000000000000002 is lost, but the WAL segment files
are continuous. Recovery from the archive (i.e. PITR) stops at the end of
000000010000000000000001.
How to reproduce:
- Set up replication (primary and standby).
- Set [archive_mode = always] on the standby.
- The WAL receiver exits (e.g. because the primary goes down)
after it writes the last record of a WAL segment file but
before it notifies the archiver of the segment file (i.e. creates the .ready file).
Even if the WAL receiver restarts, the archiver is never notified of
that WAL segment file.
That does indeed seem like a bug. We should certainly archive all WAL
segments, irrespective of primary shutdowns/restarts/whatever. I guess
we should make sure the archiver is properly notified before the exit.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Nov 22, 2019 at 8:04 AM matsumura.ryo@fujitsu.com <
matsumura.ryo@fujitsu.com> wrote:
Hi all
I found a situation in which a WAL archive file is lost even though no WAL
segment file is lost.
This causes archive recovery to fail. Is this behavior a bug?
example:
WAL segment files
000000010000000000000001
000000010000000000000002
000000010000000000000003
Archive files
000000010000000000000001
000000010000000000000003
Archive file 000000010000000000000002 is lost, but the WAL segment files
are continuous. Recovery from the archive (i.e. PITR) stops at the end of
000000010000000000000001.
Will it not archive 000000010000000000000002 eventually, like at the
conclusion of the next restartpoint? or does it get recycled/removed
without ever being archived? Or does it just hang out forever in pg_wal?
How to reproduce:
- Set up replication (primary and standby).
- Set [archive_mode = always] on the standby.
- The WAL receiver exits (e.g. because the primary goes down)
after it writes the last record of a WAL segment file but
before it notifies the archiver of the segment file (i.e. creates the .ready
file).
Do you have a trick for reliably achieving this last step?
Cheers,
Jeff
Tomas-san and Jeff-san
I'm very sorry for my slow response.
Tomas-san wrote:
That does indeed seem like a bug. We should certainly archive all WAL
segments, irrespective of primary shutdowns/restarts/whatever.
I think so, too.
Tomas-san wrote:
I guess we should make sure the archiver is properly notified before
the exit.
Just an idea.
If walrcv_receive() (libpqrcv_receive()) returned an error value when a
socket error occurs, instead of reporting ERROR, walreceiver could take
the end-of-WAL path that calls XLogArchiveNotify() at the end of
walreceiver's outer loop.
        XLogArchiveNotify(xlogfname);
    }
    recvFile = -1;

    elog(DEBUG1, "walreceiver ended streaming and awaits new instructions");
    Wal
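The idea can be illustrated with a toy model. This is only a sketch under stated assumptions, not walreceiver code: ToyEvent, ToyStep, ToyState and toy_stream() are hypothetical names, the notified[] array stands in for XLogArchiveNotify(), and the model ignores whether the open segment is complete. With error_as_return_value set to false the socket error behaves like ERROR and the cleanup at the end of the loop is skipped. With true, the loop is left normally and the end-of-streaming path runs.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_SEG 8

typedef enum ToyEvent
{
    TOY_DATA,           /* a record belonging to segment `seg` arrived */
    TOY_SOCKET_ERROR    /* the connection to the primary was lost */
} ToyEvent;

typedef struct ToyStep
{
    ToyEvent event;
    int      seg;       /* only meaningful for TOY_DATA */
} ToyStep;

typedef struct ToyState
{
    int  open_seg;                  /* -1 when no segment file is open */
    bool notified[TOY_MAX_SEG];     /* archiver was told about segment i */
} ToyState;

/* Replay a sequence of events through a simplified receive loop. */
static void
toy_stream(ToyState *st, const ToyStep *steps, int nsteps,
           bool error_as_return_value)
{
    st->open_seg = -1;

    for (int i = 0; i < nsteps; i++)
    {
        if (steps[i].event == TOY_SOCKET_ERROR)
        {
            if (!error_as_return_value)
                return;     /* models ERROR: the cleanup below is skipped */
            break;          /* models the idea: fall through to the cleanup */
        }

        assert(steps[i].seg >= 0 && steps[i].seg < TOY_MAX_SEG);
        if (st->open_seg != steps[i].seg)
        {
            /* Switching segments: notify the one that was open, if any. */
            if (st->open_seg >= 0)
                st->notified[st->open_seg] = true;
            st->open_seg = steps[i].seg;
        }
    }

    /* End-of-streaming path: notify the segment that is still open. */
    if (st->open_seg >= 0)
        st->notified[st->open_seg] = true;
    st->open_seg = -1;
}
```

For the event sequence "data for segment 1, data for segment 2, socket error", segment 2 is notified only when error_as_return_value is true.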
Jeff-san wrote:
Will it not archive 000000010000000000000002 eventually, like at the
conclusion of the next restartpoint? or does it get recycled/removed
without ever being archived? Or does it just hang out forever in pg_wal?
000000010000000000000002 hangs out in pg_wal forever.
It will never be archived, recycled, or removed.
I found that even if archive_mode is not set to 'always',
it will never be recycled or removed.
Jeff-san wrote:
Do you have a trick for reliably achieving this last step?
If possible, stop walsender just after it sends the last record of a
WAL segment file or SWITCH_LOG, and then stop the primary immediately.
There are two patterns that cause this issue.
Pattern 1.
If the primary is shut down immediately after walreceiver has received the
last record of a WAL segment file and is waiting for the next record in
walrcv_receive(), walreceiver exits without calling XLogArchiveNotify() or
XLogArchiveForceDone() in XLogWalRcvWrite(), because walrcv_receive() reports ERROR.
The startup process then restarts walreceiver and requests streaming from
the start of the next segment file. Walreceiver receives that data and
writes it with XLogWalRcvWrite(), but it does not take the path to
XLogArchiveNotify() because it has not opened any file (recvFile == -1).
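Pattern 1 can be replayed with a toy model. This is a minimal sketch based only on the description above, not the actual walreceiver code: ToyReceiver and the toy_*() functions are hypothetical, recv_seg plays the role of recvFile/recvSegNo, and the ready[] array stands in for the .ready file created by XLogArchiveNotify().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_SEG 8

typedef struct ToyReceiver
{
    int  recv_seg;              /* open segment, or -1 (like recvFile == -1) */
    bool ready[TOY_MAX_SEG];    /* true once the archiver was notified */
} ToyReceiver;

/* A (re)started receiver has no segment file open. */
static void
toy_start(ToyReceiver *r)
{
    r->recv_seg = -1;
}

/*
 * Write data belonging to segment `seg`. On a segment switch, the previously
 * open segment is notified, but only if one is actually open.
 */
static void
toy_write(ToyReceiver *r, int seg)
{
    assert(seg >= 0 && seg < TOY_MAX_SEG);
    if (r->recv_seg != seg)
    {
        if (r->recv_seg >= 0)
            r->ready[r->recv_seg] = true;
        r->recv_seg = seg;
    }
}

/* The receiver dies with ERROR: the open segment is dropped silently. */
static void
toy_exit_with_error(ToyReceiver *r)
{
    r->recv_seg = -1;
}

/* Replay Pattern 1 for segments 1..4. */
static void
toy_pattern1(ToyReceiver *r)
{
    toy_start(r);
    toy_write(r, 1);
    toy_write(r, 2);            /* switch: segment 1 is notified */
    toy_write(r, 2);            /* last record of segment 2 is written */
    toy_exit_with_error(r);     /* primary goes down while waiting */
    toy_start(r);               /* startup process restarts the receiver */
    toy_write(r, 3);            /* nothing open, so segment 2 is skipped */
    toy_write(r, 4);            /* switch: segment 3 is notified */
}
```

After toy_pattern1(), segments 1 and 3 are marked ready but segment 2 is not, which matches the archive listing in the original report.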
Pattern 2.
Only the trigger is different.
If the primary is shut down immediately after walreceiver has received
SWITCH_LOG and is waiting for the next record in walrcv_receive(),
walreceiver exits without notifying the archiver.
The startup process will then tell walreceiver to start receiving from
the start of the next segment file.
Regards
Ryo Matsumura