archive command doesnt work

Started by Alpaslan AKDAĞabout 2 years ago10 messagesgeneral
Jump to latest
#1Alpaslan AKDAĞ
alpaslanakdag@gmail.com

Hello

We have a primary, a hot standby and 2 warm standby servers with 2 days
delay.

After switchover since 01.02.2024 hot standby server does not archive wal
files. I couldn't find the problem.

Hot Standby Server infos:
Redhat Enterprise Linux 9.2
postgreSQL v13.10

related postgresql.conf lines:
archive_mode = on
archive_command = 'cp %p /archive/archived_wal/%f && cp %p
/opt/postgres/backup/archived_wal/%f'
archive_timeout = 300
wal_keep_size = 80
wal_level = 'replica'
wal_log_hints = on
max_wal_size = 10GB
min_wal_size = 80MB

archived WAL files per day in /archive/archived_wal/ folder. All of them
are before the switchover.
Number Day
421 2024-01-19
672 2024-01-20
1374 2024-01-21
564 2024-01-22
569 2024-01-23
607 2024-01-24
510 2024-01-25
612 2024-01-26
495 2024-01-27
1497 2024-01-28
556 2024-01-29
506 2024-01-30
549 2024-01-31
228 2024-02-01

and in log files there are lines like below but no error or failure lines:

2024-02-02 02:07:01.978 CET [979081] LOG: restartpoint complete: wrote
26904 buffers (2.6%); 0 WAL file(s) added, 0 removed, 0 recycled; write=445.
874 s, sync=0.005 s, total=445.887 s; sync files=88, longest=0.001 s,
average=0.001 s; distance=732035 kB, estimate=732035 kB
2024-02-02 02:07:01.978 CET [979081] LOG: recovery restart point at
252/2D7E5E68
2024-02-02 02:07:01.978 CET [979081] DETAIL: Last completed transaction
was at log time 2024-02-02 02:07:01.889865+01.
--
2024-02-02 02:07:02.024 CET [979081] LOG: restartpoint complete: wrote
310 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.037
s, sync=0.004 s, total=0.046 s; sync files=54, longest=0.002 s,
average=0.001 s; distance=90216 kB, estimate=667853 kB
2024-02-02 02:07:02.024 CET [979081] LOG: recovery restart point at
252/33000028
2024-02-02 02:07:02.024 CET [979081] DETAIL: Last completed transaction
was at log time 2024-02-02 02:07:01.889865+01.

postgresql-2024-02-03_000000.log:2024-02-03 09:03:20.383 CET [979081]
LOG: restartpoint complete: wrote 102950 buffers (9.8%); 0 WAL file(s)
added,
1286 removed, 117 recycled; write=1079.221 s, sync=0.008 s, total=1082.510
s; sync files=146, longest=0.001 s, average=0.001 s; distance=1143627 kB,
estimate=1143627 kB
postgresql-2024-02-03_000000.log-2024-02-03 09:03:20.383 CET [979081]
LOG: recovery restart point at 255/43D56D80
postgresql-2024-02-03_000000.log-2024-02-03 09:03:20.383 CET [979081]
DETAIL: Last completed transaction was at log time 2024-02-03
09:03:20.311619
+01.
--
postgresql-2024-02-03_000000.log:2024-02-03 17:47:29.480 CET [979081]
LOG: restartpoint complete: wrote 1305 buffers (0.1%); 0 WAL file(s)
added, 1
removed, 3 recycled; write=131.069 s, sync=0.005 s, total=131.088 s; sync
files=104, longest=0.001 s, average=0.001 s; distance=65368 kB, estimate=13
9744 kB
postgresql-2024-02-03_000000.log-2024-02-03 17:47:29.480 CET [979081]
LOG: recovery restart point at 255/E40CE260
postgresql-2024-02-03_000000.log-2024-02-03 17:47:29.480 CET [979081]
DETAIL: Last completed transaction was at log time 2024-02-03
17:47:28.055165
+01.

--
postgresql-2024-02-05_000000.log:2024-02-05 06:25:50.272 CET [979081]
LOG: restartpoint complete: wrote 477 buffers (0.0%); 0 WAL file(s) added,
0
removed, 4 recycled; write=47.946 s, sync=0.005 s, total=47.964 s; sync
files=85, longest=0.001 s, average=0.001 s; distance=65484 kB,
estimate=413488
kB
postgresql-2024-02-05_000000.log-2024-02-05 06:25:50.272 CET [979081]
LOG: recovery restart point at 25C/74083E58
postgresql-2024-02-05_000000.log-2024-02-05 06:25:50.272 CET [979081]
DETAIL: Last completed transaction was at log time 2024-02-05
06:25:50.223799
+01.

and last thing that i want to share

select pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn(),
pg_last_xact_replay_timestamp() \gx
-[ RECORD 1 ]-----------------+------------------------------
pg_last_wal_receive_lsn | 261/C10BB318
pg_last_wal_replay_lsn | 261/C10BB318
pg_last_xact_replay_timestamp | 2024-02-07 11:40:57.536997+01

Everything seems fine except archiving.
for now i share these informations but if you need more, i can give.

Thanks in advance.
Best Regards
Alpaslan

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alpaslan AKDAĞ (#1)
Re: archive command doesnt work

On Wed, 2024-02-07 at 12:07 +0100, Alpaslan AKDAĞ wrote:

We have a primary, a hot standby and 2 warm standby servers with 2 days delay.

After switchover since 01.02.2024 hot standby server does not archive wal files. I couldn't find the problem.

related postgresql.conf lines:
archive_mode = on
archive_command = 'cp %p /archive/archived_wal/%f && cp %p /opt/postgres/backup/archived_wal/%f'

and in log files there are lines like below but no error or failure lines:

postgresql-2024-02-05_000000.log:2024-02-05 06:25:50.272 CET [979081]   LOG:  restartpoint complete: wrote 477 buffers (0.0%); 0 WAL file(s) added, 0
removed, 4 recycled; write=47.946 s, sync=0.005 s, total=47.964 s; sync files=85, longest=0.001 s, average=0.001 s; distance=65484 kB, estimate=413488
 kB
postgresql-2024-02-05_000000.log-2024-02-05 06:25:50.272 CET [979081]   LOG:  recovery restart point at 25C/74083E58
postgresql-2024-02-05_000000.log-2024-02-05 06:25:50.272 CET [979081]   DETAIL:  Last completed transaction was at log time 2024-02-05 06:25:50.223799
+01.

That looks like the standby wasn't promoted and is still in recovery,
so it won't generate WAL.

Restartpoints are only written on servers in recovery mode. Moreover,
a promote request would generate a log entry.

Another option would be that the promote request is further up in the log,
and the server takes a long time to replay all the changes
(if the "2 days delay" you mention are set in "recovery_min_apply_delay").

What do you get for

SELECT pg_is_in_recovery();

Yours,
Laurenz Albe

#3Alpaslan AKDAĞ
alpaslanakdag@gmail.com
In reply to: Laurenz Albe (#2)
Re: archive command doesnt work

Hello Laurenz,

Thank you for your answer.

I have checked the pg_is_in_recovery() and here are the results from
primary and hot stand by server.

Primary (old standby ):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)

hot standby(old primary):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
and there is also standby.signal file in standby server.
So it seems that there is nothing wrong.

recovery_min_apply_delay = '2d' are set in warm standby servers. Before
the switchover we had the same settings but we did not have this problem.
How can I fix it?

Best regards,
Alpaslan

On Wed, Feb 7, 2024 at 9:06 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

Show quoted text

On Wed, 2024-02-07 at 12:07 +0100, Alpaslan AKDAĞ wrote:

We have a primary, a hot standby and 2 warm standby servers with 2 days

delay.

After switchover since 01.02.2024 hot standby server does not archive

wal files. I couldn't find the problem.

related postgresql.conf lines:
archive_mode = on
archive_command = 'cp %p /archive/archived_wal/%f && cp %p

/opt/postgres/backup/archived_wal/%f'

and in log files there are lines like below but no error or failure

lines:

postgresql-2024-02-05_000000.log:2024-02-05 06:25:50.272 CET [979081]

LOG: restartpoint complete: wrote 477 buffers (0.0%); 0 WAL file(s) added,
0

removed, 4 recycled; write=47.946 s, sync=0.005 s, total=47.964 s; sync

files=85, longest=0.001 s, average=0.001 s; distance=65484 kB,
estimate=413488

kB
postgresql-2024-02-05_000000.log-2024-02-05 06:25:50.272 CET [979081]

LOG: recovery restart point at 25C/74083E58

postgresql-2024-02-05_000000.log-2024-02-05 06:25:50.272 CET [979081]

DETAIL: Last completed transaction was at log time 2024-02-05
06:25:50.223799

+01.

That looks like the standby wasn't promoted and is still in recovery,
so it won't generate WAL.

Restartpoints are only written on servers in recovery mode. Moreover,
a promote request would generate a log entry.

Another option would be that the promote request is further up in the log,
and the server takes a long time to replay all the changes
(if the "2 days delay" you mention are set in "recovery_min_apply_delay").

What do you get for

SELECT pg_is_in_recovery();

Yours,
Laurenz Albe

#4Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alpaslan AKDAĞ (#3)
Re: archive command doesnt work

On Thu, 2024-02-08 at 09:23 +0100, Alpaslan AKDAĞ wrote:

Thank you for your answer.

I have checked the pg_is_in_recovery() and here are the results
from primary and hot stand by server.

Primary (old standby ):
select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

hot standby(old primary):
select pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 t
(1 row)
and there is also standby.signal file in standby server.
So it seems that there is nothing wrong.

recovery_min_apply_delay = '2d' are set in warm standby servers.
Before the switchover we had the same settings but we did not have this problem.

It is becoming increasingly obvious that you never actually performed
a switchover: it seems that you didn't promote the standby.

Either use

pg_ctl promote -D /path/to/datadir

on the command line or

SELECT pg_promote();

in SQL.

Yours,
Laurenz Albe

#5Mateusz Henicz
mateuszhenicz@gmail.com
In reply to: Laurenz Albe (#4)
Re: archive command doesnt work

czw., 8 lut 2024 o 21:10 Laurenz Albe <laurenz.albe@cybertec.at> napisał(a):

On Thu, 2024-02-08 at 09:23 +0100, Alpaslan AKDAĞ wrote:

Thank you for your answer.

I have checked the pg_is_in_recovery() and here are the results
from primary and hot stand by server.

Primary (old standby ):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)

hot standby(old primary):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
and there is also standby.signal file in standby server.
So it seems that there is nothing wrong.

recovery_min_apply_delay = '2d' are set in warm standby servers.
Before the switchover we had the same settings but we did not have this

problem.

It is becoming increasingly obvious that you never actually performed
a switchover: it seems that you didn't promote the standby.

Either use

pg_ctl promote -D /path/to/datadir

on the command line or

SELECT pg_promote();

in SQL.

Yours,
Laurenz Albe

From what you sent:

*"After switchover since 01.02.2024 hot standby server does not archive wal
files. I couldn't find the problem."*
Hot standby normally does not archive WAL files. Only the primary server
does that unless you have archive_mode set to "always" but your
configuration shows that you have it "on" only.

*"related postgresql.conf lines:*
*archive_mode = on "*

Cheers,
Mateusz

#6Alpaslan AKDAĞ
alpaslanakdag@gmail.com
In reply to: Laurenz Albe (#4)
Re: archive command doesnt work

Now I am confused. In Primary pg_is_in_recovery() should be false and in
standby node should be true. Am I wrong?

Thanks and regards
Alpaslan

On Thu, Feb 8, 2024 at 9:10 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:

Show quoted text

On Thu, 2024-02-08 at 09:23 +0100, Alpaslan AKDAĞ wrote:

Thank you for your answer.

I have checked the pg_is_in_recovery() and here are the results
from primary and hot stand by server.

Primary (old standby ):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)

hot standby(old primary):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
and there is also standby.signal file in standby server.
So it seems that there is nothing wrong.

recovery_min_apply_delay = '2d' are set in warm standby servers.
Before the switchover we had the same settings but we did not have this

problem.

It is becoming increasingly obvious that you never actually performed
a switchover: it seems that you didn't promote the standby.

Either use

pg_ctl promote -D /path/to/datadir

on the command line or

SELECT pg_promote();

in SQL.

Yours,
Laurenz Albe

#7Alpaslan AKDAĞ
alpaslanakdag@gmail.com
In reply to: Mateusz Henicz (#5)
Re: archive command doesnt work

Hello Mateusz

Thank you for your answer.

If it is like that, there is not any problem. I tried to find an
explanation that hot standby archive wal files or not but I couldn't.
Is this information from the Documentation?

Best Regards,
Alpaslan

On Thu, Feb 8, 2024 at 9:24 PM Mateusz Henicz <mateuszhenicz@gmail.com>
wrote:

Show quoted text

czw., 8 lut 2024 o 21:10 Laurenz Albe <laurenz.albe@cybertec.at>
napisał(a):

On Thu, 2024-02-08 at 09:23 +0100, Alpaslan AKDAĞ wrote:

Thank you for your answer.

I have checked the pg_is_in_recovery() and here are the results
from primary and hot stand by server.

Primary (old standby ):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)

hot standby(old primary):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
and there is also standby.signal file in standby server.
So it seems that there is nothing wrong.

recovery_min_apply_delay = '2d' are set in warm standby servers.
Before the switchover we had the same settings but we did not have this

problem.

It is becoming increasingly obvious that you never actually performed
a switchover: it seems that you didn't promote the standby.

Either use

pg_ctl promote -D /path/to/datadir

on the command line or

SELECT pg_promote();

in SQL.

Yours,
Laurenz Albe

From what you sent:

*"After switchover since 01.02.2024 hot standby server does not archive
wal files. I couldn't find the problem."*
Hot standby normally does not archive WAL files. Only the primary server
does that unless you have archive_mode set to "always" but your
configuration shows that you have it "on" only.

*"related postgresql.conf lines:*
*archive_mode = on "*

Cheers,
Mateusz

#8Mateusz Henicz
mateuszhenicz@gmail.com
In reply to: Alpaslan AKDAĞ (#7)
Re: archive command doesnt work

Hey,
Yes, it is.
https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-MODE

czw., 8 lut 2024 o 21:35 Alpaslan AKDAĞ <alpaslanakdag@gmail.com>
napisał(a):

Show quoted text

Hello Mateusz

Thank you for your answer.

If it is like that, there is not any problem. I tried to find an
explanation that hot standby archive wal files or not but I couldn't.
Is this information from the Documentation?

Best Regards,
Alpaslan

On Thu, Feb 8, 2024 at 9:24 PM Mateusz Henicz <mateuszhenicz@gmail.com>
wrote:

czw., 8 lut 2024 o 21:10 Laurenz Albe <laurenz.albe@cybertec.at>
napisał(a):

On Thu, 2024-02-08 at 09:23 +0100, Alpaslan AKDAĞ wrote:

Thank you for your answer.

I have checked the pg_is_in_recovery() and here are the results
from primary and hot stand by server.

Primary (old standby ):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)

hot standby(old primary):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
and there is also standby.signal file in standby server.
So it seems that there is nothing wrong.

recovery_min_apply_delay = '2d' are set in warm standby servers.
Before the switchover we had the same settings but we did not have

this problem.

It is becoming increasingly obvious that you never actually performed
a switchover: it seems that you didn't promote the standby.

Either use

pg_ctl promote -D /path/to/datadir

on the command line or

SELECT pg_promote();

in SQL.

Yours,
Laurenz Albe

From what you sent:

*"After switchover since 01.02.2024 hot standby server does not archive
wal files. I couldn't find the problem."*
Hot standby normally does not archive WAL files. Only the primary server
does that unless you have archive_mode set to "always" but your
configuration shows that you have it "on" only.

*"related postgresql.conf lines:*
*archive_mode = on "*

Cheers,
Mateusz

#9Alpaslan AKDAĞ
alpaslanakdag@gmail.com
In reply to: Mateusz Henicz (#8)
Re: archive command doesnt work

Thank you very much.

I got my answer and also read from the documentation. Primary and standby
both seem fine.

Best Regards
Alpaslan

On Thu, Feb 8, 2024 at 9:39 PM Mateusz Henicz <mateuszhenicz@gmail.com>
wrote:

Show quoted text

Hey,
Yes, it is.

https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-ARCHIVE-MODE

czw., 8 lut 2024 o 21:35 Alpaslan AKDAĞ <alpaslanakdag@gmail.com>
napisał(a):

Hello Mateusz

Thank you for your answer.

If it is like that, there is not any problem. I tried to find an
explanation that hot standby archive wal files or not but I couldn't.
Is this information from the Documentation?

Best Regards,
Alpaslan

On Thu, Feb 8, 2024 at 9:24 PM Mateusz Henicz <mateuszhenicz@gmail.com>
wrote:

czw., 8 lut 2024 o 21:10 Laurenz Albe <laurenz.albe@cybertec.at>
napisał(a):

On Thu, 2024-02-08 at 09:23 +0100, Alpaslan AKDAĞ wrote:

Thank you for your answer.

I have checked the pg_is_in_recovery() and here are the results
from primary and hot stand by server.

Primary (old standby ):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)

hot standby(old primary):
select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
and there is also standby.signal file in standby server.
So it seems that there is nothing wrong.

recovery_min_apply_delay = '2d' are set in warm standby servers.
Before the switchover we had the same settings but we did not have

this problem.

It is becoming increasingly obvious that you never actually performed
a switchover: it seems that you didn't promote the standby.

Either use

pg_ctl promote -D /path/to/datadir

on the command line or

SELECT pg_promote();

in SQL.

Yours,
Laurenz Albe

From what you sent:

*"After switchover since 01.02.2024 hot standby server does not archive
wal files. I couldn't find the problem."*
Hot standby normally does not archive WAL files. Only the primary server
does that unless you have archive_mode set to "always" but your
configuration shows that you have it "on" only.

*"related postgresql.conf lines:*
*archive_mode = on "*

Cheers,
Mateusz

#10Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Alpaslan AKDAĞ (#6)
Re: archive command doesnt work

On Thu, 2024-02-08 at 21:28 +0100, Alpaslan AKDAĞ wrote:

Now I am confused. In Primary pg_is_in_recovery() should be false and in
standby node should be true. Am I wrong?

Right. I guess I got confused which is your primary and which is your
standby. That's normal in the context of switchover :^/

Yours,
Laurenz Albe