pgBackRest point-in-time recovery: server unable to start back up

Started by KK CHN · over 1 year ago · 4 messages · general
#1 KK CHN
kkchn.in@gmail.com

List,

Reference: https://pgbackrest.org/user-guide-rhel.html#pitr
I am following the PITR section on RHEL 9 with EPAS 16.
I can take backups (full, differential and incremental), and restoring
from a full backup and restarting EPAS 16 works fine.

But when I take an incremental backup after following the procedure in
the PITR section of the reference above, restore the EDB database from
that incremental backup, and then start EPAS 16, the server always ends
up in a dead state.

My repo server is a separate machine. If I do a full restore on the DB
server (sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
restore), it works and the server starts without any issue.
Restoring from the incremental backup reports success on the tty, but
starting the edb service fails.

Any help is much appreciated.

Krishane.

STEPS followed:

After dropping the table (the "pg-primary ⇒ Drop the important table"
section of the guide), I stopped the EDB server.

$ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
--set=20240719-122703F_20240724-094727I --target-timeline=current
--type=time "--target=2024-07-24 09:44:01.3255+05:30"
--target-action=promote restore
.....................................................

2024-07-26 09:48:06.343 P00 INFO: restore command end: completed
successfully (1035ms)

But

[root@rservice01 ~]# sudo systemctl start edb-as-16.service
[root@service01 ~]# sudo systemctl status edb-as-16.service
× edb-as-16.service - EDB Postgres Advanced Server 16
Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
preset: disabled)
Active: failed (Result: exit-code) since Fri 2024-07-26 09:48:50
IST; 8s ago
Duration: 242ms
Process: 41903 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
${PGDATA} (code=exited, status=0/SUCCESS)
Process: 41908 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
(code=exited, status=1/FAILURE)
Main PID: 41908 (code=exited, status=1/FAILURE)
CPU: 331ms

Jul 26 09:48:48 service01 systemd[1]: Starting EDB Postgres Advanced Server
16...
Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST LOG:
redirecting log output to logging collector process
Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
HINT: Future log output will appear in directory "log".
Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
16.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
exited, code=exited, status=1/FAILURE
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Killing process
41909 (edb-postgres) with signal SIGKILL.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Failed with result
'exit-code'.
[root@service01 ~]#

Why is it unable to perform restore and recovery from an incremental backup?
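One way to narrow this down is to list what actually reached the repository. This is only a sketch: the `archive/Demo_Repo/16-1/...` path assumes the default repo layout (the archive id `16-1` is taken from pgBackRest's own log lines), so verify and adjust it for your setup.

```shell
# On the repo server: confirm the archive path first, then list the WAL
# segments archived for timeline 9 (path components are assumptions based
# on the default pgBackRest repo layout).
sudo -u postgres pgbackrest --stanza=Demo_Repo repo-ls archive/Demo_Repo
sudo -u postgres pgbackrest --stanza=Demo_Repo repo-ls \
    archive/Demo_Repo/16-1/0000000900000000
```

If the segment recovery asks for is not in that listing, the problem is on the archiving side rather than in the restore itself.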

On The Repo Server
[root@service02 ~]# sudo -u postgres pgbackrest --stanza=Demo_Repo info
stanza: Demo_Repo
status: ok
cipher: aes-256-cbc

db (current)
wal archive min/max (16):
000000020000000000000021/0000000B0000000000000041

full backup: 20240719-122703F
timestamp start/stop: 2024-07-19 12:27:03+05:30 / 2024-07-19
12:27:06+05:30
wal start/stop: 00000002000000000000002A /
00000002000000000000002A
database size: 61.7MB, database backup size: 61.7MB
repo1: backup size: 9.6MB

incr backup: 20240719-122703F_20240719-123353I
timestamp start/stop: 2024-07-19 12:33:53+05:30 / 2024-07-19
12:33:56+05:30
wal start/stop: 00000002000000000000002C /
00000002000000000000002C
database size: 61.7MB, database backup size: 6.4MB
repo1: backup size: 6.2KB
backup reference list: 20240719-122703F

diff backup: 20240719-122703F_20240719-123408D
timestamp start/stop: 2024-07-19 12:34:08+05:30 / 2024-07-19
12:34:10+05:30
wal start/stop: 00000002000000000000002E /
00000002000000000000002E
database size: 61.7MB, database backup size: 6.4MB
repo1: backup size: 6.4KB
backup reference list: 20240719-122703F

incr backup: 20240719-122703F_20240723-110212I
timestamp start/stop: 2024-07-23 11:02:12+05:30 / 2024-07-23
11:02:15+05:30
wal start/stop: 000000070000000000000038 /
000000070000000000000038
database size: 48MB, database backup size: 6.4MB
repo1: backup size: 9.8KB
backup reference list: 20240719-122703F,
20240719-122703F_20240719-123408D

incr backup: 20240719-122703F_20240723-141818I
timestamp start/stop: 2024-07-23 14:18:18+05:30 / 2024-07-23
14:18:22+05:30
wal start/stop: 00000008000000000000003C /
00000008000000000000003C
database size: 75.4MB, database backup size: 33.8MB
repo1: backup size: 4.7MB
backup reference list: 20240719-122703F,
20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I

incr backup: 20240719-122703F_20240724-094727I
timestamp start/stop: 2024-07-24 09:47:27+05:30 / 2024-07-24
09:47:30+05:30
wal start/stop: 00000009000000000000003D /
00000009000000000000003D
database size: 62MB, database backup size: 11.5MB
repo1: backup size: 72KB
backup reference list: 20240719-122703F,
20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I,
20240719-122703F_20240723-141818I

#2 Muhammad Ikram
mmikram@gmail.com
In reply to: KK CHN (#1)
Re: pgBackRest point-in-time recovery: server unable to start back up

Hi KK CHN

Could you check the server logs?
Your service trace suggests that the server started and then the failure happened:

Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
16.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
exited, code=exited, status=1/FAILURE

Regards,
Ikram


--
Muhammad Ikram

#3 KK CHN
kkchn.in@gmail.com
In reply to: Muhammad Ikram (#2)
Re: pgBackRest point-in-time recovery: server unable to start back up

Here are the log outputs.

When I ran

$sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
--set=20240719-122703F_20240724-094727I --target-timeline=current
--type=time "--target=2024-07-24 09:44:01.3255+05:30"
--target-action=promote restore

The EDB server log output shows:
...
2024-07-23 15:34:12 IST ERROR: syntax error at or near "selecct" at
character 1
2024-07-23 15:34:12 IST STATEMENT: selecct current_timestamp
2024-07-23 15:34:26 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:38:20 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:42:17 IST FATAL: database "test1" does not exist
2024-07-24 09:43:56 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:07 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:16 IST ERROR: relation "important_table" does not exist
at character 58
2024-07-24 09:45:16 IST STATEMENT: begin; drop table important_table;
commit; select * from important_table;
2024-07-24 09:46:00 IST LOG: checkpoint starting: time
2024-07-24 09:46:02 IST LOG: checkpoint complete: wrote 16 buffers (0.1%);
0 WAL file(s) added, 0 removed, 0 recycled; write=1.506 s, sync=0.001 s,
total=1.509 s; sync files=9, longest=0.001 s, average=0.001 s; distance=93
kB, estimate=171 kB; lsn=0/3C045768, redo lsn=0/3C045730
2024-07-24 09:47:27 IST LOG: checkpoint starting: immediate force wait
2024-07-24 09:47:27 IST LOG: checkpoint complete: wrote 2 buffers (0.0%);
0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s,
total=0.003 s; sync files=1, longest=0.001 s, average=0.001 s;
distance=16106 kB, estimate=16106 kB; lsn=0/3D000060, redo lsn=0/3D000028
2024-07-24 09:47:27.788 P00 INFO: archive-push command begin 2.52.1:
[pg_wal/00000009000000000000003C] --exec-id=34036-c1d6de0b
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-24 09:47:28.130 P00 INFO: pushed WAL file
'00000009000000000000003C' to the archive
2024-07-24 09:47:28.230 P00 INFO: archive-push command end: completed
successfully (444ms)

When I ran $ sudo systemctl start edb-as-16.service

2024-07-26 11:32:56 IST LOG: starting PostgreSQL 16.3 (EnterpriseDB Advanced
Server 16.3.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1
20231218 (Red Hat 11.4.1-3), 64-bit
2024-07-26 11:32:56 IST LOG: listening on IPv4 address "0.0.0.0", port 5444
2024-07-26 11:32:56 IST LOG: listening on IPv6 address "::", port 5444
2024-07-26 11:32:56 IST LOG: listening on Unix socket "/tmp/.s.PGSQL.5444"
2024-07-26 11:32:56 IST LOG: database system was interrupted; last known
up at 2024-07-24 09:47:27 IST
2024-07-26 11:32:56 IST LOG: starting point-in-time recovery to 2024-07-24
09:44:01.3255+05:30
2024-07-26 11:32:56 IST LOG: starting backup recovery with redo LSN
0/3D000028, checkpoint LSN 0/3D000060, on timeline ID 9
2024-07-26 11:32:56.475 P00 INFO: archive-get command begin 2.52.1:
[00000009.history, pg_wal/RECOVERYHISTORY] --exec-id=43293-1d583a8e
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.709 P00 INFO: found 00000009.history in the repo1:
16-1 archive
2024-07-26 11:32:56.809 P00 INFO: archive-get command end: completed
successfully (335ms)
2024-07-26 11:32:56 IST LOG: restored log file "00000009.history" from
archive
2024-07-26 11:32:56.816 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003D, pg_wal/RECOVERYXLOG] --exec-id=43295-55c5d6f7
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.999 P00 INFO: found 00000009000000000000003D in the
repo1: 16-1 archive
2024-07-26 11:32:57.099 P00 INFO: archive-get command end: completed
successfully (285ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003D"
from archive
2024-07-26 11:32:57 IST LOG: redo starts at 0/3D000028
2024-07-26 11:32:57.114 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003E, pg_wal/RECOVERYXLOG] --exec-id=43297-34867c05
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.282 P00 INFO: found 00000009000000000000003E in the
repo1: 16-1 archive
2024-07-26 11:32:57.382 P00 INFO: archive-get command end: completed
successfully (270ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003E"
from archive
2024-07-26 11:32:57.400 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43299-e2db2e1b
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.521 P00 INFO: unable to find 00000009000000000000003F
in the archive
2024-07-26 11:32:57.621 P00 INFO: archive-get command end: completed
successfully (222ms)
2024-07-26 11:32:57 IST LOG: completed backup recovery with redo LSN
0/3D000028 and end LSN 0/3D000100
2024-07-26 11:32:57 IST LOG: consistent recovery state reached at
0/3D000100
2024-07-26 11:32:57 IST LOG: database system is ready to accept read-only
connections
2024-07-26 11:32:57.632 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43301-f613dae9
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.761 P00 INFO: unable to find 00000009000000000000003F
in the archive
2024-07-26 11:32:57.861 P00 INFO: archive-get command end: completed
successfully (231ms)
2024-07-26 11:32:57 IST LOG: redo done at 0/3E000060 system usage: CPU:
user: 0.00 s, system: 0.00 s, elapsed: 0.75 s
2024-07-26 11:32:57 IST FATAL: recovery ended before configured recovery
target was reached
2024-07-26 11:32:57 IST LOG: startup process (PID 43292) exited with exit
code 1

The only inference I can make is from this line:

INFO: unable to find 00000009000000000000003F in the archive

Does this mean the EDB server (10.10.20.6) was unable to push that
archive file to the repo server (10.10.20.7)? Is that why recovery fails
and the edb server will not start back up?
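For context, recovery walks WAL segment names sequentially, so after replaying `...3E` it asks the archive for `...3F`. The sketch below shows how the next segment name is derived, assuming the default 16MB `wal_segment_size` (256 segments per "log" number); `next_wal` is just an illustrative helper, not a pgBackRest or PostgreSQL command.

```shell
# Compute the next WAL segment file name on the same timeline.
# Assumes the default 16MB wal_segment_size (segments 0x00..0xFF per log).
next_wal() {
  name=$1
  tli=${name%????????????????}   # first 8 hex digits: timeline ID
  rest=${name#????????}          # remaining 16 hex digits: log + segment
  log=${rest%????????}
  seg=${rest#????????}
  seg_dec=$(( 0x$seg + 1 ))
  log_dec=$(( 0x$log ))
  if [ "$seg_dec" -eq 256 ]; then  # segment wraps, log number advances
    seg_dec=0
    log_dec=$(( log_dec + 1 ))
  fi
  printf '%s%08X%08X\n' "$tli" "$log_dec" "$seg_dec"
}

next_wal 00000009000000000000003E   # -> 00000009000000000000003F
```

Since the log shows `...3E` replayed but `...3F` absent from the archive, the most likely explanation is that the final segment was still the partial, in-use WAL file when the server was stopped and was therefore never archived.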

The pg_hba.conf entries on the EDB server machine are:

host all all 127.0.0.1/32 ident
host all all 10.10.20.7/32 scram-sha-256
#host all all 10.10.20.7/32 trust
# IPv6 local connections:
host all all ::1/128 ident
#host all all 10.10.20.7/24 trust

# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 10.10.20.7/32 scram-sha-256
host replication all 127.0.0.1/32 ident
host replication all ::1/128 ident

Do I have to change anything in pg_hba.conf ?

My EDB server configuration is:

archive_mode = on
archive_command = 'pgbackrest --stanza=Demo_Repo archive-push %p'
log_filename = 'postgresql.log'
max_wal_senders = 5
wal_level = replica
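One possible workflow to avoid this (an assumption on my part, not a step from the guide): before stopping the server for a PITR restore, force the segment that covers the target time to be closed and archived, and verify archiving end-to-end with pgBackRest's built-in check. Port 5444 is taken from the startup log above.

```shell
# On the DB host, while the server is still running:
# force the current (partial) WAL segment to be closed and archived ...
sudo -u enterprisedb psql -p 5444 -c "SELECT pg_switch_wal();"
# ... then verify the stanza can push and retrieve WAL from the repo
sudo -u enterprisedb pgbackrest --stanza=Demo_Repo check
```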

Any help?

Krishane


#4 Muhammad Ikram
mmikram@gmail.com
In reply to: KK CHN (#3)
Re: pgBackRest point-in-time recovery: server unable to start back up

I don't think it has anything to do with the pg_hba.conf file. The
archive segment seems to be missing. I would suggest taking a fresh full
backup, since you mentioned that you are able to take full backups.

Regards,
Ikram

On Fri, Jul 26, 2024 at 11:24 AM KK CHN <kkchn.in@gmail.com> wrote:

here the log outputs....

When I ran

$sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
--set=20240719-122703F_20240724-094727I --target-timeline=current
--type=time "--target=2024-07-24 09:44:01.3255+05:30"
--target-action=promote restore

The Edb server log out put shows
...........................................................................
................................................................
2024-07-23 15:34:12 IST ERROR: syntax error at or near "selecct" at
character 1
2024-07-23 15:34:12 IST STATEMENT: selecct current_timestamp
2024-07-23 15:34:26 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:38:20 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:42:17 IST FATAL: database "test1" does not exist
2024-07-24 09:43:56 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:07 IST FATAL: database "enterprisedb" does not exist
2024-07-24 09:45:16 IST ERROR: relation "important_table" does not exist
at character 58
2024-07-24 09:45:16 IST STATEMENT: begin; drop table important_table;
commit; select * from important_table;
2024-07-24 09:46:00 IST LOG: checkpoint starting: time
2024-07-24 09:46:02 IST LOG: checkpoint complete: wrote 16 buffers
(0.1%); 0 WAL file(s) added, 0 removed, 0 recycled; write=1.506 s,
sync=0.001 s, total=1.509 s; sync files=9, longest=0.001 s, average=0.001
s; distance=93 kB, estimate=171 kB; lsn=0/3C045768, redo lsn=0/3C045730
2024-07-24 09:47:27 IST LOG: checkpoint starting: immediate force wait
2024-07-24 09:47:27 IST LOG: checkpoint complete: wrote 2 buffers (0.0%);
0 WAL file(s) added, 0 removed, 1 recycled; write=0.001 s, sync=0.001 s,
total=0.003 s; sync files=1, longest=0.001 s, average=0.001 s;
distance=16106 kB, estimate=16106 kB; lsn=0/3D000060, redo lsn=0/3D000028
2024-07-24 09:47:27.788 P00 INFO: archive-push command begin 2.52.1:
[pg_wal/00000009000000000000003C] --exec-id=34036-c1d6de0b
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-24 09:47:28.130 P00 INFO: pushed WAL file
'00000009000000000000003C' to the archive
2024-07-24 09:47:28.230 P00 INFO: archive-push command end: completed
successfully (444ms)

When I ran $ sudo systemctl start edb-as-16.service

4-07-26 11:32:56 IST LOG: starting PostgreSQL 16.3 (EnterpriseDB Advanced
Server 16.3.0) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.4.1
20231218 (Red Hat 11.4.1-3), 64-bit
2024-07-26 11:32:56 IST LOG: listening on IPv4 address "0.0.0.0", port
5444
2024-07-26 11:32:56 IST LOG: listening on IPv6 address "::", port 5444
2024-07-26 11:32:56 IST LOG: listening on Unix socket "/tmp/.s.PGSQL.5444"
2024-07-26 11:32:56 IST LOG: database system was interrupted; last known
up at 2024-07-24 09:47:27 IST
2024-07-26 11:32:56 IST LOG: starting point-in-time recovery to
2024-07-24 09:44:01.3255+05:30
2024-07-26 11:32:56 IST LOG: starting backup recovery with redo LSN
0/3D000028, checkpoint LSN 0/3D000060, on timeline ID 9
2024-07-26 11:32:56.475 P00 INFO: archive-get command begin 2.52.1:
[00000009.history, pg_wal/RECOVERYHISTORY] --exec-id=43293-1d583a8e
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.709 P00 INFO: found 00000009.history in the repo1:
16-1 archive
2024-07-26 11:32:56.809 P00 INFO: archive-get command end: completed
successfully (335ms)
2024-07-26 11:32:56 IST LOG: restored log file "00000009.history" from
archive
2024-07-26 11:32:56.816 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003D, pg_wal/RECOVERYXLOG] --exec-id=43295-55c5d6f7
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:56.999 P00 INFO: found 00000009000000000000003D in the
repo1: 16-1 archive
2024-07-26 11:32:57.099 P00 INFO: archive-get command end: completed
successfully (285ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003D"
from archive
2024-07-26 11:32:57 IST LOG: redo starts at 0/3D000028
2024-07-26 11:32:57.114 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003E, pg_wal/RECOVERYXLOG] --exec-id=43297-34867c05
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.282 P00 INFO: found 00000009000000000000003E in the
repo1: 16-1 archive
2024-07-26 11:32:57.382 P00 INFO: archive-get command end: completed
successfully (270ms)
2024-07-26 11:32:57 IST LOG: restored log file "00000009000000000000003E"
from archive
2024-07-26 11:32:57.400 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43299-e2db2e1b
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.521 P00 INFO: unable to find
00000009000000000000003F in the archive
2024-07-26 11:32:57.621 P00 INFO: archive-get command end: completed
successfully (222ms)
2024-07-26 11:32:57 IST LOG: completed backup recovery with redo LSN
0/3D000028 and end LSN 0/3D000100
2024-07-26 11:32:57 IST LOG: consistent recovery state reached at
0/3D000100
2024-07-26 11:32:57 IST LOG: database system is ready to accept read-only
connections
2024-07-26 11:32:57.632 P00 INFO: archive-get command begin 2.52.1:
[00000009000000000000003F, pg_wal/RECOVERYXLOG] --exec-id=43301-f613dae9
--log-level-console=info --log-level-file=debug
--pg1-path=/var/lib/edb/as16/data --pg-version-force=16
--repo1-host=10.10.20.7 --repo1-host-user=postgres --stanza=Demo_Repo
2024-07-26 11:32:57.761 P00 INFO: unable to find
00000009000000000000003F in the archive
2024-07-26 11:32:57.861 P00 INFO: archive-get command end: completed
successfully (231ms)
2024-07-26 11:32:57 IST LOG: redo done at 0/3E000060 system usage: CPU:
user: 0.00 s, system: 0.00 s, elapsed: 0.75 s
2024-07-26 11:32:57 IST FATAL: recovery ended before configured recovery
target was reached
2024-07-26 11:32:57 IST LOG: startup process (PID 43292) exited with exit
code 1

ONLY inference I can make is

INFO unable to find 00000009000000000000003F in the archive ( This
means the EDB server (10.10.20.6 ) unable to push the archives to the
Repo server(10.10.20.7 ) ? Is that the reason for the recovery and
start backing of edb server fails ?

the pg_hba.conf entry in the EDB Server machine is as

host all all 127.0.0.1/32 ident
host all all 10.10.20.7/32
scram-sha-256
#host all all 10.10.20.7/32 trust
# IPv6 local connections:
host all all ::1/128 ident
#host all all 10.10.20.7/24 trust

# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 10.10.20.7/32
scram-sha-256
host replication all 127.0.0.1/32 ident
host replication all ::1/128 ident

Do I have to change anything in pg_hba.conf ?

my EDB Server conf as this

archive_mode = on
archive_command = 'pgbackrest --stanza=Demo_Repo archive-push %p'
log_filename = 'postgresql.log'
max_wal_senders = 5
wal_level = replica

Any help ?

Krishane

On Fri, Jul 26, 2024 at 10:45 AM Muhammad Ikram <mmikram@gmail.com> wrote:

Hi KK CHN

Could you check server logs ?
Your service trace suggests that it started server and then failure
happened

ul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced Server
16.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
exited, code=exited, status=1/FAILURE

Regards,
Ikram

On Fri, Jul 26, 2024 at 10:04 AM KK CHN <kkchn.in@gmail.com> wrote:

List,

Reference: https://pgbackrest.org/user-guide-rhel.html#pitr
I am following the PITR guide on RHEL9 with EPAS16.
I am able to take backups (full, diff, and incr), and restoring from a
full backup and restarting EPAS16 works fine.

But when I take an incremental backup after following the procedure in
the PITR section of the reference link above, then restore the EDB
database from that incremental backup and start EPAS16, the server
always ends up in a dead state.

My repo server is another machine. If I do a full restore on the
DB server (sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
restore), it works and the server starts without any issue.
When restoring from an incremental backup, the tty output reports
success, but starting the edb service fails.

Any help is much appreciated.

Krishane.

STEPS followed:

After dropping the table (the "pg-primary ⇒ Drop the important table"
section of the guide), I stopped the EDB server.

$ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta
--set=20240719-122703F_20240724-094727I --target-timeline=current
--type=time "--target=2024-07-24 09:44:01.3255+05:30"
--target-action=promote restore
.....................................................

2024-07-26 09:48:06.343 P00 INFO: restore command end: completed
successfully (1035ms)

But

[root@rservice01 ~]# sudo systemctl start edb-as-16.service
[root@service01 ~]# sudo systemctl status edb-as-16.service
× edb-as-16.service - EDB Postgres Advanced Server 16
Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled;
preset: disabled)
* Active: failed* (Result: exit-code) since Fri 2024-07-26 09:48:50
IST; 8s ago
Duration: 242ms
Process: 41903 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir
${PGDATA} (code=exited, status=0/SUCCESS)
Process: 41908 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA}
(code=exited, status=1/FAILURE)
Main PID: 41908 (code=exited, status=1/FAILURE)
CPU: 331ms

Jul 26 09:48:48 service01 systemd[1]: Starting EDB Postgres Advanced
Server 16...
Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
LOG: redirecting log output to logging collector process
Jul 26 09:48:48 service01 edb-postgres[41908]: 2024-07-26 09:48:48 IST
HINT: Future log output will appear in directory "log".
Jul 26 09:48:49 service01 systemd[1]: Started EDB Postgres Advanced
Server 16.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Main process
exited, code=exited, status=1/FAILURE
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Killing process
41909 (edb-postgres) with signal SIGKILL.
Jul 26 09:48:50 service01 systemd[1]: edb-as-16.service: Failed with
result 'exit-code'.
[root@service01 ~]#

Why is it unable to perform a restore and recovery from an incremental
backup?

On The Repo Server
[root@service02 ~]# sudo -u postgres pgbackrest --stanza=Demo_Repo info
stanza: Demo_Repo
status: ok
cipher: aes-256-cbc

db (current)
wal archive min/max (16):
000000020000000000000021/0000000B0000000000000041

full backup: 20240719-122703F
timestamp start/stop: 2024-07-19 12:27:03+05:30 / 2024-07-19
12:27:06+05:30
wal start/stop: 00000002000000000000002A /
00000002000000000000002A
database size: 61.7MB, database backup size: 61.7MB
repo1: backup size: 9.6MB

incr backup: 20240719-122703F_20240719-123353I
timestamp start/stop: 2024-07-19 12:33:53+05:30 / 2024-07-19
12:33:56+05:30
wal start/stop: 00000002000000000000002C /
00000002000000000000002C
database size: 61.7MB, database backup size: 6.4MB
repo1: backup size: 6.2KB
backup reference list: 20240719-122703F

diff backup: 20240719-122703F_20240719-123408D
timestamp start/stop: 2024-07-19 12:34:08+05:30 / 2024-07-19
12:34:10+05:30
wal start/stop: 00000002000000000000002E /
00000002000000000000002E
database size: 61.7MB, database backup size: 6.4MB
repo1: backup size: 6.4KB
backup reference list: 20240719-122703F

incr backup: 20240719-122703F_20240723-110212I
timestamp start/stop: 2024-07-23 11:02:12+05:30 / 2024-07-23
11:02:15+05:30
wal start/stop: 000000070000000000000038 /
000000070000000000000038
database size: 48MB, database backup size: 6.4MB
repo1: backup size: 9.8KB
backup reference list: 20240719-122703F,
20240719-122703F_20240719-123408D

incr backup: 20240719-122703F_20240723-141818I
timestamp start/stop: 2024-07-23 14:18:18+05:30 / 2024-07-23
14:18:22+05:30
wal start/stop: 00000008000000000000003C /
00000008000000000000003C
database size: 75.4MB, database backup size: 33.8MB
repo1: backup size: 4.7MB
backup reference list: 20240719-122703F,
20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I

incr backup: 20240719-122703F_20240724-094727I
timestamp start/stop: 2024-07-24 09:47:27+05:30 / 2024-07-24
09:47:30+05:30
wal start/stop: 00000009000000000000003D /
00000009000000000000003D
database size: 62MB, database backup size: 11.5MB
repo1: backup size: 72KB
backup reference list: 20240719-122703F,
20240719-122703F_20240719-123408D, 20240719-122703F_20240723-110212I,
20240719-122703F_20240723-141818

--
Muhammad Ikram
