PostgreSQL container crash trouble.
Dear all.
I'm running PostgreSQL (with the timescaledb extension enabled) in the
following environment, and the DB pod has crashed several times at
irregular intervals. I would like to know the cause of this failure.
Environment:
- Base: Kubernetes (microk8s)
- DB: PostgreSQL 12.4 (TimescaleDB) container
- DB OS: Alpine Linux
- microk8s host OS: CentOS 7.6, Amazon Linux 2 (the crash occurred on several hosts)
- DB data location: mounted host directory (for data persistence)
1) PostgreSQL crashed and the Pod cannot start
I noticed that the timescaledb pod was restarting repeatedly with the
following error.
---
PostgreSQL Database directory appears to contain a database; Skipping initialization
[1] LOG:  starting PostgreSQL 12.4 on x86_64-pc-linux-musl, compiled by gcc (Alpine 9.3.0) 9.3.0, 64-bit
[1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
[1] LOG:  listening on IPv6 address "::", port 5432
[1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
[20] LOG:  database system was shut down at 2021-02-15 21:15:12 UTC
[20] LOG:  invalid primary checkpoint record
[20] PANIC:  could not locate a valid checkpoint record
[1] LOG:  startup process (PID 20) was terminated by signal 6: Aborted
[1] LOG:  aborting startup due to startup process failure
[1] LOG:  database system is shut down
---
I thought it might be WAL trouble and tried to check with pg_controldata
and pg_waldump.
---
$ pg_controldata <PostgreSQL data dir>
pg_control version number: 1201
Catalog version number: 201909212
Database system identifier: 6909006008117780509
Database cluster state: in production
pg_control last modified: Mon Feb 15 21:11:21 2021
Latest checkpoint location: 15/8FB002C0
Latest checkpoint's REDO location: 15/8FAF3018
Latest checkpoint's REDO WAL file: 00000001000000150000008F
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:33988723
Latest checkpoint's NextOID: 117637
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 480
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 33988723
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint: Mon Feb 15 21:11:14 2021
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: off
max_connections setting: 50
max_worker_processes setting: 15
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
Mock authentication nonce:            871e705d9393cdb1b161d5562656e20db314779af25942fa10d840983dc4ba84
---
I checked the latest checkpoint's REDO WAL file.
---
$ pg_waldump -n 10 pg_wal/00000001000000150000008F
pg_waldump: fatal: WAL segment size must be a power of two between 1 MB and
1 GB, but the WAL file "00000001000000150000008F" header specifies 0 bytes
---
I could not read the WAL data.
The file size is 16 MB (matching the WAL segment size setting), but the
content is all zeros. I confirmed this with the "od" command.
---
$ od -N 40 -A d -v pg_wal/00000001000000150000008F
0000000 000000 000000 000000 000000 000000 000000 000000 000000
0000016 000000 000000 000000 000000 000000 000000 000000 000000
0000032 000000 000000 000000 000000
0000040
---
The previous WAL segment could still be read with pg_waldump.
However, it contained no record corresponding to the latest checkpoint
location (15/8FB002C0) reported by pg_controldata.
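For anyone debugging a similar case, the neighboring segments and the REDO
location can be inspected directly with pg_waldump (a sketch based on the
control data above; ...8E is simply the segment preceding the zeroed one):
---
# Dump the first records of the preceding segment, which was still readable.
$ pg_waldump -n 10 pg_wal/00000001000000150000008E

# Try to decode starting at the REDO location from pg_controldata;
# this fails here because segment ...8F contains only zeros.
$ pg_waldump -p pg_wal -s 15/8FAF3018 -n 10
---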
2) Ran pg_resetwal, and the Pod started successfully
I ran pg_resetwal to repair the WAL trouble, and the DB pod then started
successfully.
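For reference, the invocation would look roughly like this (a sketch; the
data directory path is an assumption, and -f is required because the
cluster state was "in production" rather than cleanly shut down):
---
# WARNING: this discards WAL and can corrupt data, as it turned out below.
$ pg_resetwal -f /var/lib/postgresql/data
---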
However, SELECT queries then failed on some tables.
3) Failed SELECT query
---
sampledb1=# select * from table1 limit 1;
ERROR: missing chunk number 0 for toast value 27678 in pg_toast_2619
---
I concluded that this error was raised for the pg_statistic table (and its
associated TOAST table, pg_toast_2619).
So I deleted the broken rows from pg_statistic and ran ANALYZE.
After that, the affected table worked again.
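Roughly, the repair looked like this (a sketch; it assumes superuser
access, and this simplest variant clears all cached statistics rather than
hunting down only the broken rows):
---
-- pg_toast_2619 is the TOAST table backing pg_statistic, so removing the
-- statistics rows also removes the references to the missing TOAST chunks.
DELETE FROM pg_statistic;
-- Rebuild statistics for every table in the database.
ANALYZE;
---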
4) Another data problem
I hit another data problem after pg_resetwal.
On one table, SELECT queries returned no rows, and INSERT queries failed
(no response; they waited forever).
This table has only a primary key index, so I suspected the primary key
index had been corrupted by pg_resetwal.
I did not know how to repair the primary key index, so in the end I
dropped the table and restored it.
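In hindsight, a corrupted index can usually be rebuilt in place with
REINDEX (a sketch; the table and index names are placeholders, and after
pg_resetwal the underlying heap may be inconsistent as well):
---
-- Rebuild every index on the affected table, including the primary key.
REINDEX TABLE mytable;
-- Or rebuild just the primary key index by name.
REINDEX INDEX mytable_pkey;
---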
I suspect this WAL trouble was caused by disk I/O problems, but no errors
appeared in the OS syslog.
Are there any other possible causes?
Best regards.
On Mon, 2021-03-08 at 22:30 +0900, Daisuke Ikeda wrote:
> The DB pod has crashed several times at irregular intervals.
> [...]
> [20] LOG:  invalid primary checkpoint record
> [20] PANIC:  could not locate a valid checkpoint record
> [...]
> I thought it might be WAL trouble and tried to check with pg_controldata and pg_waldump.
It is WAL trouble.
WAL does not contain the checkpoint from before the crash.
> Latest checkpoint location:           15/8FB002C0
> Latest checkpoint's REDO location:    15/8FAF3018
> Latest checkpoint's REDO WAL file:    00000001000000150000008F
> [...]
> $ pg_waldump -n 10 pg_wal/00000001000000150000008F
> pg_waldump: fatal: WAL segment size must be a power of two between 1 MB and 1 GB, but the WAL file "00000001000000150000008F" header specifies 0 bytes
> [...]
> The file size is 16 MB, but the content is all zeros.
Looks like modifications to this file were lost.
> 2) Ran pg_resetwal, and the Pod started successfully
> I ran pg_resetwal to repair the WAL trouble, and the DB pod then started
> successfully.
Yes, but "pg_resetwal" on a crashed cluster leads to data corruption.
The best you can do now is salvage what you can.
> However, SELECT queries then failed on some tables.
> [...]
> ERROR:  missing chunk number 0 for toast value 27678 in pg_toast_2619
> [...]
> So I deleted the broken rows from pg_statistic and ran ANALYZE.
> After that, the affected table worked again.
Lucky you!
> 4) Another data problem
> [...]
> I did not know how to repair the primary key index, so in the end I
> dropped the table and restored it.
You should "pg_dumpall" the cluster and load it into a cluster
that was newly created with "initdb".
Any problems loading the data have to be resolved manually.
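A minimal sketch of that salvage procedure (the paths and the second
cluster's port are assumptions):
---
# Dump everything from the damaged cluster while it still starts.
$ pg_dumpall -p 5432 > salvage.sql

# Create and start a brand-new cluster in a fresh directory.
$ initdb -D /var/lib/postgresql/newdata

# Load the dump into the new cluster; resolve any load errors by hand.
$ psql -p 5433 -d postgres -f salvage.sql
---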
> I suspect this WAL trouble was caused by disk I/O problems, but no errors
> appeared in the OS syslog.
> Are there any other possible causes?
One cause might be unreliable storage that doesn't honor fsync requests
correctly. But given that your checkpoint location is pretty late in
the file, it seems unlikely that none of the data made it to disk.
Somehow you lost the data for a WAL segment, and you cannot recover.
Of course, if you have a backup, you know what to do.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Thank you for your reply!
It was helpful.
I'll look more closely at how writes actually reach the disk storage,
and I will reinforce our backup & restore strategy.
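One way to sanity-check the storage stack's fsync behavior (a suggestion,
not something from the thread) is PostgreSQL's bundled pg_test_fsync tool,
run against a file on the mounted data volume:
---
# Implausibly high ops/sec results suggest fsync is being absorbed by a
# cache instead of reaching durable storage.
$ pg_test_fsync -f /var/lib/postgresql/data/fsync_test.out
---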
Hi Laurenz,
You said to use the backup.
Wouldn't it be enough to run instance crash recovery to just before the
corrupted WAL in that case?

recovery_target_lsn = <<lsn for 00000001000000150000008F>>
recovery_target_inclusive = off

No need for a backup. Correct?
Cheers, Markus
Markus, you should know about not top-posting on these lists.
On Thu, 2021-03-11 at 10:52 +0000, Zwettler Markus (OIZ) wrote:
> You said to use the backup.
> Wouldn't it be enough to run instance crash recovery to just before the
> corrupted WAL in that case?
>
> recovery_target_lsn = <<lsn for 00000001000000150000008F>>
> recovery_target_inclusive = off
>
> No need for a backup. Correct?
No, not correct.
You could perhaps locate the checkpoint before that in the WAL with
"pg_waldump" and manufacture an appropriate "backup_label" to start
recovery from there.
But by the time you reach 00000001000000150000008F, your recovery would
stop, and the database would believe it was recovered.
However, there might be some data modifications in the data files from
a later point in time, which amounts to data corruption.
WAL is called WAL (write-ahead log) because you always have to write to
WAL first, and only then to the data files. You cannot have a change to
the data files that is not WAL-logged.
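For illustration, locating an earlier checkpoint record could look like
this (a sketch, not part of the original reply; ...8E is the last readable
segment from the thread above):
---
# Checkpoint records appear as XLOG/CHECKPOINT_ONLINE entries; their LSN
# and REDO location could seed a hand-crafted backup_label.
$ pg_waldump pg_wal/00000001000000150000008E | grep -i checkpoint
---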
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com