WAL contains references to invalid pages

Started by JotaComm almost 13 years ago | 5 messages | general
#1 JotaComm
jota.comm@gmail.com

Hello, guys

Yesterday I identified the following messages in my log file (slave):

user=,db= WARNING: page 6629 of relation base/20449/24818 is uninitialized
user=,db= CONTEXT: xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= PANIC: WAL contains references to invalid pages
user=,db= CONTEXT: xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= LOG: startup process (PID 26293) was terminated by signal 6: Aborted
user=,db= LOG: terminating any other active server processes

Information:

PostgreSQL 9.2.3 (master and slave)

Operating System: CentOS release 6.3 (Final)

The parameter full_page_writes is enabled on both servers.

Analyzing the objects in my cluster (master), I identified the database
[20449] and the relation [24818]. The relation 24818 is an index, so I ran
REINDEX to try to solve the problem. Immediately afterwards, I tried to bring
up the slave, but I got the same errors.
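For readers who want to repeat the identification step, here is a minimal sketch. It assumes the relation lives in the default tablespace (a path of the form base/<database OID>/<relfilenode>). The helper names `parse_relation_path` and `lookup_queries` are made up for illustration; the catalog columns `pg_database.oid` and `pg_class.relfilenode` are standard. It only builds the query text and does not connect to a server.

```python
import re
from typing import Tuple

# Matches the path printed in the WARNING line, e.g. "base/20449/24818".
# "base/" means the default tablespace; other layouts are not handled here.
_REL_PATH = re.compile(r"^base/(\d+)/(\d+)$")


def parse_relation_path(path: str) -> Tuple[int, int]:
    """Split 'base/<database OID>/<relfilenode>' into two integers."""
    match = _REL_PATH.match(path.strip())
    if match is None:
        raise ValueError(f"unsupported relation path: {path!r}")
    return int(match.group(1)), int(match.group(2))


def lookup_queries(path: str) -> Tuple[str, str]:
    """Build the two catalog queries for a relation path (hypothetical helper)."""
    db_oid, relfilenode = parse_relation_path(path)
    db_sql = f"SELECT datname FROM pg_database WHERE oid = {db_oid};"
    # Run this one while connected to the database found by db_sql.
    rel_sql = (
        "SELECT relname, relkind FROM pg_class "
        f"WHERE relfilenode = {relfilenode};"
    )
    return db_sql, rel_sql


if __name__ == "__main__":
    for sql in lookup_queries("base/20449/24818"):
        print(sql)
```

A relkind of 'i' in the second result marks an index. REINDEX gives the index a new relfilenode, so this lookup only finds it if it is run before the rebuild.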

user=,db= WARNING: page 6629 of relation base/20449/24818 is uninitialized
user=,db= CONTEXT: xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= PANIC: WAL contains references to invalid pages
user=,db= CONTEXT: xlog redo vacuum: rel 1663/20449/24818; blk 6631, lastBlockVacuumed 6626
user=,db= LOG: startup process (PID 26293) was terminated by signal 6: Aborted
user=,db= LOG: terminating any other active server processes

Since the problem is in the WAL file, the procedure above does not work as
I hoped.

Any idea?

Thanks a lot.

Regards

--
JotaComm
http://jotacomm.wordpress.com

#2 Fabrízio de Royes Mello
fabriziomello@gmail.com
In reply to: JotaComm (#1)
Re: WAL contains references to invalid pages

On Thu, May 16, 2013 at 11:12 AM, JotaComm <jota.comm@gmail.com> wrote:

[...]

Hi JotaComm,

IMHO as it is your slave you could just rebuild it.

However if you want to make an attempt to recover you can do:

1) make a physical backup of this cluster
2) in your postgresql.conf set 'zero_damaged_pages = on' [1]
3) start your cluster
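
Steps 1 and 2 could be scripted roughly as below. This is only a sketch under stated assumptions: the server is stopped, postgresql.conf lives inside the data directory, and the backup directory does not exist yet. The function name `prepare_recovery_attempt` is hypothetical. Step 3 (starting the cluster, for example with pg_ctl) is left to the operator.

```python
import shutil
from pathlib import Path

SETTING = "zero_damaged_pages = on"


def prepare_recovery_attempt(data_dir: str, backup_dir: str) -> Path:
    """Copy data_dir to backup_dir, then append SETTING to postgresql.conf.

    Assumes the server is stopped and that postgresql.conf lives inside
    data_dir; backup_dir must not exist yet (shutil.copytree requires that).
    """
    data = Path(data_dir)
    conf = data / "postgresql.conf"
    if not conf.is_file():
        raise FileNotFoundError(conf)

    # Step 1: physical (file-level) backup of the whole cluster directory.
    shutil.copytree(data, backup_dir)

    # Step 2: enable the setting; when a parameter appears more than once,
    # the last occurrence in the file is the one that takes effect.
    with conf.open("a", encoding="utf-8") as handle:
        handle.write(f"\n# temporary, for a recovery attempt only\n{SETTING}\n")
    return conf
```

Because the copy is taken before the file is edited, the backup keeps the original configuration, so the attempt can be undone by restoring it.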

I really don't know if it will work, but you can try... :-)

Regards,

[1]: http://www.postgresql.org/docs/9.2/static/runtime-config-developer.html#GUC-ZERO-DAMAGED-PAGES

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

#3 JotaComm
jota.comm@gmail.com
In reply to: Fabrízio de Royes Mello (#2)
Re: WAL contains references to invalid pages

Hello, Fabrízio

2013/5/16 Fabrízio de Royes Mello <fabriziomello@gmail.com>

Thanks for your suggestion :)

I tried it and got the same errors. I believe it will be necessary to
rebuild the cluster, because the problem is in the WAL file.

Regards
--
JotaComm
http://jotacomm.wordpress.com

#4 Adarsh Sharma
eddy.adarsh@gmail.com
In reply to: JotaComm (#3)
Re: WAL contains references to invalid pages

Try taking a backup of just that table and index. If that succeeds, drop and
recreate them. Maybe that will fix your issue.

Thanks

On Thu, May 16, 2013 at 11:14 PM, JotaComm <jota.comm@gmail.com> wrote:

#5 JotaComm
jota.comm@gmail.com
In reply to: JotaComm (#1)
Re: WAL contains references to invalid pages

Hello,

2013/5/21 Adarsh Sharma <eddy.adarsh@gmail.com>

Try taking a backup of just that table and index. If that succeeds, drop and
recreate them. Maybe that will fix your issue.

On Monday night I rebuilt the slave server. Yesterday, while analyzing the
log files, I found the following messages:

2013-05-21 15:13:48 BRT [30686]: [25-1] user=,db= WARNING: page 136714 of relation base/79251/79262 is uninitialized
2013-05-21 15:13:48 BRT [30686]: [26-1] user=,db= CONTEXT: xlog redo visible: rel 1663/79251/79262; blk 136714
2013-05-21 15:13:48 BRT [30686]: [27-1] user=,db= PANIC: WAL contains references to invalid pages
2013-05-21 15:13:48 BRT [30686]: [28-1] user=,db= CONTEXT: xlog redo visible: rel 1663/79251/79262; blk 136714
2013-05-21 15:13:49 BRT [30684]: [2-1] user=,db= LOG: startup process (PID 30686) was terminated by signal 6: Aborted
2013-05-21 15:13:49 BRT [30684]: [3-1] user=,db= LOG: terminating any other active server processes

It's the same problem, but now it is in another table.
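
To compare incidents like these side by side, the CONTEXT lines can be reduced to the redo record type, the relation and the block number. The following is a rough sketch. The pattern is inferred only from the two message shapes quoted in this thread ("xlog redo vacuum" and "xlog redo visible") and may not match other record types. `RedoContext` and `parse_redo_contexts` are names invented for this example.

```python
import re
from collections import Counter
from typing import List, NamedTuple


class RedoContext(NamedTuple):
    record: str       # redo record type as printed, e.g. "vacuum" or "visible"
    tablespace: int
    database: int
    relfilenode: int
    block: int


# \s+ between tokens tolerates log lines wrapped by a mail client.
_CONTEXT = re.compile(
    r"xlog\s+redo\s+(\w+):\s+rel\s+(\d+)/(\d+)/(\d+);\s+blk\s+(\d+)"
)


def parse_redo_contexts(log_text: str) -> List[RedoContext]:
    """Return one RedoContext per matching CONTEXT line in log_text."""
    return [
        RedoContext(m.group(1), *(int(g) for g in m.groups()[1:]))
        for m in _CONTEXT.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "CONTEXT: xlog redo vacuum: rel 1663/20449/24818; blk 6631, "
        "lastBlockVacuumed 6626\n"
        "CONTEXT: xlog redo visible: rel 1663/79251/79262; blk 136714\n"
    )
    # Count how often each (record type, relfilenode) pair shows up.
    counts = Counter((c.record, c.relfilenode) for c in parse_redo_contexts(sample))
    for (record, relfilenode), n in counts.items():
        print(record, relfilenode, n)
```

Grouping by record type and relfilenode makes it easier to see whether repeated PANICs point at one relation or at several.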

According to the documentation:
http://www.postgresql.org/docs/9.2/interactive/release-9-2-3.html

- Fix multiple problems in detection of when a consistent database state
  has been reached during WAL replay (Fujii Masao, Heikki Linnakangas, Simon
  Riggs, Andres Freund)

- Fix detection of end-of-backup point when no actual redo work is
  required (Heikki Linnakangas)

  This mistake could result in incorrect "WAL ends before end of online
  backup" errors.

I believe that my problem is described here. What do you think about it?

Thank you

Regards

--
JotaComm
http://jotacomm.wordpress.com