pg_rewind restore_command issue in PG12

Started by Amine Tengilimoglu over 5 years ago · 7 messages · hackers
#1 Amine Tengilimoglu
aminetengilimoglu@gmail.com

Hi;

In a situation where pg_rewind fails because of a missing WAL segment, I
have set restore_command so that the needed WAL files can be read from the
archive (I don't want to copy the WAL files manually), but I see it doesn't
work. What am I missing? Does restore_command not work with pg_rewind in
PG12? Or how should I get pg_rewind to use restore_command?

[image: image.png]

Thank you.

#2 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Amine Tengilimoglu (#1)
Re: pg_rewind restore_command issue in PG12

On 03/01/2021 20:13, Amine Tengilimoglu wrote:

In a situation where pg_rewind fails because of a missing WAL segment, I
have set restore_command so that the needed WAL files can be read from the
archive (I don't want to copy the WAL files manually), but I see it doesn't
work. What am I missing? Does restore_command not work with pg_rewind in
PG12? Or how should I get pg_rewind to use restore_command?

Using restore_command is a new feature in pg_rewind in PostgreSQL 13. It
doesn't work on earlier versions.

- Heikki
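
To illustrate the version split, here is a minimal Python sketch that builds a pg_rewind command line and adds --restore-target-wal (the long form of -c) only for PostgreSQL 13 or later. The helper name, the data directory path, and the connection string are hypothetical placeholders; --target-pgdata, --source-server, and --restore-target-wal are pg_rewind options. With -c, pg_rewind uses the restore_command defined in the target cluster's configuration.

```python
import shlex
from typing import List


def build_pg_rewind_args(major_version: int, target_pgdata: str, source_server: str) -> List[str]:
    """Build a pg_rewind argument list (hypothetical helper, not part of PostgreSQL)."""
    args = [
        "pg_rewind",
        f"--target-pgdata={target_pgdata}",
        f"--source-server={source_server}",
    ]
    # -c / --restore-target-wal exists only in pg_rewind from PostgreSQL 13 on.
    if major_version >= 13:
        args.append("--restore-target-wal")
    return args


if __name__ == "__main__":
    # Placeholder values; substitute the real data directory and connection string.
    cmd = build_pg_rewind_args(13, "/path/to/target/data", "host=primary.example port=5432")
    print(shlex.join(cmd))
```

On 12 and earlier the flag is simply left out, so any missing WAL has to be placed in pg_wal by other means before pg_rewind runs.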

#3 Amine Tengilimoglu
aminetengilimoglu@gmail.com
In reply to: Heikki Linnakangas (#2)
Re: pg_rewind restore_command issue in PG12

The pg_rewind documentation for PG12 says:

"... but if the target cluster ran for a long time after the divergence,
the old WAL files might no longer be present. In that case, they can be
manually copied from the WAL archive to the pg_wal directory, *or fetched
on startup by configuring primary_conninfo
<https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-PRIMARY-CONNINFO>
or restore_command
<https://www.postgresql.org/docs/12/runtime-config-wal.html#GUC-RESTORE-COMMAND>*."

So I thought we could use restore_command. But when I try to use it, I
see it doesn't work either.

Thanks.

#4 Michael Paquier
michael@paquier.xyz
In reply to: Amine Tengilimoglu (#3)
Re: pg_rewind restore_command issue in PG12

On Mon, Jan 04, 2021 at 04:12:34PM +0300, Amine Tengilimoglu wrote:

The pg_rewind documentation for PG12 says:

"... but if the target cluster ran for a long time after the divergence,
the old WAL files might no longer be present. In that case, they can be
manually copied from the WAL archive to the pg_wal directory, *or fetched
on startup by configuring primary_conninfo
<https://www.postgresql.org/docs/12/runtime-config-replication.html#GUC-PRIMARY-CONNINFO>
or restore_command
<https://www.postgresql.org/docs/12/runtime-config-wal.html#GUC-RESTORE-COMMAND>*."

So I thought we could use restore_command. But when I try to use it, I
see it doesn't work either.

I agree with your point that the docs for 9.6 through 12 are confusing
here. It makes no sense to mention restore_command or primary_conninfo
as a way to fetch WAL segments so that pg_rewind can find the point of
divergence, because the target is already offline when pg_rewind looks
for it. Mentioning restore_command/primary_conninfo for recovery
purposes could make sense in the context of the follow-up paragraph
though, where the target gets restarted after the rewind. But the
uses are different.

The docs for 13 and later got this right when -c was introduced, by
rewording this sentence as "or run pg_rewind with the -c option to
automatically retrieve them from the WAL archive". So let's get rid
of ", or fetched on startup by configuring primary_conninfo or
restore_command." ("or fetched on startup by configuring
recovery.conf" in some older branches). This confusion was
introduced by 878bd9a, in all branches down to 9.6.

Heikki, what do you think?
--
Michael
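
For 12 and earlier, the option left in the documentation is to copy the missing segments from the WAL archive into the target's pg_wal directory before running pg_rewind. A minimal Python sketch of that step follows. It assumes the archive stores segments uncompressed under their original file names; the helper name and both directory arguments are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def copy_missing_wal(archive_dir: str, pg_wal_dir: str, segments: Iterable[str]) -> List[str]:
    """Copy the named WAL segments from the archive into pg_wal if absent there.

    Hypothetical helper: assumes uncompressed segments stored under their
    original file names. Returns the names that were actually copied.
    """
    archive = Path(archive_dir)
    pg_wal = Path(pg_wal_dir)
    copied = []
    for name in segments:
        src = archive / name
        dst = pg_wal / name
        if dst.exists():
            continue  # already in pg_wal, leave it untouched
        if not src.is_file():
            raise FileNotFoundError(f"segment {name} not found in archive {archive}")
        shutil.copy2(src, dst)
        copied.append(name)
    return copied
```

The target server must be shut down while this runs, as it must be for pg_rewind itself.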

#5 Amine Tengilimoglu
aminetengilimoglu@gmail.com
In reply to: Michael Paquier (#4)
Re: pg_rewind restore_command issue in PG12

Thank you, Michael. I agree with you. The relevant part can be removed from
the documentation, which would at least eliminate the confusion.

#6 Michael Paquier
michael@paquier.xyz
In reply to: Amine Tengilimoglu (#5)
Re: pg_rewind restore_command issue in PG12

On Tue, Jan 05, 2021 at 11:54:42AM +0300, Amine Tengilimoglu wrote:

Thank you, Michael. I agree with you. The relevant part can be removed from
the documentation, which would at least eliminate the confusion.

Okay, I got around to this, and committed a fix for 9.6 through 12. Thanks
for the report, Amine!
--
Michael

#7 Amine Tengilimoglu
aminetengilimoglu@gmail.com
In reply to: Michael Paquier (#6)
Re: pg_rewind restore_command issue in PG12

You're welcome, Michael!