Automating Failover Resync & Re-Attach in pgpool2

Started by VASUKI M · 6 months ago · 4 messages · bugs

vasukim1992002@gmail.com

6 months ago

Dear PostgreSQL and Pgpool Communities,

While working with PostgreSQL failover scenarios, I noticed that the
process of re-attaching a standby node after a failover can be somewhat
manual and prone to delays, especially in production environments.

I explored automating this process using a combination of *pg_rewind* and *WAL
replay*, which allows a standby node to resynchronize and re-attach to the
primary automatically after a failover. This could reduce downtime and
simplify management of failover nodes in high-availability setups.

- Automatically resynchronize after failover
- Reduce downtime and ensure quicker recovery
- Minimize manual operations and errors
- Maintain consistent cluster state with less administrative overhead

I believe that integrating such an automated resync and re-attach feature
into *Pgpool-II* could be very valuable for PostgreSQL users, potentially
as an enhancement in a future release.

I wanted to share this idea with the community to get feedback,
suggestions, or any pointers on existing work that may align with this. I
am happy to contribute more details.

Bo Peng

pengbo@sraoss.co.jp

6 months ago

In reply to: VASUKI M (#1)

Re: Automating Failover Resync & Re-Attach in pgpool2

Hi,

Thank you for your question.

> While working with PostgreSQL failover scenarios, I noticed that the process of re-attaching a standby node
> after a failover can be somewhat manual and prone to delays, especially in production environments.

After a failover, the standby nodes can be automatically attached to the new primary by setting "follow_primary_command".

https://www.pgpool.net/docs/latest/en/html/runtime-config-failover.html#RUNTIME-CONFIG-FAILOVER-SETTINGS

You can also automatically reattach a failed standby node by setting "auto_failback = on".

https://www.pgpool.net/docs/latest/ja/html/runtime-config-failover.html#GUC-AUTO-FAILBACK

---
Bo Peng <pengbo@sraoss.co.jp>
SRA OSS K.K.
TEL: 03-5979-2701 FAX: 03-5979-2702
Mobile: 080-7752-0749
URL: https://www.sraoss.co.jp/

________________________________________
From: VASUKI M <vasukim1992002@gmail.com>
Sent: Friday, October 10, 2025 21:17
To: pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>
Cc: bharatdb@cdac.in <bharatdb@cdac.in>; pgpool-general@lists.postgresql.org <pgpool-general@lists.postgresql.org>
Subject: Automating Failover Resync & Re-Attach in pgpool2

Dear PostgreSQL and Pgpool Communities,
While working with PostgreSQL failover scenarios, I noticed that the process of re-attaching a standby node after a failover can be somewhat manual and prone to delays, especially in production environments.
I explored automating this process using a combination of pg_rewind and WAL replay, which allows a standby node to resynchronize and re-attach to the primary automatically after a failover. This could reduce downtime and simplify management of failover nodes in high-availability setups.
Automatically resynchronize after failover
Reduce downtime and ensure quicker recovery
Minimize manual operations and errors
Maintain consistent cluster state with less administrative overhead
I believe that integrating such an automated resync and re-attach feature into Pgpool-II could be very valuable for PostgreSQL users, potentially as an enhancement in a future release.
I wanted to share this idea with the community to get feedback, suggestions, or any pointers on existing work that may align with this. I am happy to contribute more details

VASUKI M

vasukim1992002@gmail.com

6 months ago

In reply to: VASUKI M (#1)

Fwd: Automating Failover Resync & Re-Attach in pgpool2

Hi Bo,

Thank you very much for your clarification and the helpful links on
follow_primary_command and auto_failback. I went through those sections in
the documentation, and I now understand that Pgpool-II can automatically
follow the new primary and reattach a standby node once it becomes
available again.

However, my idea was aimed at handling cases where the *old primary
diverges in timeline or LSN* after a failover — for example, when the new
primary executes additional writes before the old primary rejoins. In such
cases, the existing auto-failback or follow-primary mechanisms can’t
directly reattach the old node because its data is no longer in sync with
the current primary.
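
To make the divergence case concrete, here is a minimal Python sketch (not Pgpool-II code) that compares the returning node's last WAL position with the LSN at which the promoted node switched timelines. The function names and the sample LSNs are illustrative assumptions. The only PostgreSQL fact relied on is that an LSN is printed as two hexadecimal numbers, high/low, which together form a 64-bit position.

```python
def parse_lsn(text: str) -> int:
    """Convert a PostgreSQL LSN such as '0/3000158' to a 64-bit integer."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def has_diverged(old_primary_lsn: str, fork_lsn: str) -> bool:
    """Return True if the old primary wrote WAL past the timeline fork point.

    old_primary_lsn: last WAL position on the returning node (hypothetical input).
    fork_lsn: LSN at which the promoted node switched timelines (hypothetical input).
    """
    return parse_lsn(old_primary_lsn) > parse_lsn(fork_lsn)


if __name__ == "__main__":
    # Sample values only; a real check would read them from the two servers.
    print(has_diverged("0/3000200", "0/3000158"))
    print(has_diverged("0/3000158", "0/3000158"))
```

If the returning node stopped exactly at the fork point, it has nothing to discard and could simply follow the new primary; only WAL written past that point makes a rewind necessary.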

To address that, I was exploring a built-in *auto-resync enhancement* where
Pgpool-II could internally perform the following before reattaching:

1. *Detect timeline mismatch* between the new primary and the returning
   node.
2. *Automatically run pg_rewind* (or WAL-based replay) to synchronize the
   old node’s data directory.
3. *Restart and reattach the node* to the pool automatically once the
   resync is complete.
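
The three steps above can be sketched outside Pgpool-II as a small Python planner that only builds the command lines and runs nothing. This is an illustration under stated assumptions, not the proposed implementation: the function names, paths, connection string, node ID, and PCP defaults are hypothetical, and the timeline comparison is a simplification (pg_rewind itself decides whether a rewind is actually required). The commands used are the standard pg_controldata output format, pg_rewind, pg_ctl, and pcp_attach_node.

```python
import re
from typing import List, Optional


def parse_timeline_id(controldata_output: str) -> Optional[int]:
    """Return the latest checkpoint's timeline ID from pg_controldata output."""
    match = re.search(
        r"^Latest checkpoint's TimeLineID:\s*(\d+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    return int(match.group(1)) if match else None


def plan_resync(
    old_timeline: int,
    new_timeline: int,
    pgdata: str,
    primary_conninfo: str,
    node_id: int,
    pcp_host: str = "localhost",  # assumed PCP host
    pcp_port: int = 9898,         # assumed PCP port
    pcp_user: str = "pgpool",     # assumed PCP user
) -> List[List[str]]:
    """Build the commands for detect, rewind, restart, and reattach."""
    steps: List[List[str]] = []
    # Steps 1 and 2: rewind only when the timelines differ.
    # pg_rewind requires the target server to be cleanly shut down.
    if old_timeline != new_timeline:
        steps.append([
            "pg_rewind",
            f"--target-pgdata={pgdata}",
            f"--source-server={primary_conninfo}",
        ])
    # Step 3a: start the node. Standby settings (standby.signal,
    # primary_conninfo) must already be in place; that part is omitted here.
    steps.append(["pg_ctl", "start", "-D", pgdata, "-w"])
    # Step 3b: reattach the node to Pgpool-II through PCP.
    steps.append([
        "pcp_attach_node", "-h", pcp_host, "-p", str(pcp_port),
        "-U", pcp_user, "-n", str(node_id), "-w",
    ])
    return steps


if __name__ == "__main__":
    sample = (
        "Database cluster state:               shut down\n"
        "Latest checkpoint's TimeLineID:       1\n"
        "Latest checkpoint's PrevTimeLineID:   1\n"
    )
    old_tli = parse_timeline_id(sample)
    plan = plan_resync(old_tli, 2, "/var/lib/pgsql/data",
                       "host=new-primary port=5432 user=replicator", node_id=0)
    for step in plan:
        print(" ".join(step))
```

Keeping the planner free of side effects makes the decision logic easy to test; a wrapper could pass each returned list to subprocess.run and stop at the first failure.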

This would essentially extend the existing auto_failback behavior to
include *automated resynchronization*, reducing manual intervention and
ensuring consistent cluster recovery even in timeline divergence scenarios.

I’m thinking of something like a new configuration section in pgpool.conf:

auto_resync = on
resync_method = 'pg_rewind'
resync_user = 'replicator'

The feature could hook into the existing failback workflow (perhaps in
failover.c or recovery.c), so that Pgpool performs resync + reattach
seamlessly when the failed node returns.

Would this be something the Pgpool team would consider as an enhancement?

Thanks again for your time and guidance.

Best regards,
*Vasuki M*
CDAC, Chennai
vasukim1992002@gmail.com

On Fri, 17 Oct 2025 at 13:30, Bo Peng <pengbo@sraoss.co.jp> wrote:

> Hi,
>
> Thank you for your question.
>
>> While working with PostgreSQL failover scenarios, I noticed that the
>> process of re-attaching a standby node after a failover can be somewhat
>> manual and prone to delays, especially in production environments.
>
> After a failover, the standby nodes can be automatically attached to the
> new primary by setting "follow_primary_command".
>
> https://www.pgpool.net/docs/latest/en/html/runtime-config-failover.html#RUNTIME-CONFIG-FAILOVER-SETTINGS
>
> You can also automatically reattach a failed standby node by setting
> "auto_failback = on".
>
> https://www.pgpool.net/docs/latest/ja/html/runtime-config-failover.html#GUC-AUTO-FAILBACK
