Set 1s WaitLatch timeout if standby limit has expired in ResolveRecoveryConflictWithBufferPin

Started by Anthony Hsu, 8 months ago, 3 messages
#1 Anthony Hsu
erwaman@gmail.com

Hi,

I think there is a race scenario where a backend holding a conflicting
buffer pin isn't promptly canceled even when the standby limit has expired:

1. Suppose there is a buffer pin conflict and the standby limit has already
expired.
2. The startup process enters ResolveRecoveryConflictWithBufferPin and
broadcasts PROCSIG_RECOVERY_CONFLICT_BUFFERPIN here [A], but does not set
any timeouts.
3. The startup process waits to be signaled by UnpinBuffer() here [B].
4. Some non-conflicting backend receives the buffer pin signal sent in (2),
checks and sees that it is not blocking recovery, and *then* acquires a
conflicting buffer pin.
5. The original conflicting backend then receives the buffer pin signal
sent in (2) and cancels itself, calling UnpinBuffer(). But the pin count
will still be > 1 (due to (4) plus the pin the startup process holds), so
the startup process will not be woken up.

In this scenario, the startup process might not be woken up for an
arbitrarily long time, and the new conflicting backend (step (4) above)
won't be sent another PROCSIG_RECOVERY_CONFLICT_BUFFERPIN signal telling it
to cancel itself.
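
The interleaving above can be replayed with a toy model in C. This is only a
sketch, not PostgreSQL code: ToyBuffer, toy_pin, toy_unpin and
toy_replay_race are hypothetical names, and the only behavior taken from the
description is that an unpin wakes the waiting startup process only when the
startup process's own pin is the last one left.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one shared buffer; every name here is hypothetical. */
typedef struct ToyBuffer
{
    int  pin_count;        /* total pins, including the startup process's own */
    bool startup_waiting;  /* startup process is asleep waiting on the buffer */
    bool startup_woken;    /* set once an unpin wakes the startup process */
} ToyBuffer;

static void
toy_pin(ToyBuffer *buf)
{
    buf->pin_count++;
}

/* Models the wake-up rule described for UnpinBuffer() in step 5. */
static void
toy_unpin(ToyBuffer *buf)
{
    assert(buf->pin_count > 0);
    buf->pin_count--;
    if (buf->startup_waiting && buf->pin_count == 1)
        buf->startup_woken = true;
}

/* Replays steps 1-5; returns true if the startup process was woken. */
static bool
toy_replay_race(void)
{
    /* Steps 1-2: the startup process and backend A each hold one pin. */
    ToyBuffer buf = {.pin_count = 2, .startup_waiting = false,
                     .startup_woken = false};

    buf.startup_waiting = true;  /* step 3: signal sent once, startup sleeps */
    toy_pin(&buf);               /* step 4: backend B pins after its check */
    toy_unpin(&buf);             /* step 5: backend A cancels and unpins */
    return buf.startup_woken;
}
```

In this model toy_replay_race() returns false: the count goes 2, 3, 2 and
never reaches 1 while the startup process is waiting.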

To handle this scenario, I think we should pass a timeout to WaitLatch if
the standby limit has already expired. This lets the startup process wake up
within a reasonable time (1 second in the attached patch), recheck, and send
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN again to any new conflicting backends.
I have attached a small patch with this proposed fix.
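
As a rough illustration of why a periodic recheck helps, here is a toy model
in C. It is a sketch under stated assumptions, not the attached patch:
toy_broadcasts_until_clear is a hypothetical function, and it assumes every
signaled pin holder cancels itself and that a late pinner never unpins on
its own.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model (hypothetical, not the attached patch). Returns the number of
 * signal broadcasts the startup process sends before no conflicting pin
 * remains, or -1 if it would sleep without ever rechecking.
 *
 * late_pinners:       backends that take a conflicting pin right after
 *                     handling a broadcast, one per broadcast (step 4).
 * recheck_on_timeout: true models a wait that times out and rechecks;
 *                     false models a wait with no timeout.
 */
static int
toy_broadcasts_until_clear(int late_pinners, bool recheck_on_timeout)
{
    int conflicting = 1;    /* the original conflicting backend */
    int broadcasts = 0;

    assert(late_pinners >= 0);

    while (conflicting > 0)
    {
        broadcasts++;
        conflicting = 0;        /* every signaled holder cancels itself */
        if (late_pinners > 0)
        {
            late_pinners--;
            conflicting = 1;    /* a new pin slipped in, never signaled */
        }
        if (conflicting > 0 && !recheck_on_timeout)
            return -1;          /* no wake-up arrives; startup stays asleep */
    }
    return broadcasts;
}
```

With one late pinner, the model returns -1 when there is no timeout and 2
when the wait times out and the signal is re-sent.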

Thanks,
Anthony

[A]
https://github.com/postgres/postgres/blob/21c9756db6458f859e6579a6754c78154321cb39/src/backend/storage/ipc/standby.c#L806
[B]
https://github.com/postgres/postgres/blob/21c9756db6458f859e6579a6754c78154321cb39/src/backend/storage/ipc/standby.c#L843

Attachments:

v1-0001-Set-1s-WaitLatch-timeout-if-standby-limit-has-exp.patch (application/octet-stream, +28 -10)
#2 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Anthony Hsu (#1)
Re: Set 1s WaitLatch timeout if standby limit has expired in ResolveRecoveryConflictWithBufferPin

On 2025-Jul-06, Anthony Hsu wrote:

> Hi,
>
> I think there is a race scenario where a backend holding a conflicting
> buffer pin isn't promptly canceled even when the standby limit has expired:

This patch seems to have fallen through the cracks. I created a
commitfest entry for it,
https://commitfest.postgresql.org/patch/6445/

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"No tengo por qué estar de acuerdo con lo que pienso"
(Carlos Caszeli)

#3 Anthony Hsu
erwaman@gmail.com
In reply to: Alvaro Herrera (#2)
Re: Set 1s WaitLatch timeout if standby limit has expired in ResolveRecoveryConflictWithBufferPin

Thanks, Álvaro, for creating a commitfest entry. I've rebased my patch on
master to fix the build issues. I also made some minor code comment changes
and added a detailed commit message. Please let me know if you have any
feedback or questions.

-Anthony

On Fri, Jan 30, 2026 at 9:42 AM Álvaro Herrera <alvherre@kurilemu.de> wrote:

> On 2025-Jul-06, Anthony Hsu wrote:
>
>> Hi,
>>
>> I think there is a race scenario where a backend holding a conflicting
>> buffer pin isn't promptly canceled even when the standby limit has expired:
>
> This patch seems to have fallen through the cracks. I created a
> commitfest entry for it,
> https://commitfest.postgresql.org/patch/6445/
>
> --
> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
> "No tengo por qué estar de acuerdo con lo que pienso"
> (Carlos Caszeli)

Attachments:

v2-0001-Set-1s-WaitLatch-timeout-if-standby-limit-has-exp.patch (application/octet-stream, +33 -11)