pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib

Started by Michael Paquierover 2 years ago3 messagescomitters
Jump to latest
#1Michael Paquier
michael@paquier.xyz

Fix race leading to incorrect conflict cause in InvalidatePossiblyObsoleteSlot()

The invalidation of an active slot is done in two steps:
- Termination of the backend holding it, if any.
- Report that the slot is obsolete, with a conflict cause depending on
the slot's data.

This can be racy because between these two steps the slot mutex would be
released while doing system calls, which means that the effective_xmin
and effective_catalog_xmin could advance during that time, detecting a
conflict cause different than the one originally wanted before the
process owning a slot is terminated.

Holding the mutex longer is not an option, so this commit changes the
code to record the LSNs stored in the slot during the termination of the
process owning the slot.

Bonus thanks to Alexander Lakhin for the various tests and the analysis.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: /messages/by-id/ZaTjW2Xh+TQUCOH0@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 16

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/818fefd8fd4412d45eb542155cb2833a2b864acc

Modified Files
--------------
src/backend/replication/slot.c | 39 +++++++++++++++++++++++++++++++++------
1 file changed, 33 insertions(+), 6 deletions(-)

#2Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#1)
Re: pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib

On Tue, Feb 20, 2024 at 10:14 AM Michael Paquier <michael@paquier.xyz> wrote:

Fix race leading to incorrect conflict cause in InvalidatePossiblyObsoleteSlot()

I found a typo with the code added by this commit - we've used
XLogRecPtr/InvalidXLogRecPtr for xmins in place of
TransactionId/InvalidTransactionId. Attached a patch to fix this.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Use-correct-datatype-for-xmin-variables-added-by-.patchapplication/octet-stream; name=v1-0001-Use-correct-datatype-for-xmin-variables-added-by-.patchDownload+2-3
#3Michael Paquier
michael@paquier.xyz
In reply to: Bharath Rupireddy (#2)
Re: pgsql: Fix race leading to incorrect conflict cause in InvalidatePossib

On Thu, Apr 11, 2024 at 12:38:06PM +0530, Bharath Rupireddy wrote:

I found a typo with the code added by this commit - we've used
XLogRecPtr/InvalidXLogRecPtr for xmins in place of
TransactionId/InvalidTransactionId. Attached a patch to fix this.

Thanks for the report. Will fix.
--
Michael