RE: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs
Sorry in advance for the link-breaking message forced by Gmail.
/messages/by-id/CY4PR2101MB0804CE9836E582C0702214E8AAD30@CY4PR2101MB0804.namprd21.prod.outlook.com
I assume that we have a consensus about the problem we are trying to fix
here.
0a 00000004`8080cc30 00000004`80dcf917 postgres!PGSemaphoreLock+0x65 [d:\orcasqlagsea10\14\s\src\backend\port\win32_sema.c @ 158]
0b 00000004`8080cc90 00000004`80db025c postgres!LWLockAcquire+0x137 [d:\orcasqlagsea10\14\s\src\backend\storage\lmgr\lwlock.c @ 1234]
0c 00000004`8080ccd0 00000004`80db25db postgres!AbortBufferIO+0x2c [d:\orcasqlagsea10\14\s\src\backend\storage\buffer\bufmgr.c @ 3995]
0d 00000004`8080cd20 00000004`80dbce36 postgres!AtProcExit_Buffers+0xb [d:\orcasqlagsea10\14\s\src\backend\storage\buffer\bufmgr.c @ 2479]
0e 00000004`8080cd50 00000004`80dbd1bd postgres!shmem_exit+0xf6 [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 262]
0f 00000004`8080cd80 00000004`80dbccfd postgres!proc_exit_prepare+0x4d [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 188]
10 00000004`8080cdb0 00000004`80ef9e74 postgres!proc_exit+0xd [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 141]
11 00000004`8080cde0 00000004`80ddb6ef postgres!errfinish+0x204 [d:\orcasqlagsea10\14\s\src\backend\utils\error\elog.c @ 624]
12 00000004`8080ce50 00000004`80db0f59 postgres!mdread+0x12f [d:\orcasqlagsea10\14\s\src\backend\storage\smgr\md.c @ 806]
Ok, we are fixing this. The proposed patch makes LWLockReleaseAll() run
before the callback registered by InitBufferPoolBackend() by registering
the former after the latter in the on_shmem_exit list. Even if it works,
I think it's neither clean nor safe to register multiple order-sensitive
callbacks.
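
To illustrate the ordering dependency, here is a minimal standalone sketch,
not PostgreSQL code. It assumes only that exit callbacks run in reverse
registration order, as shmem_exit does for the on_shmem_exit list. All
toy_* names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Toy model of an exit-callback list; every name here is hypothetical. */
#define TOY_MAX_CALLBACKS 8

typedef void (*toy_callback)(void);

static toy_callback toy_callbacks[TOY_MAX_CALLBACKS];
static int toy_n_callbacks = 0;
static char toy_call_log[64];

/* Append a callback to the list (stand-in for on_shmem_exit). */
static void toy_on_exit(toy_callback cb)
{
    assert(toy_n_callbacks < TOY_MAX_CALLBACKS);
    toy_callbacks[toy_n_callbacks++] = cb;
}

/* Run callbacks last-registered-first, then leave the list empty. */
static void toy_run_exit(void)
{
    while (toy_n_callbacks > 0)
        toy_callbacks[--toy_n_callbacks]();
}

static void toy_buffers_cleanup(void) { strcat(toy_call_log, "buffers;"); }
static void toy_release_locks(void)   { strcat(toy_call_log, "locks;"); }

/* Register the buffer cleanup first and the lock release second. */
static const char *toy_demo(void)
{
    toy_call_log[0] = '\0';
    toy_on_exit(toy_buffers_cleanup);
    toy_on_exit(toy_release_locks);
    toy_run_exit();
    return toy_call_log;
}
```

In this model the lock release runs first only because it was registered
last; swapping the two toy_on_exit() calls silently reverses the order,
which is the fragility I am worried about.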
AtProcExit_Buffers has the following comment:
* During backend exit, ensure that we released all shared-buffer locks and
* assert that we have no remaining pins.
And its only caller is shmem_exit. What's more, all other call sites of
AbortBufferIO() call LWLockReleaseAll() just before calling it. If
that's the case, why don't we just release all LWLocks in shmem_exit
or in AtProcExit_Buffers before calling AbortBufferIO()? I think it's
sufficient for AtProcExit_Buffers to call LWLockReleaseAll() at its
beginning. (The comment for the function needs editing.)
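
A toy model of that idea, again standalone and not PostgreSQL code. It
assumes the backend still holds the lock that the abort path re-acquires
when the error is raised, and it models a blocking acquire as a failed
try-acquire. All toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* One non-reentrant lock, modelled as a flag; names are hypothetical. */
static bool toy_lock_held = false;

/* Returns false where a real blocking acquire would wait forever. */
static bool toy_try_acquire(void)
{
    if (toy_lock_held)
        return false;
    toy_lock_held = true;
    return true;
}

/* Stand-in for releasing every lock the process holds. */
static void toy_release_all(void)
{
    toy_lock_held = false;
}

/* Stand-in for the abort path: it must take the lock to clean up. */
static bool toy_abort_io(void)
{
    if (!toy_try_acquire())
        return false;           /* would hang on the self-held lock */
    toy_lock_held = false;      /* cleanup done, drop the lock */
    return true;
}

/* Exit callback; optionally releases all locks at its beginning. */
static bool toy_at_exit(bool release_first)
{
    if (release_first)
        toy_release_all();
    return toy_abort_io();
}

/* Simulate an error raised while the lock is still held. */
static bool toy_exit_after_error(bool release_first)
{
    toy_lock_held = true;
    return toy_at_exit(release_first);
}
```

With release_first set to false the model reproduces the self-deadlock
seen in the stack trace above; with it set to true the exit path
completes without depending on any other callback having run first.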
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center