RE: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs

Started by Kyotaro Horiguchiover 6 years ago1 messages
#1Kyotaro Horiguchi
horikyota.ntt@gmail.com

Sorry in advance for link-breaking message force by gmail..

/messages/by-id/CY4PR2101MB0804CE9836E582C0702214E8AAD30@CY4PR2101MB0804.namprd21.prod.outlook.com

I assume that we are in a consensus about the problem we are to fix
here.

0a 00000004`8080cc30 00000004`80dcf917 postgres!PGSemaphoreLock+0x65 [d:\orcasqlagsea10\14\s\src\backend\port\win32_sema.c @ 158]
0b 00000004`8080cc90 00000004`80db025c postgres!LWLockAcquire+0x137 [d:\orcasqlagsea10\14\s\src\backend\storage\lmgr\lwlock.c @ 1234]
0c 00000004`8080ccd0 00000004`80db25db postgres!AbortBufferIO+0x2c [d:\orcasqlagsea10\14\s\src\backend\storage\buffer\bufmgr.c @ 3995]
0d 00000004`8080cd20 00000004`80dbce36 postgres!AtProcExit_Buffers+0xb [d:\orcasqlagsea10\14\s\src\backend\storage\buffer\bufmgr.c @ 2479]
0e 00000004`8080cd50 00000004`80dbd1bd postgres!shmem_exit+0xf6 [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 262]
0f 00000004`8080cd80 00000004`80dbccfd postgres!proc_exit_prepare+0x4d [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 188]
10 00000004`8080cdb0 00000004`80ef9e74 postgres!proc_exit+0xd [d:\orcasqlagsea10\14\s\src\backend\storage\ipc\ipc.c @ 141]
11 00000004`8080cde0 00000004`80ddb6ef postgres!errfinish+0x204 [d:\orcasqlagsea10\14\s\src\backend\utils\error\elog.c @ 624]
12 00000004`8080ce50 00000004`80db0f59 postgres!mdread+0x12f [d:\orcasqlagsea10\14\s\src\backend\storage\smgr\md.c @ 806]

Ok, we are fixing this. The proposed patch lets LWLockReleaseAll()
called before InitBufferPoolBackend() by registering the former after
the latter into on_shmem_exit list. Even if it works, I think it's
neither clean nor safe to register multiple order-sensitive callbacks.

AtProcExit_Buffers has the following comment:

* During backend exit, ensure that we released all shared-buffer locks and
* assert that we have no remaining pins.

And the only caller of it is shmem_exit. More of that, all other
caller sites calls LWLockReleaseAll() just before calling it. If
that's the case, why don't we just release all LWLocks in shmem_exit
or in AtProcExit_Buffers before calling AbortBufferIO()? I think it's
sufficient that AtProcExit_Buffers calls it at the beginning. (The
comment for the funcgtion needs editing).

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center