BM_IO_ERROR flag is lost in TerminateBufferIO due to order of operations in UnlockBufHdrExt
Good day, Andres and hackers.
UnlockBufHdrExt does:
buf_state |= set_bits;
buf_state &= ~unset_bits;
buf_state &= ~BM_LOCKED;
TerminateBufferIO unconditionally does:
unset_flag_bits |= BM_IO_ERROR;
Due to this, AbortBufferIO and buffer_readv_complete_one are failed
to set BM_IO_ERROR with call to TerminateBufferIO.
It was found with proprietary code that was triggered on
PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
--
regards
Yura Sokolov aka funny-falcon
Hi,
On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
UnlockBufHdrExt does:
buf_state |= set_bits;
buf_state &= ~unset_bits;
buf_state &= ~BM_LOCKED;TerminateBufferIO unconditionally does:
unset_flag_bits |= BM_IO_ERROR;
Due to this, AbortBufferIO and buffer_readv_complete_one are failed
to set BM_IO_ERROR with call to TerminateBufferIO.It was found with proprietary code that was triggered on
PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.
That's clearly not right. Care to write a patch? I think we should add a
test for this in src/test/modules/test_aio too. As we don't rely on things
like BM_IO_ERROR this is quite easy to not notice.
Greetings,
Andres Freund
26.03.2026 23:29, Andres Freund wrote:
Hi,
On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
UnlockBufHdrExt does:
buf_state |= set_bits;
buf_state &= ~unset_bits;
buf_state &= ~BM_LOCKED;TerminateBufferIO unconditionally does:
unset_flag_bits |= BM_IO_ERROR;
Due to this, AbortBufferIO and buffer_readv_complete_one are failed
to set BM_IO_ERROR with call to TerminateBufferIO.It was found with proprietary code that was triggered on
PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.That's clearly not right. Care to write a patch? I think we should add a
test for this in src/test/modules/test_aio too. As we don't rely on things
like BM_IO_ERROR this is quite easy to not notice.
I thought, fix is too small to go the way of patches.
I don't mind if you just push it.
But if I can save your time on writing test, I'll try.
...
I believe, proper way is to add assertion in UnlockBufHdrExt:
Assert(!(set_bits & unset_bits));
And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
set_flag_bits.
Is it ok?
--
regards
Yura Sokolov aka funny-falcon
Good day,
27.03.2026 12:04, Yura Sokolov wrote:
26.03.2026 23:29, Andres Freund wrote:
Hi,
On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
UnlockBufHdrExt does:
buf_state |= set_bits;
buf_state &= ~unset_bits;
buf_state &= ~BM_LOCKED;TerminateBufferIO unconditionally does:
unset_flag_bits |= BM_IO_ERROR;
Due to this, AbortBufferIO and buffer_readv_complete_one are failed
to set BM_IO_ERROR with call to TerminateBufferIO.It was found with proprietary code that was triggered on
PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.That's clearly not right. Care to write a patch? I think we should add a
test for this in src/test/modules/test_aio too. As we don't rely on things
like BM_IO_ERROR this is quite easy to not notice.I thought, fix is too small to go the way of patches.
I don't mind if you just push it.But if I can save your time on writing test, I'll try.
...
I believe, proper way is to add assertion in UnlockBufHdrExt:
Assert(!(set_bits & unset_bits));
And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
set_flag_bits.Is it ok?
Here patchset is.
I've tried to modify 001_aio.pl to test presence BM_IO_ERROR after read
failure.
I've also added FlushBuffer failure test in second patch (v1-002) using
injection point. I don't know if 001-aio.pl is a good place for since write
is not async yet.
v1-003 just mades DebugPrintBufferRefcount prettier.
--
regards
Yura Sokolov aka funny-falcon
Attachments:
v1-0001-bufmgr-Fix-possibility-to-set-BM_IO_ERROR.patchtext/x-patch; charset=UTF-8; name=v1-0001-bufmgr-Fix-possibility-to-set-BM_IO_ERROR.patchDownload+57-3
v1-0002-bufmgr-Add-test-for-BM_IO_ERROR-presence-after-wr.patchtext/x-patch; charset=UTF-8; name=v1-0002-bufmgr-Add-test-for-BM_IO_ERROR-presence-after-wr.patchDownload+91-2
v1-0003-bufmgr-print-flag-names-in-DebugPrintBufferRefcou.patchtext/x-patch; charset=UTF-8; name=v1-0003-bufmgr-print-flag-names-in-DebugPrintBufferRefcou.patchDownload+36-13
27.03.2026 21:30, Yura Sokolov wrote:
26.03.2026 23:29, Andres Freund wrote:
That's clearly not right. Care to write a patch?
Here patchset is.
Good day, Andres.
Have you any recommendations for patch?
Should I create CF item for patch?
--
regards
Yura Sokolov aka funny-falcon
27.03.2026 21:30, Yura Sokolov пишет:
Good day,
27.03.2026 12:04, Yura Sokolov wrote:
26.03.2026 23:29, Andres Freund wrote:
Hi,
On 2026-03-25 17:51:30 +0300, Yura Sokolov wrote:
UnlockBufHdrExt does:
buf_state |= set_bits;
buf_state &= ~unset_bits;
buf_state &= ~BM_LOCKED;TerminateBufferIO unconditionally does:
unset_flag_bits |= BM_IO_ERROR;
Due to this, AbortBufferIO and buffer_readv_complete_one are failed
to set BM_IO_ERROR with call to TerminateBufferIO.It was found with proprietary code that was triggered on
PGAIO_RS_ERROR and made assertion on BM_IO_ERROR presence.That's clearly not right. Care to write a patch? I think we should add a
test for this in src/test/modules/test_aio too. As we don't rely on things
like BM_IO_ERROR this is quite easy to not notice.I thought, fix is too small to go the way of patches.
I don't mind if you just push it.But if I can save your time on writing test, I'll try.
...
I believe, proper way is to add assertion in UnlockBufHdrExt:
Assert(!(set_bits & unset_bits));
And make TerminateBufferIO to exclude BM_IO_ERROR if it is among
set_flag_bits.Is it ok?
Here patchset is.
I've tried to modify 001_aio.pl to test presence BM_IO_ERROR after read
failure.I've also added FlushBuffer failure test in second patch (v1-002) using
injection point. I don't know if 001-aio.pl is a good place for since write
is not async yet.v1-003 just mades DebugPrintBufferRefcount prettier.
rebased.
--
regards
Yura Sokolov aka funny-falcon
Attachments:
v2-0001-bufmgr-Fix-possibility-to-set-BM_IO_ERROR.patchtext/x-patch; charset=UTF-8; name=v2-0001-bufmgr-Fix-possibility-to-set-BM_IO_ERROR.patchDownload+57-3
v2-0002-bufmgr-Add-test-for-BM_IO_ERROR-presence-after-wr.patchtext/x-patch; charset=UTF-8; name=v2-0002-bufmgr-Add-test-for-BM_IO_ERROR-presence-after-wr.patchDownload+94-5
v2-0003-bufmgr-print-flag-names-in-DebugPrintBufferRefcou.patchtext/x-patch; charset=UTF-8; name=v2-0003-bufmgr-print-flag-names-in-DebugPrintBufferRefcou.patchDownload+36-13
On Mon, Mar 30, 2026 at 11:25:55AM +0300, Yura Sokolov wrote:
Have you any recommendations for patch?
Should I create CF item for patch?
The business with unset_flag_bits is new as of HEAD, so this could
qualify as an open item, to be tracked here:
https://wiki.postgresql.org/wiki/PostgreSQL_19_Open_Items
--
Michael