GetBufferDescriptor() being called for local buffers from MarkBufferDirtyHint()
Hi Andres,
82467f627bd478569de04f4a3f1993098e80c812 added MarkBufferDirtyHint()
which invokes GetBufferDescriptor() even for local buffers for which
id < 0. Since GetBufferDescriptor() declares id as uint32, -1 is
converted to a very large int32 value which is way larger than
NBuffers. Thus GetBufferDescriptor() may be returning something from
the BufferBlocks which probably has enough memory to accommodate that
memory access. But it's a bogus BufferDesc nevertheless. We are not
seeing any problem with this right now since MarkBufferDirtyHint()
uses the BufferDesc only when it's a shared buffer. Right fix is to
let that function handle local buffers first and then call
GetBufferDescriptor() as in the attached patch.
I caught this because of an Assertion added in GetBufferDescription()
in my shared buffer resizing patches. I think it's worth committing
that assertion and the related change to BufferManagerShmemInit()
separately from shared buffer resizing patches. Included those changes
in the attached patch as well.
--
Best Wishes,
Ashutosh Bapat
Attachments:
v20260606-0001-MarkBufferDirtyHint-calls-GetBufferDescrip.patchtext/x-patch; charset=US-ASCII; name=v20260606-0001-MarkBufferDirtyHint-calls-GetBufferDescrip.patchDownload+9-5
On Sat, Jun 06, 2026 at 01:37:42PM +0530, Ashutosh Bapat wrote:
82467f627bd478569de04f4a3f1993098e80c812 added MarkBufferDirtyHint()
which invokes GetBufferDescriptor() even for local buffers for which
id < 0. Since GetBufferDescriptor() declares id as uint32, -1 is
converted to a very large int32 value which is way larger than
NBuffers. Thus GetBufferDescriptor() may be returning something from
the BufferBlocks which probably has enough memory to accommodate that
memory access. But it's a bogus BufferDesc nevertheless. We are not
seeing any problem with this right now since MarkBufferDirtyHint()
uses the BufferDesc only when it's a shared buffer. Right fix is to
let that function handle local buffers first and then call
GetBufferDescriptor() as in the attached patch.
@@ -5831,8 +5831,6 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
{
BufferDesc *bufHdr;
- bufHdr = GetBufferDescriptor(buffer - 1);
-
if (!BufferIsValid(buffer))
elog(ERROR, "bad buffer ID: %d", buffer);
@@ -5842,6 +5840,8 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
return;
}
+ bufHdr = GetBufferDescriptor(buffer - 1);
Yep, that's clearly wrong. We are lucky that it does not blow up
today but that's a ticking bomb. Even with that in mind, the result
leads to a non-defined behavior.
- return &(BufferDescriptors[id]).bufferdesc;
+ BufferDesc *bdesc = &(BufferDescriptors[id]).bufferdesc;
+
+ Assert(bdesc->buf_id == id);
+
+ return bdesc;
[...]
- BufferDesc *buf = GetBufferDescriptor(i);
+ /* GetBufferDescriptor() expects buf_id to be set in the descriptor. */
+ BufferDesc *buf = &(BufferDescriptors[i]).bufferdesc;
I am not convinced by these two. For the first one, the assertion is
a bound check in disguise, which could also use NBuffers as of an (id
< NBuffers). With the inlining we care about the performance. This
also forces the second change in ShmemInit(), which makes even less
sense once we think about the first assertion.
Will fix the first issue on HEAD, that's wrong. Thanks for the
report.
--
Michael
On Wed, Jun 10, 2026 at 9:10 AM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Jun 06, 2026 at 01:37:42PM +0530, Ashutosh Bapat wrote:
82467f627bd478569de04f4a3f1993098e80c812 added MarkBufferDirtyHint()
which invokes GetBufferDescriptor() even for local buffers for which
id < 0. Since GetBufferDescriptor() declares id as uint32, -1 is
converted to a very large int32 value which is way larger than
NBuffers. Thus GetBufferDescriptor() may be returning something from
the BufferBlocks which probably has enough memory to accommodate that
memory access. But it's a bogus BufferDesc nevertheless. We are not
seeing any problem with this right now since MarkBufferDirtyHint()
uses the BufferDesc only when it's a shared buffer. Right fix is to
let that function handle local buffers first and then call
GetBufferDescriptor() as in the attached patch.@@ -5831,8 +5831,6 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
{
BufferDesc *bufHdr;- bufHdr = GetBufferDescriptor(buffer - 1);
-
if (!BufferIsValid(buffer))
elog(ERROR, "bad buffer ID: %d", buffer);@@ -5842,6 +5840,8 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
return;
}+ bufHdr = GetBufferDescriptor(buffer - 1);
Yep, that's clearly wrong. We are lucky that it does not blow up
today but that's a ticking bomb. Even with that in mind, the result
leads to a non-defined behavior.- return &(BufferDescriptors[id]).bufferdesc; + BufferDesc *bdesc = &(BufferDescriptors[id]).bufferdesc; + + Assert(bdesc->buf_id == id); + + return bdesc; [...] - BufferDesc *buf = GetBufferDescriptor(i); + /* GetBufferDescriptor() expects buf_id to be set in the descriptor. */ + BufferDesc *buf = &(BufferDescriptors[i]).bufferdesc;I am not convinced by these two. For the first one, the assertion is
a bound check in disguise, which could also use NBuffers as of an (id
< NBuffers).
Just to clarify, it's stronger than just a bound check.
BufferDesc::buf_id comment says
/*
* Buffer's index number (from 0). The field never changes after
* initialization, so does not need locking.
*/
With the inlining we care about the performance. This
also forces the second change in ShmemInit(), which makes even less
sense once we think about the first assertion.
The field is writable and can be accidentally written to. Imagine a
bug *(wrong memory calculation) = <wrong buffer id> changing buf_id
without realising the change especially when <wrong buffer id> is
within 0 to NBuffers. There is no place where we check that the
assumption is true. But we use buf_id in a few critical places.
GetBufferDescriptor() is a good place for checking the assumption. I
agree that there's a small risk that it will make accessing the buffer
slower but only in Assert enabled builds which are anyway slower [1]/messages/by-id/CAExHW5tNMNQ5ennUM3QK+MQGw703M9T6z-LvBKKAcgpJ3KW14Q@mail.gmail.com.
FWIW, I ran pgbench on my laptop with nproc = 16 with scale = 10. I
didn't find any performance difference
Without assertion
$ for i in 1 2 3 4 5 6; do pgbench -d postgres -T 300 -c 4 -j 4 | grep tps; done
starting vacuum...end.
tps = 1081.714940 (without initial connection time)
starting vacuum...end.
tps = 1026.242729 (without initial connection time)
starting vacuum...end.
tps = 1080.696554 (without initial connection time)
starting vacuum...end.
tps = 1092.211269 (without initial connection time)
starting vacuum...end.
tps = 1058.359337 (without initial connection time)
starting vacuum...end.
tps = 1087.718979 (without initial connection time)
With Assertion
$ for i in 1 2 3 4 5 6 ; do pgbench -d postgres -T 300 -c 4 -j 4 |
grep tps; done
starting vacuum...end.
tps = 1065.345204 (without initial connection time)
starting vacuum...end.
tps = 1088.642703 (without initial connection time)
starting vacuum...end.
tps = 1077.954350 (without initial connection time)
starting vacuum...end.
tps = 1091.662205 (without initial connection time)
starting vacuum...end.
tps = 1029.412347 (without initial connection time)
starting vacuum...end.
tps = 1086.912667 (without initial connection time)
Said that, I am fine, if we don't want to include the Assertion. I
will keep it in my shared buffer resizing patches where I have found
it useful. Maybe we will reconsider it with resizable shared buffers
context.
Will fix the first issue on HEAD, that's wrong. Thanks for the
report.
Thanks.
[1]: /messages/by-id/CAExHW5tNMNQ5ennUM3QK+MQGw703M9T6z-LvBKKAcgpJ3KW14Q@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
On 2026-06-10 12:40:38 +0900, Michael Paquier wrote:
On Sat, Jun 06, 2026 at 01:37:42PM +0530, Ashutosh Bapat wrote:
82467f627bd478569de04f4a3f1993098e80c812 added MarkBufferDirtyHint()
which invokes GetBufferDescriptor() even for local buffers for which
id < 0. Since GetBufferDescriptor() declares id as uint32, -1 is
converted to a very large int32 value which is way larger than
NBuffers. Thus GetBufferDescriptor() may be returning something from
the BufferBlocks which probably has enough memory to accommodate that
memory access. But it's a bogus BufferDesc nevertheless. We are not
seeing any problem with this right now since MarkBufferDirtyHint()
uses the BufferDesc only when it's a shared buffer. Right fix is to
let that function handle local buffers first and then call
GetBufferDescriptor() as in the attached patch.@@ -5831,8 +5831,6 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
{
BufferDesc *bufHdr;- bufHdr = GetBufferDescriptor(buffer - 1);
-
if (!BufferIsValid(buffer))
elog(ERROR, "bad buffer ID: %d", buffer);@@ -5842,6 +5840,8 @@ MarkBufferDirtyHint(Buffer buffer, bool buffer_std)
return;
}+ bufHdr = GetBufferDescriptor(buffer - 1);
Yep, that's clearly wrong. We are lucky that it does not blow up
today but that's a ticking bomb.
I think it *should* blow up. It doesn't because we're lacking assertions in
GetBufferDescriptor(). But I don't think the assertions added in the patch are
quite right.
We can't trivially add the correct assertions, because somebody though it was
a good idea to give GetBufferDescriptor() a uint32 parameter, which seems
completely wrong to me.
Even with that in mind, the result leads to a non-defined behavior.
I'm not sure it really does, but it's clearly wrong.
Greetings,
Andres Freund
On Wed, Jun 10, 2026 at 10:36:22AM -0400, Andres Freund wrote:
I think it *should* blow up. It doesn't because we're lacking assertions in
GetBufferDescriptor(). But I don't think the assertions added in the patch are
quite right.We can't trivially add the correct assertions, because somebody though it was
a good idea to give GetBufferDescriptor() a uint32 parameter, which seems
completely wrong to me.
This one is not as old as I expected: 3ac88fddd92c. You're right that
switching that to be signed would be a correct first step forward.
--
Michael