BgBufferSync(): clarification about reusable_buffers variable
I have been reading the source code of the BgWriter, and there is some
code in BgBufferSync() that I don't fully understand.
In BgBufferSync(), we have the following code:
while (num_to_scan > 0 && reusable_buffers < upcoming_alloc_est)
{
int sync_state = SyncOneBuffer(next_to_clean, true,
wb_context);
if (++next_to_clean >= NBuffers)
{
next_to_clean = 0;
next_passes++;
}
num_to_scan--;
if (sync_state & BUF_WRITTEN)
{
reusable_buffers++;
if (++num_written >= bgwriter_lru_maxpages)
{
PendingBgWriterStats.maxwritten_clean++;
break;
}
}
else if (sync_state & BUF_REUSABLE)
reusable_buffers++;
}
In SyncOneBuffer(), we lock the bufHdr and then check if both the
refcount and usage_count are zero. If so, we mark the return value as
BUF_REUSABLE.
My understanding is that this means that the buffer could be reused
when am empty shared buffer is needed by a backend. However, in the
code above, we seem to track these in the reusable_buffers variable.
But that variable is always incremented when the buffer was written in
SyncOneBuffer() even though that buffer might have a non-zero refcount
or non-zero usage_count.
Initially, I thought this might be a bug, but then realized that the
reusable_buffers variable is used to make the BGWriter more or less
aggressive in reading pages, and now I'm unsure.
I discussed this with some of the PG committers at Microsoft, and we
all agree that the current code is a little confusing and it might
help if some more comments were added.
Does anybody know what the reusable_buffers variable really accounts
for, and whether the code as currently written is correct?
Thanks,
Marcel
--
Marcel van der Holst
mvdholst@gmail.com
Hi Marcel,
On Sat, Feb 8, 2025 at 6:24 AM M vd H <mvdholst@gmail.com> wrote:
I have been reading the source code of the BgWriter, and there is some
code in BgBufferSync() that I don't fully understand.In BgBufferSync(), we have the following code:
while (num_to_scan > 0 && reusable_buffers < upcoming_alloc_est)
{
int sync_state = SyncOneBuffer(next_to_clean, true,
wb_context);if (++next_to_clean >= NBuffers)
{
next_to_clean = 0;
next_passes++;
}
num_to_scan--;if (sync_state & BUF_WRITTEN)
{
reusable_buffers++;
if (++num_written >= bgwriter_lru_maxpages)
{
PendingBgWriterStats.maxwritten_clean++;
break;
}
}
else if (sync_state & BUF_REUSABLE)
reusable_buffers++;
}In SyncOneBuffer(), we lock the bufHdr and then check if both the
refcount and usage_count are zero. If so, we mark the return value as
BUF_REUSABLE.
My understanding is that this means that the buffer could be reused
when am empty shared buffer is needed by a backend. However, in the
code above, we seem to track these in the reusable_buffers variable.
But that variable is always incremented when the buffer was written in
SyncOneBuffer() even though that buffer might have a non-zero refcount
or non-zero usage_count.
I also think tha reusable_buffers keep track of the number of reusable
buffers. BgBufferSync() calls SyncOneBuffer() with skip_recently_used
= true. In that case, if SyncOneBuffer() finds the buffer with
refcount or usage_count non-zero, it just unlocks the header and
returns. Hence when called from BgBufferSync(), SyncOneBuffer() would
write a buffer only when it is not used. Hence the result would be 0
or BUF_REUSABLE or BUF_REUSABLE | BUF_WRITTEN. It can never be just
BUF_WRITTEN.
I guess, a patch like the one attached will be more readable and clear.
I ran pgbench for 5 minutes with this patch applied and didn't see the
Assert failing. But I don't think that's a good enough test to cover
all scenarios.
--
Best Wishes,
Ashutosh Bapat
Attachments:
bg_buffer_sync_clarity.patchapplication/x-patch; name=bg_buffer_sync_clarity.patchDownload
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 80b0d0c5ded..bdacd5ad980 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -3414,17 +3414,19 @@ BgBufferSync(WritebackContext *wb_context)
}
num_to_scan--;
+ if (sync_state & BUF_REUSABLE)
+ reusable_buffers++;
+
if (sync_state & BUF_WRITTEN)
{
- reusable_buffers++;
+ Assert(sync_state & BUF_REUSABLE);
+
if (++num_written >= bgwriter_lru_maxpages)
{
PendingBgWriterStats.maxwritten_clean++;
break;
}
}
- else if (sync_state & BUF_REUSABLE)
- reusable_buffers++;
}
PendingBgWriterStats.buf_written_clean += num_written;
Hi, tks for working on this. I had a chance to look at this while
googling BgBufferSync
function.
I also think tha reusable_buffers keep track of the number of reusable
buffers. BgBufferSync() calls SyncOneBuffer() with skip_recently_used
= true. In that case, if SyncOneBuffer() finds the buffer with
refcount or usage_count non-zero, it just unlocks the header and
returns. Hence when called from BgBufferSync(), SyncOneBuffer() would
write a buffer only when it is not used. Hence the result would be 0
or BUF_REUSABLE or BUF_REUSABLE | BUF_WRITTEN. It can never be just
BUF_WRITTEN.
Agrees. For this call stack, if skip_recently_used is set to be true,
sync_state cannot be BUF_WRITTEN alone.
I guess, a patch like the one attached will be more readable and clear.
I'm new to this part of code, and I found the patch version seems to be
more straightforward and less prone to misinterpretation.
I ran pgbench for 5 minutes with this patch applied and didn't see the
Assert failing. But I don't think that's a good enough test to cover
all scenarios.
The patch LGTM.
--
Show quoted text
Best Wishes,
Ashutosh Bapat
Here's a rebase.
Attachments:
bg_buffer_sync_clarity.patchapplication/octet-stream; name=bg_buffer_sync_clarity.patchDownload
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 0b317d2d80..93ba3dad52 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -3853,18 +3853,20 @@ BgBufferSync(WritebackContext *wb_context)
next_passes++;
}
num_to_scan--;
+
+ if (sync_state & BUF_REUSABLE)
+ reusable_buffers++;
if (sync_state & BUF_WRITTEN)
{
- reusable_buffers++;
+ Assert(sync_state & BUF_REUSABLE);
+
if (++num_written >= bgwriter_lru_maxpages)
{
PendingBgWriterStats.maxwritten_clean++;
break;
}
}
- else if (sync_state & BUF_REUSABLE)
- reusable_buffers++;
}
PendingBgWriterStats.buf_written_clean += num_written;