Extension Enhancement: Buffer Invalidation in pg_buffercache
I hope this email finds you well. I am excited to share that I have
extended the functionality of the `pg_buffercache` extension by
implementing buffer invalidation capability, as requested by some
PostgreSQL contributors for improved testing scenarios.
This marks my first time submitting a patch to pgsql-hackers, and I am
eager to receive your expert feedback on the changes made. Your
insights are invaluable, and any review or comments you provide will
be greatly appreciated.
The primary objective of this enhancement is to enable explicit buffer
invalidation within the `pg_buffercache` extension. By doing so, we
can simulate scenarios where buffers are invalidated and observe the
resulting behavior in PostgreSQL.
As part of this patch, a new function or mechanism has been introduced
to facilitate buffer invalidation. I would like to hear your thoughts
on whether this approach provides a good user interface for this
functionality. Additionally, I seek your evaluation of the buffer
locking protocol employed in the extension to ensure its correctness
and efficiency.
Please note that I plan to add comprehensive documentation once the
details of this enhancement are agreed upon. This documentation will
serve as a valuable resource for users and contributors alike. I
believe that your expertise will help uncover any potential issues and
opportunities for further improvement.
I have attached the patch file to this email for your convenience.
Your valuable time and consideration in reviewing this extension are
sincerely appreciated.
Thank you for your continued support and guidance. I am looking
forward to your feedback and collaboration in enhancing the PostgreSQL
ecosystem.
The working of the extension:
1. Creating the extension pg_buffercache and then call select query on
a table and note the buffer to be cleared.
pgbench=# create extension pg_buffercache;
CREATE EXTENSION
pgbench=# select count(*) from pgbench_accounts;
count
--------
100000
(1 row)
pgbench=# SELECT *
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
bufferid | relfilenode | reltablespace | reldatabase | relforknumber
| relblocknumber | isdirty | usagecount | pinning_backends
----------+-------------+---------------+-------------+---------------+----------------+---------+------------+------------------
233 | 16397 | 1663 | 16384 | 0
| 0 | f | 1 | 0
234 | 16397 | 1663 | 16384 | 0
| 1 | f | 1 | 0
235 | 16397 | 1663 | 16384 | 0
| 2 | f | 1 | 0
236 | 16397 | 1663 | 16384 | 0
| 3 | f | 1 | 0
237 | 16397 | 1663 | 16384 | 0
| 4 | f | 1 | 0
2. Clearing a single buffer by entering the bufferid.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1660
(1 row)
pgbench=# select pg_buffercache_invalidate(233);
pg_buffercache_invalidate
---------------------------
t
(1 row)
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)
3. Clearing the entire buffer for a relation using the function.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)
pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
0
(1 row)
Best regards,
Palak
Attachments:
v1-0001-Invalidate-Buffer-By-Bufnum.patchapplication/octet-stream; name=v1-0001-Invalidate-Buffer-By-Bufnum.patchDownload
From 6bab3ba50b43c1bf3aa4ab28ec51b98bd50ac444 Mon Sep 17 00:00:00 2001
From: Palak <palakchaturvedi2843@gmail.com>
Date: Fri, 30 Jun 2023 08:21:06 +0000
Subject: [PATCH v1] Invalidate Buffer By Bufnum
---
contrib/pg_buffercache/Makefile | 2 +-
.../pg_buffercache--1.4--1.5.sql | 6 ++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 27 ++++++++
src/backend/storage/buffer/bufmgr.c | 64 +++++++++++++++++++
src/include/storage/bufmgr.h | 3 +
6 files changed, 102 insertions(+), 2 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d6b58d4da9..eae65ead9e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -8,7 +8,7 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
- pg_buffercache--1.3--1.4.sql
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
new file mode 100644
index 0000000000..c7e47456d7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
@@ -0,0 +1,6 @@
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
+
+CREATE FUNCTION pg_buffercache_invalidate(IN int)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
+LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index a82ae5f9bb..5ee875f77d 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 3316732365..6dd02fe6af 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -64,6 +64,33 @@ PG_FUNCTION_INFO_V1(pg_buffercache_pages);
PG_FUNCTION_INFO_V1(pg_buffercache_summary);
PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
+Datum
+pg_buffercache_invalidate(PG_FUNCTION_ARGS)
+{
+ Buffer bufnum;
+ bool result;
+
+ if (PG_ARGISNULL(0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum cannot be NULL")));
+ }
+
+ bufnum = PG_GETARG_INT32(0);
+ if (bufnum < 0 || bufnum > NBuffers)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum is not valid")));
+
+ }
+
+ result = TryInvalidateBuffer(bufnum);
+ PG_RETURN_BOOL(result);
+}
+
Datum
pg_buffercache_pages(PG_FUNCTION_ARGS)
{
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 3c59bbd04e..376afcf996 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -5592,3 +5592,67 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+/*
+Try Invalidating a buffer using bufnum.
+If the buffer is invalid, the function returns false.
+The function checks for dirty buffer and flushes the dirty buffer before invalidating.
+If the buffer is still dirty it returns false.
+*/
+bool
+TryInvalidateBuffer(Buffer bufnum)
+{
+ BufferDesc *bufHdr = GetBufferDescriptor(bufnum - 1);
+ uint32 buf_state;
+
+ ReservePrivateRefCountEntry();
+
+ buf_state = LockBufHdr(bufHdr);
+ if ((buf_state & BM_VALID) == BM_VALID)
+ {
+ /*
+ * The buffer is pinned therefore cannot invalidate.
+ */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+ if ((buf_state & BM_DIRTY) == BM_DIRTY)
+ {
+ /*
+ * Try once to flush the dirty buffer.
+ */
+ PinBuffer_Locked(bufHdr);
+ LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
+ FlushBuffer(bufHdr, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
+ UnpinBuffer(bufHdr);
+ buf_state = LockBufHdr(bufHdr);
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ /*
+ * If its dirty again or not valid anymore give up.
+ */
+
+ if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID))
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ }
+
+ InvalidateBuffer(bufHdr);
+ return true;
+ }
+ else
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+}
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 0f5fb6be00..4ba3d9089b 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -252,6 +252,9 @@ extern bool BgBufferSync(struct WritebackContext *wb_context);
extern void TestForOldSnapshot_impl(Snapshot snapshot, Relation relation);
+
+extern bool TryInvalidateBuffer(Buffer bufnum);
+
/* in buf_init.c */
extern void InitBufferPool(void);
extern Size BufferShmemSize(void);
--
2.25.1
On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:
pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);
Hi Palak,
Thanks for working on this! I think this will be very useful for
testing existing workloads but also for testing future work on
prefetching with AIO (and DIO), work on putting SLRUs (or anything
else) into the buffer pool, nearby proposals for caching buffer
mapping information, etc etc.
Palak and I talked about this idea a bit last week (stimulated by a
recent thread[1]/messages/by-id/CAFSGpE3y_oMK1uHhcHxGxBxs+KrjMMdGrE+6HHOu0vttVET0UQ@mail.gmail.com, but the topic has certainly come up before), and we
discussed some different ways one could specify which pages are
dropped. For example, perhaps the pg_prewarm extension could have an
'unwarm' option instead. I personally thought the buffer ID-based
approach was quite good because it's extremely simple, while giving
the user the full power of SQL to say which buffers. Half a table?
Visibility map? Everything? Root page of an index? I think that's
probably better than something that requires more code and
complication but is less flexible in the end. It feels like the right
level of rawness for something primarily of interest to hackers and
advanced users. I don't think it matters that there is a window
between selecting a buffer ID and invalidating it, for the intended
use cases. That's my vote, anyway, let's see if others have other
ideas...
We also talked a bit about how one might control the kernel page cache
in more fine-grained ways for testing purposes, but it seems like the
pgfincore project has that covered with its pgfadvise_willneed() and
pgfadvise_dontneed(). IMHO that project could use more page-oriented
operations (instead of just counts and coarse grains operations) but
that's something that could be material for patches to send to the
extension maintainers. This work, in contrast, is more tangled up
with bufmgr.c internals, so it feels like this feature belongs in a
core contrib module.
Some initial thoughts on the patch:
I wonder if we should include a simple exercise in
contrib/pg_buffercache/sql/pg_buffercache.sql. One problem is that
it's not guaranteed to succeed in general. It doesn't wait for pins
to go away, and it doesn't retry cleaning dirty buffers after one
attempt, it just returns false, which I think is probably the right
approach, but it makes the behaviour too non-deterministic for simple
tests. Perhaps it's enough to include an exercise where we call it a
few times to hit a couple of cases, but not verify what effect it has.
It should be restricted by role, but I wonder which role it should be.
Testing for superuser is now out of fashion.
Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
to do the same. That's because PostgreSQL is currently in transition
from autoconf/gmake to meson/ninja[2]https://wiki.postgresql.org/wiki/Meson, so for now we have to maintain
both build systems. That's why it fails to build in some CI tasks[3]http://cfbot.cputube.org/palak-chaturvedi.html.
You can enable CI in your own GitHub account if you want to run test
builds on several operating systems, see [4]https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD for info.
[1]: /messages/by-id/CAFSGpE3y_oMK1uHhcHxGxBxs+KrjMMdGrE+6HHOu0vttVET0UQ@mail.gmail.com
[2]: https://wiki.postgresql.org/wiki/Meson
[3]: http://cfbot.cputube.org/palak-chaturvedi.html
[4]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD
Hi Thomas,
Thank you for your suggestions. I have added the sql in the meson
build as well.
Show quoted text
On Sat, 1 Jul 2023 at 03:39, Thomas Munro <thomas.munro@gmail.com> wrote:
On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);Hi Palak,
Thanks for working on this! I think this will be very useful for
testing existing workloads but also for testing future work on
prefetching with AIO (and DIO), work on putting SLRUs (or anything
else) into the buffer pool, nearby proposals for caching buffer
mapping information, etc etc.Palak and I talked about this idea a bit last week (stimulated by a
recent thread[1], but the topic has certainly come up before), and we
discussed some different ways one could specify which pages are
dropped. For example, perhaps the pg_prewarm extension could have an
'unwarm' option instead. I personally thought the buffer ID-based
approach was quite good because it's extremely simple, while giving
the user the full power of SQL to say which buffers. Half a table?
Visibility map? Everything? Root page of an index? I think that's
probably better than something that requires more code and
complication but is less flexible in the end. It feels like the right
level of rawness for something primarily of interest to hackers and
advanced users. I don't think it matters that there is a window
between selecting a buffer ID and invalidating it, for the intended
use cases. That's my vote, anyway, let's see if others have other
ideas...We also talked a bit about how one might control the kernel page cache
in more fine-grained ways for testing purposes, but it seems like the
pgfincore project has that covered with its pgfadvise_willneed() and
pgfadvise_dontneed(). IMHO that project could use more page-oriented
operations (instead of just counts and coarse grains operations) but
that's something that could be material for patches to send to the
extension maintainers. This work, in contrast, is more tangled up
with bufmgr.c internals, so it feels like this feature belongs in a
core contrib module.Some initial thoughts on the patch:
I wonder if we should include a simple exercise in
contrib/pg_buffercache/sql/pg_buffercache.sql. One problem is that
it's not guaranteed to succeed in general. It doesn't wait for pins
to go away, and it doesn't retry cleaning dirty buffers after one
attempt, it just returns false, which I think is probably the right
approach, but it makes the behaviour too non-deterministic for simple
tests. Perhaps it's enough to include an exercise where we call it a
few times to hit a couple of cases, but not verify what effect it has.It should be restricted by role, but I wonder which role it should be.
Testing for superuser is now out of fashion.Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
to do the same. That's because PostgreSQL is currently in transition
from autoconf/gmake to meson/ninja[2], so for now we have to maintain
both build systems. That's why it fails to build in some CI tasks[3].
You can enable CI in your own GitHub account if you want to run test
builds on several operating systems, see [4] for info.[1] /messages/by-id/CAFSGpE3y_oMK1uHhcHxGxBxs+KrjMMdGrE+6HHOu0vttVET0UQ@mail.gmail.com
[2] https://wiki.postgresql.org/wiki/Meson
[3] http://cfbot.cputube.org/palak-chaturvedi.html
[4] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD
Attachments:
v1-0001-Invalidate-Buffer-By-Bufnum.patchapplication/octet-stream; name=v1-0001-Invalidate-Buffer-By-Bufnum.patchDownload
From 02ea352d84ed87e156617de8b8020811680cb412 Mon Sep 17 00:00:00 2001
From: Palak <palakchaturvedi2843@gmail.com>
Date: Fri, 30 Jun 2023 08:21:06 +0000
Subject: [PATCH v1] Invalidate Buffer By Bufnum
---
contrib/pg_buffercache/Makefile | 2 +-
contrib/pg_buffercache/meson.build | 1 +
.../pg_buffercache--1.4--1.5.sql | 6 ++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 27 ++++++++
src/backend/storage/buffer/bufmgr.c | 64 +++++++++++++++++++
src/include/storage/bufmgr.h | 3 +
7 files changed, 103 insertions(+), 2 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d6b58d4da9..eae65ead9e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -8,7 +8,7 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
- pg_buffercache--1.3--1.4.sql
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/meson.build b/contrib/pg_buffercache/meson.build
index c51edf37d1..748463bc19 100644
--- a/contrib/pg_buffercache/meson.build
+++ b/contrib/pg_buffercache/meson.build
@@ -22,6 +22,7 @@ install_data(
'pg_buffercache--1.2--1.3.sql',
'pg_buffercache--1.2.sql',
'pg_buffercache--1.3--1.4.sql',
+ 'pg_buffercache--1.4--1.5.sql',
'pg_buffercache.control',
kwargs: contrib_data_args,
)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
new file mode 100644
index 0000000000..c7e47456d7
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
@@ -0,0 +1,6 @@
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
+
+CREATE FUNCTION pg_buffercache_invalidate(IN int)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
+LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index a82ae5f9bb..5ee875f77d 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 3316732365..6dd02fe6af 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -64,6 +64,33 @@ PG_FUNCTION_INFO_V1(pg_buffercache_pages);
PG_FUNCTION_INFO_V1(pg_buffercache_summary);
PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
+Datum
+pg_buffercache_invalidate(PG_FUNCTION_ARGS)
+{
+ Buffer bufnum;
+ bool result;
+
+ if (PG_ARGISNULL(0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum cannot be NULL")));
+ }
+
+ bufnum = PG_GETARG_INT32(0);
+ if (bufnum < 0 || bufnum > NBuffers)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum is not valid")));
+
+ }
+
+ result = TryInvalidateBuffer(bufnum);
+ PG_RETURN_BOOL(result);
+}
+
Datum
pg_buffercache_pages(PG_FUNCTION_ARGS)
{
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 3c59bbd04e..376afcf996 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -5592,3 +5592,67 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+/*
+Try Invalidating a buffer using bufnum.
+If the buffer is invalid, the function returns false.
+The function checks for dirty buffer and flushes the dirty buffer before invalidating.
+If the buffer is still dirty it returns false.
+*/
+bool
+TryInvalidateBuffer(Buffer bufnum)
+{
+ BufferDesc *bufHdr = GetBufferDescriptor(bufnum - 1);
+ uint32 buf_state;
+
+ ReservePrivateRefCountEntry();
+
+ buf_state = LockBufHdr(bufHdr);
+ if ((buf_state & BM_VALID) == BM_VALID)
+ {
+ /*
+ * The buffer is pinned therefore cannot invalidate.
+ */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+ if ((buf_state & BM_DIRTY) == BM_DIRTY)
+ {
+ /*
+ * Try once to flush the dirty buffer.
+ */
+ PinBuffer_Locked(bufHdr);
+ LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
+ FlushBuffer(bufHdr, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
+ UnpinBuffer(bufHdr);
+ buf_state = LockBufHdr(bufHdr);
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ /*
+ * If its dirty again or not valid anymore give up.
+ */
+
+ if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID))
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ }
+
+ InvalidateBuffer(bufHdr);
+ return true;
+ }
+ else
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+}
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 0f5fb6be00..4ba3d9089b 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -252,6 +252,9 @@ extern bool BgBufferSync(struct WritebackContext *wb_context);
extern void TestForOldSnapshot_impl(Snapshot snapshot, Relation relation);
+
+extern bool TryInvalidateBuffer(Buffer bufnum);
+
/* in buf_init.c */
extern void InitBufferPool(void);
extern Size BufferShmemSize(void);
--
2.25.1
On Mon, Jul 3, 2023 at 4:26 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:
Hi Thomas,
Thank you for your suggestions. I have added the sql in the meson
build as well.On Sat, 1 Jul 2023 at 03:39, Thomas Munro <thomas.munro@gmail.com> wrote:
On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);Hi Palak,
Thanks for working on this! I think this will be very useful for
testing existing workloads but also for testing future work on
prefetching with AIO (and DIO), work on putting SLRUs (or anything
else) into the buffer pool, nearby proposals for caching buffer
mapping information, etc etc.Palak and I talked about this idea a bit last week (stimulated by a
recent thread[1], but the topic has certainly come up before), and we
discussed some different ways one could specify which pages are
dropped. For example, perhaps the pg_prewarm extension could have an
'unwarm' option instead. I personally thought the buffer ID-based
approach was quite good because it's extremely simple, while giving
the user the full power of SQL to say which buffers. Half a table?
Visibility map? Everything? Root page of an index? I think that's
probably better than something that requires more code and
complication but is less flexible in the end. It feels like the right
level of rawness for something primarily of interest to hackers and
advanced users. I don't think it matters that there is a window
between selecting a buffer ID and invalidating it, for the intended
use cases. That's my vote, anyway, let's see if others have other
ideas...We also talked a bit about how one might control the kernel page cache
in more fine-grained ways for testing purposes, but it seems like the
pgfincore project has that covered with its pgfadvise_willneed() and
pgfadvise_dontneed(). IMHO that project could use more page-oriented
operations (instead of just counts and coarse grains operations) but
that's something that could be material for patches to send to the
extension maintainers. This work, in contrast, is more tangled up
with bufmgr.c internals, so it feels like this feature belongs in a
core contrib module.Some initial thoughts on the patch:
I wonder if we should include a simple exercise in
contrib/pg_buffercache/sql/pg_buffercache.sql. One problem is that
it's not guaranteed to succeed in general. It doesn't wait for pins
to go away, and it doesn't retry cleaning dirty buffers after one
attempt, it just returns false, which I think is probably the right
approach, but it makes the behaviour too non-deterministic for simple
tests. Perhaps it's enough to include an exercise where we call it a
few times to hit a couple of cases, but not verify what effect it has.It should be restricted by role, but I wonder which role it should be.
Testing for superuser is now out of fashion.Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
to do the same. That's because PostgreSQL is currently in transition
from autoconf/gmake to meson/ninja[2], so for now we have to maintain
both build systems. That's why it fails to build in some CI tasks[3].
You can enable CI in your own GitHub account if you want to run test
builds on several operating systems, see [4] for info.[1] /messages/by-id/CAFSGpE3y_oMK1uHhcHxGxBxs+KrjMMdGrE+6HHOu0vttVET0UQ@mail.gmail.com
[2] https://wiki.postgresql.org/wiki/Meson
[3] http://cfbot.cputube.org/palak-chaturvedi.html
[4] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD
newbie question:
quote from: https://www.interdb.jp/pg/pgsql08.html
Pinned: When the corresponding buffer pool slot stores a page and any PostgreSQL processes are accessing the page (i.e. refcount and usage_count are greater than or equal to 1), the state of this buffer descriptor is pinned.
Unpinned: When the corresponding buffer pool slot stores a page but no PostgreSQL processes are accessing the page (i.e. usage_count is greater than or equal to 1, but refcount is 0), the state of this buffer descriptor is unpinned.
So do you need to check BUF_STATE_GET_REFCOUNT(buf_state) and
BUF_STATE_GET_USAGECOUNT(state)?
hi,
I don't think we need to check the usage count. Because we are
clearing all the buffers that are not pinned.
Checking the usage count is for buffer replacement since we are not
replacing it does not matter.
Show quoted text
On Mon, 3 Jul 2023 at 21:16, jian he <jian.universality@gmail.com> wrote:
On Mon, Jul 3, 2023 at 4:26 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:Hi Thomas,
Thank you for your suggestions. I have added the sql in the meson
build as well.On Sat, 1 Jul 2023 at 03:39, Thomas Munro <thomas.munro@gmail.com> wrote:
On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);Hi Palak,
Thanks for working on this! I think this will be very useful for
testing existing workloads but also for testing future work on
prefetching with AIO (and DIO), work on putting SLRUs (or anything
else) into the buffer pool, nearby proposals for caching buffer
mapping information, etc etc.Palak and I talked about this idea a bit last week (stimulated by a
recent thread[1], but the topic has certainly come up before), and we
discussed some different ways one could specify which pages are
dropped. For example, perhaps the pg_prewarm extension could have an
'unwarm' option instead. I personally thought the buffer ID-based
approach was quite good because it's extremely simple, while giving
the user the full power of SQL to say which buffers. Half a table?
Visibility map? Everything? Root page of an index? I think that's
probably better than something that requires more code and
complication but is less flexible in the end. It feels like the right
level of rawness for something primarily of interest to hackers and
advanced users. I don't think it matters that there is a window
between selecting a buffer ID and invalidating it, for the intended
use cases. That's my vote, anyway, let's see if others have other
ideas...We also talked a bit about how one might control the kernel page cache
in more fine-grained ways for testing purposes, but it seems like the
pgfincore project has that covered with its pgfadvise_willneed() and
pgfadvise_dontneed(). IMHO that project could use more page-oriented
operations (instead of just counts and coarse grains operations) but
that's something that could be material for patches to send to the
extension maintainers. This work, in contrast, is more tangled up
with bufmgr.c internals, so it feels like this feature belongs in a
core contrib module.Some initial thoughts on the patch:
I wonder if we should include a simple exercise in
contrib/pg_buffercache/sql/pg_buffercache.sql. One problem is that
it's not guaranteed to succeed in general. It doesn't wait for pins
to go away, and it doesn't retry cleaning dirty buffers after one
attempt, it just returns false, which I think is probably the right
approach, but it makes the behaviour too non-deterministic for simple
tests. Perhaps it's enough to include an exercise where we call it a
few times to hit a couple of cases, but not verify what effect it has.It should be restricted by role, but I wonder which role it should be.
Testing for superuser is now out of fashion.Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
to do the same. That's because PostgreSQL is currently in transition
from autoconf/gmake to meson/ninja[2], so for now we have to maintain
both build systems. That's why it fails to build in some CI tasks[3].
You can enable CI in your own GitHub account if you want to run test
builds on several operating systems, see [4] for info.[1] /messages/by-id/CAFSGpE3y_oMK1uHhcHxGxBxs+KrjMMdGrE+6HHOu0vttVET0UQ@mail.gmail.com
[2] https://wiki.postgresql.org/wiki/Meson
[3] http://cfbot.cputube.org/palak-chaturvedi.html
[4] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEADnewbie question:
quote from: https://www.interdb.jp/pg/pgsql08.htmlPinned: When the corresponding buffer pool slot stores a page and any PostgreSQL processes are accessing the page (i.e. refcount and usage_count are greater than or equal to 1), the state of this buffer descriptor is pinned.
Unpinned: When the corresponding buffer pool slot stores a page but no PostgreSQL processes are accessing the page (i.e. usage_count is greater than or equal to 1, but refcount is 0), the state of this buffer descriptor is unpinned.So do you need to check BUF_STATE_GET_REFCOUNT(buf_state) and
BUF_STATE_GET_USAGECOUNT(state)?
On Mon, 03 Jul 2023 at 16:26, Palak Chaturvedi <chaturvedipalak1911@gmail.com> wrote:
Hi Thomas,
Thank you for your suggestions. I have added the sql in the meson
build as well.On Sat, 1 Jul 2023 at 03:39, Thomas Munro <thomas.munro@gmail.com> wrote:
On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);Hi Palak,
Thanks for working on this! I think this will be very useful for
testing existing workloads but also for testing future work on
prefetching with AIO (and DIO), work on putting SLRUs (or anything
else) into the buffer pool, nearby proposals for caching buffer
mapping information, etc etc.Palak and I talked about this idea a bit last week (stimulated by a
recent thread[1], but the topic has certainly come up before), and we
discussed some different ways one could specify which pages are
dropped. For example, perhaps the pg_prewarm extension could have an
'unwarm' option instead. I personally thought the buffer ID-based
approach was quite good because it's extremely simple, while giving
the user the full power of SQL to say which buffers. Half a table?
Visibility map? Everything? Root page of an index? I think that's
probably better than something that requires more code and
complication but is less flexible in the end. It feels like the right
level of rawness for something primarily of interest to hackers and
advanced users. I don't think it matters that there is a window
between selecting a buffer ID and invalidating it, for the intended
use cases. That's my vote, anyway, let's see if others have other
ideas...We also talked a bit about how one might control the kernel page cache
in more fine-grained ways for testing purposes, but it seems like the
pgfincore project has that covered with its pgfadvise_willneed() and
pgfadvise_dontneed(). IMHO that project could use more page-oriented
operations (instead of just counts and coarse grains operations) but
that's something that could be material for patches to send to the
extension maintainers. This work, in contrast, is more tangled up
with bufmgr.c internals, so it feels like this feature belongs in a
core contrib module.Some initial thoughts on the patch:
I wonder if we should include a simple exercise in
contrib/pg_buffercache/sql/pg_buffercache.sql. One problem is that
it's not guaranteed to succeed in general. It doesn't wait for pins
to go away, and it doesn't retry cleaning dirty buffers after one
attempt, it just returns false, which I think is probably the right
approach, but it makes the behaviour too non-deterministic for simple
tests. Perhaps it's enough to include an exercise where we call it a
few times to hit a couple of cases, but not verify what effect it has.It should be restricted by role, but I wonder which role it should be.
Testing for superuser is now out of fashion.Where the Makefile mentions 1.4--1.5.sql, the meson.build file needs
to do the same. That's because PostgreSQL is currently in transition
from autoconf/gmake to meson/ninja[2], so for now we have to maintain
both build systems. That's why it fails to build in some CI tasks[3].
You can enable CI in your own GitHub account if you want to run test
builds on several operating systems, see [4] for info.[1] /messages/by-id/CAFSGpE3y_oMK1uHhcHxGxBxs+KrjMMdGrE+6HHOu0vttVET0UQ@mail.gmail.com
[2] https://wiki.postgresql.org/wiki/Meson
[3] http://cfbot.cputube.org/palak-chaturvedi.html
[4] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/tools/ci/README;hb=HEAD
I think, zero is not a valid buffer identifier. See src/include/storage/buf.h.
+ bufnum = PG_GETARG_INT32(0);
+ if (bufnum < 0 || bufnum > NBuffers)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum is not valid")));
+
+ }
If we use SELECT pg_buffercache_invalidate(0), it will crash.
--
Regrads,
Japin Li.
the following will also crash. no idea why.
begin;
select count(*) from onek;
select relpages from pg_class where relname = 'onek'; --queryA
SELECT count(*) FROM pg_buffercache WHERE relfilenode =
pg_relation_filenode('onek'::regclass); --queryB
insert into onek values(default);
select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);
---------------------------------
queryA returns 35, queryB returns 37.
----------------------------------
crash info:
test_dev=*# insert into onek values(default);
INSERT 0 1
test_dev=*# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);
TRAP: failed Assert("resarr->nitems < resarr->maxitems"), File:
"../../Desktop/pg_sources/main/postgres/src/backend/utils/resowner/resowner.c",
Line: 275, PID: 1533312
postgres: jian test_dev [local]
SELECT(ExceptionalCondition+0xa1)[0x55fc8f8d14e1]
postgres: jian test_dev [local] SELECT(+0x9e7ab3)[0x55fc8f915ab3]
postgres: jian test_dev [local]
SELECT(ResourceOwnerRememberBuffer+0x1d)[0x55fc8f91696d]
postgres: jian test_dev [local] SELECT(+0x78ab17)[0x55fc8f6b8b17]
postgres: jian test_dev [local]
SELECT(TryInvalidateBuffer+0x6d)[0x55fc8f6c507d]
/home/jian/postgres/pg16_test/lib/pg_buffercache.so(pg_buffercache_invalidate+0x3d)[0x7f2361837abd]
postgres: jian test_dev [local] SELECT(+0x57eebc)[0x55fc8f4acebc]
postgres: jian test_dev [local]
SELECT(ExecInterpExprStillValid+0x3c)[0x55fc8f4a6e2c]
postgres: jian test_dev [local] SELECT(+0x5a0f16)[0x55fc8f4cef16]
postgres: jian test_dev [local] SELECT(+0x5a3588)[0x55fc8f4d1588]
postgres: jian test_dev [local] SELECT(+0x58f747)[0x55fc8f4bd747]
postgres: jian test_dev [local]
SELECT(standard_ExecutorRun+0x1f0)[0x55fc8f4b29f0]
postgres: jian test_dev [local] SELECT(ExecutorRun+0x46)[0x55fc8f4b2d16]
postgres: jian test_dev [local] SELECT(+0x7eb3b0)[0x55fc8f7193b0]
postgres: jian test_dev [local] SELECT(PortalRun+0x1eb)[0x55fc8f71b7ab]
postgres: jian test_dev [local] SELECT(+0x7e8cf4)[0x55fc8f716cf4]
postgres: jian test_dev [local] SELECT(PostgresMain+0x134f)[0x55fc8f71869f]
postgres: jian test_dev [local] SELECT(+0x70f80c)[0x55fc8f63d80c]
postgres: jian test_dev [local]
SELECT(PostmasterMain+0x1758)[0x55fc8f63f278]
postgres: jian test_dev [local] SELECT(main+0x27e)[0x55fc8f27067e]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f2361629d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f2361629e40]
postgres: jian test_dev [local] SELECT(_start+0x25)[0x55fc8f272bb5]
2023-07-04 16:56:13.088 CST [1532822] LOG: server process (PID 1533312)
was terminated by signal 6: Aborted
2023-07-04 16:56:13.088 CST [1532822] DETAIL: Failed process was running:
select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);
2023-07-04 16:56:13.088 CST [1532822] LOG: terminating any other active
server processes
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: 2023-07-04
16:56:13.091 CST [1533381] FATAL: the database system is in recovery mode
Failed.
The connection to the server was lost. Attempting reset: Failed.
On Tue, 04 Jul 2023 at 17:00, jian he <jian.universality@gmail.com> wrote:
the following will also crash. no idea why.
begin;
select count(*) from onek;
select relpages from pg_class where relname = 'onek'; --queryASELECT count(*) FROM pg_buffercache WHERE relfilenode =
pg_relation_filenode('onek'::regclass); --queryBinsert into onek values(default);
select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);---------------------------------
queryA returns 35, queryB returns 37.
----------------------------------
crash info:
test_dev=*# insert into onek values(default);
INSERT 0 1
test_dev=*# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);
TRAP: failed Assert("resarr->nitems < resarr->maxitems"), File:
"../../Desktop/pg_sources/main/postgres/src/backend/utils/resowner/resowner.c",
Line: 275, PID: 1533312
According to the comments of ResourceArrayAdd(), the caller must have previously
done ResourceArrayEnlarge(). I tried to call ResourceOwnerEnlargeBuffers() before
PinBuffer_Locked(), so it can avoid this crash.
if ((buf_state & BM_DIRTY) == BM_DIRTY)
{
+ /* make sure we can handle the pin */
+ ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
+
/*
* Try once to flush the dirty buffer.
*/
PinBuffer_Locked(bufHdr);
--
Regrads,
Japin Li.
On Tue, Jul 4, 2023 at 5:45 PM Japin Li <japinli@hotmail.com> wrote:
On Tue, 04 Jul 2023 at 17:00, jian he <jian.universality@gmail.com> wrote:
the following will also crash. no idea why.
begin;
select count(*) from onek;
select relpages from pg_class where relname = 'onek'; --queryASELECT count(*) FROM pg_buffercache WHERE relfilenode =
pg_relation_filenode('onek'::regclass); --queryBinsert into onek values(default);
select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);---------------------------------
queryA returns 35, queryB returns 37.
----------------------------------
crash info:
test_dev=*# insert into onek values(default);
INSERT 0 1
test_dev=*# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass);
TRAP: failed Assert("resarr->nitems < resarr->maxitems"), File:"../../Desktop/pg_sources/main/postgres/src/backend/utils/resowner/resowner.c",
Line: 275, PID: 1533312
According to the comments of ResourceArrayAdd(), the caller must have
previously
done ResourceArrayEnlarge(). I tried to call ResourceOwnerEnlargeBuffers()
before
PinBuffer_Locked(), so it can avoid this crash.if ((buf_state & BM_DIRTY) == BM_DIRTY) { + /* make sure we can handle the pin */ + ResourceOwnerEnlargeBuffers(CurrentResourceOwner); + /* * Try once to flush the dirty buffer. */ PinBuffer_Locked(bufHdr);--
Regrads,
Japin Li.
thanks. tested flush pg_catalog, public schema, now, both works as pitched.
On Sat, Jul 1, 2023 at 6:09 AM Thomas Munro <thomas.munro@gmail.com> wrote:
It should be restricted by role, but I wonder which role it should be.
Testing for superuser is now out of fashion.
as pg_buffercache/pg_buffercache--1.2--1.3.sql. You need pg_maintain
privilege to use pg_buffercache.
The following query works on a single user. Obviously you need a role who
can gain pg_monitor privilege.
begin;
create role test login nosuperuser;
grant select, insert on onek to test;
grant pg_monitor to test;
set role test;
select count(*) from onek;
insert into onek values(default);
(SELECT count(*) FROM pg_buffercache WHERE relfilenode =
pg_relation_filenode('onek'::regclass))
except
(
select count(pg_buffercache_invalidate(bufferid))
from pg_buffercache where relfilenode =
pg_relation_filenode('onek'::regclass)
);
rollback;
+1 for the idea. It's going to be more useful to test and understand
the buffer management of PostgreSQL and it can be used to explicitly
free up the buffers if there are any such requirements.
I had a quick look over the patch. Following are the comments.
First, The TryInvalidateBuffer() tries to flush the buffer if it is
dirty and then tries to invalidate it if it meets the requirement.
Instead of directly doing this can we provide an option to the caller
to mention whether to invalidate the dirty buffers or not. For
example, TryInvalidateBuffer(Buffer bufnum, bool force), if the force
is set to FALSE, then ignore invalidating dirty buffers. Otherwise,
flush the dirty buffer and try to invalidate.
Second, In TryInvalidateBuffer(), it first checks if the reference
count is greater than zero and then checks for dirty buffers. Will
there be a scenario where the buffer is dirty and its reference count
is zero? Can you please provide more information on this or adjust the
code accordingly.
+/* +Try Invalidating a buffer using bufnum. +If the buffer is invalid, the function returns false. +The function checks for dirty buffer and flushes the dirty buffer before invalidating. +If the buffer is still dirty it returns false. +*/ +bool
The star(*) and space are missing here. Please refer to the style of
function comments and change accordingly.
Thanks & Regards,
Nitin Jadhav
On Fri, Jun 30, 2023 at 4:17 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:
Show quoted text
I hope this email finds you well. I am excited to share that I have
extended the functionality of the `pg_buffercache` extension by
implementing buffer invalidation capability, as requested by some
PostgreSQL contributors for improved testing scenarios.This marks my first time submitting a patch to pgsql-hackers, and I am
eager to receive your expert feedback on the changes made. Your
insights are invaluable, and any review or comments you provide will
be greatly appreciated.The primary objective of this enhancement is to enable explicit buffer
invalidation within the `pg_buffercache` extension. By doing so, we
can simulate scenarios where buffers are invalidated and observe the
resulting behavior in PostgreSQL.As part of this patch, a new function or mechanism has been introduced
to facilitate buffer invalidation. I would like to hear your thoughts
on whether this approach provides a good user interface for this
functionality. Additionally, I seek your evaluation of the buffer
locking protocol employed in the extension to ensure its correctness
and efficiency.Please note that I plan to add comprehensive documentation once the
details of this enhancement are agreed upon. This documentation will
serve as a valuable resource for users and contributors alike. I
believe that your expertise will help uncover any potential issues and
opportunities for further improvement.I have attached the patch file to this email for your convenience.
Your valuable time and consideration in reviewing this extension are
sincerely appreciated.Thank you for your continued support and guidance. I am looking
forward to your feedback and collaboration in enhancing the PostgreSQL
ecosystem.The working of the extension:
1. Creating the extension pg_buffercache and then call select query on
a table and note the buffer to be cleared.
pgbench=# create extension pg_buffercache;
CREATE EXTENSION
pgbench=# select count(*) from pgbench_accounts;
count
--------
100000
(1 row)pgbench=# SELECT *
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
bufferid | relfilenode | reltablespace | reldatabase | relforknumber
| relblocknumber | isdirty | usagecount | pinning_backends
----------+-------------+---------------+-------------+---------------+----------------+---------+------------+------------------
233 | 16397 | 1663 | 16384 | 0
| 0 | f | 1 | 0
234 | 16397 | 1663 | 16384 | 0
| 1 | f | 1 | 0
235 | 16397 | 1663 | 16384 | 0
| 2 | f | 1 | 0
236 | 16397 | 1663 | 16384 | 0
| 3 | f | 1 | 0
237 | 16397 | 1663 | 16384 | 0
| 4 | f | 1 | 02. Clearing a single buffer by entering the bufferid.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1660
(1 row)pgbench=# select pg_buffercache_invalidate(233);
pg_buffercache_invalidate
---------------------------
t
(1 row)pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)3. Clearing the entire buffer for a relation using the function.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
0
(1 row)Best regards,
Palak
Hey Nitin,
Will
there be a scenario where the buffer is dirty and its reference count
is zero?
There might be a buffer that has been dirtied but is not pinned or
being used currently by a process. So checking the refcount and then
dirty buffers helps.
First, The TryInvalidateBuffer() tries to flush the buffer if it is
dirty and then tries to invalidate it if it meets the requirement.
Instead of directly doing this can we provide an option to the caller
to mention whether to invalidate the dirty buffers or not.
Yes that can be implemented with a default value of force. Will
implement it in the next patch.
Show quoted text
On Wed, 5 Jul 2023 at 17:53, Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote:
+1 for the idea. It's going to be more useful to test and understand
the buffer management of PostgreSQL and it can be used to explicitly
free up the buffers if there are any such requirements.I had a quick look over the patch. Following are the comments.
First, The TryInvalidateBuffer() tries to flush the buffer if it is
dirty and then tries to invalidate it if it meets the requirement.
Instead of directly doing this can we provide an option to the caller
to mention whether to invalidate the dirty buffers or not. For
example, TryInvalidateBuffer(Buffer bufnum, bool force), if the force
is set to FALSE, then ignore invalidating dirty buffers. Otherwise,
flush the dirty buffer and try to invalidate.Second, In TryInvalidateBuffer(), it first checks if the reference
count is greater than zero and then checks for dirty buffers. Will
there be a scenario where the buffer is dirty and its reference count
is zero? Can you please provide more information on this or adjust the
code accordingly.+/* +Try Invalidating a buffer using bufnum. +If the buffer is invalid, the function returns false. +The function checks for dirty buffer and flushes the dirty buffer before invalidating. +If the buffer is still dirty it returns false. +*/ +boolThe star(*) and space are missing here. Please refer to the style of
function comments and change accordingly.Thanks & Regards,
Nitin JadhavOn Fri, Jun 30, 2023 at 4:17 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:I hope this email finds you well. I am excited to share that I have
extended the functionality of the `pg_buffercache` extension by
implementing buffer invalidation capability, as requested by some
PostgreSQL contributors for improved testing scenarios.This marks my first time submitting a patch to pgsql-hackers, and I am
eager to receive your expert feedback on the changes made. Your
insights are invaluable, and any review or comments you provide will
be greatly appreciated.The primary objective of this enhancement is to enable explicit buffer
invalidation within the `pg_buffercache` extension. By doing so, we
can simulate scenarios where buffers are invalidated and observe the
resulting behavior in PostgreSQL.As part of this patch, a new function or mechanism has been introduced
to facilitate buffer invalidation. I would like to hear your thoughts
on whether this approach provides a good user interface for this
functionality. Additionally, I seek your evaluation of the buffer
locking protocol employed in the extension to ensure its correctness
and efficiency.Please note that I plan to add comprehensive documentation once the
details of this enhancement are agreed upon. This documentation will
serve as a valuable resource for users and contributors alike. I
believe that your expertise will help uncover any potential issues and
opportunities for further improvement.I have attached the patch file to this email for your convenience.
Your valuable time and consideration in reviewing this extension are
sincerely appreciated.Thank you for your continued support and guidance. I am looking
forward to your feedback and collaboration in enhancing the PostgreSQL
ecosystem.The working of the extension:
1. Creating the extension pg_buffercache and then call select query on
a table and note the buffer to be cleared.
pgbench=# create extension pg_buffercache;
CREATE EXTENSION
pgbench=# select count(*) from pgbench_accounts;
count
--------
100000
(1 row)pgbench=# SELECT *
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
bufferid | relfilenode | reltablespace | reldatabase | relforknumber
| relblocknumber | isdirty | usagecount | pinning_backends
----------+-------------+---------------+-------------+---------------+----------------+---------+------------+------------------
233 | 16397 | 1663 | 16384 | 0
| 0 | f | 1 | 0
234 | 16397 | 1663 | 16384 | 0
| 1 | f | 1 | 0
235 | 16397 | 1663 | 16384 | 0
| 2 | f | 1 | 0
236 | 16397 | 1663 | 16384 | 0
| 3 | f | 1 | 0
237 | 16397 | 1663 | 16384 | 0
| 4 | f | 1 | 02. Clearing a single buffer by entering the bufferid.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1660
(1 row)pgbench=# select pg_buffercache_invalidate(233);
pg_buffercache_invalidate
---------------------------
t
(1 row)pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)3. Clearing the entire buffer for a relation using the function.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
0
(1 row)Best regards,
Palak
Can you please review the new patch of the extension with implemented
force variable.
On Tue, 11 Jul 2023 at 18:08, Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:
Show quoted text
Hey Nitin,
Will
there be a scenario where the buffer is dirty and its reference count
is zero?There might be a buffer that has been dirtied but is not pinned or
being used currently by a process. So checking the refcount and then
dirty buffers helps.First, The TryInvalidateBuffer() tries to flush the buffer if it is
dirty and then tries to invalidate it if it meets the requirement.
Instead of directly doing this can we provide an option to the caller
to mention whether to invalidate the dirty buffers or not.
Yes that can be implemented with a default value of force. Will
implement it in the next patch.On Wed, 5 Jul 2023 at 17:53, Nitin Jadhav <nitinjadhavpostgres@gmail.com> wrote:
+1 for the idea. It's going to be more useful to test and understand
the buffer management of PostgreSQL and it can be used to explicitly
free up the buffers if there are any such requirements.I had a quick look over the patch. Following are the comments.
First, The TryInvalidateBuffer() tries to flush the buffer if it is
dirty and then tries to invalidate it if it meets the requirement.
Instead of directly doing this can we provide an option to the caller
to mention whether to invalidate the dirty buffers or not. For
example, TryInvalidateBuffer(Buffer bufnum, bool force), if the force
is set to FALSE, then ignore invalidating dirty buffers. Otherwise,
flush the dirty buffer and try to invalidate.Second, In TryInvalidateBuffer(), it first checks if the reference
count is greater than zero and then checks for dirty buffers. Will
there be a scenario where the buffer is dirty and its reference count
is zero? Can you please provide more information on this or adjust the
code accordingly.+/* +Try Invalidating a buffer using bufnum. +If the buffer is invalid, the function returns false. +The function checks for dirty buffer and flushes the dirty buffer before invalidating. +If the buffer is still dirty it returns false. +*/ +boolThe star(*) and space are missing here. Please refer to the style of
function comments and change accordingly.Thanks & Regards,
Nitin JadhavOn Fri, Jun 30, 2023 at 4:17 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:I hope this email finds you well. I am excited to share that I have
extended the functionality of the `pg_buffercache` extension by
implementing buffer invalidation capability, as requested by some
PostgreSQL contributors for improved testing scenarios.This marks my first time submitting a patch to pgsql-hackers, and I am
eager to receive your expert feedback on the changes made. Your
insights are invaluable, and any review or comments you provide will
be greatly appreciated.The primary objective of this enhancement is to enable explicit buffer
invalidation within the `pg_buffercache` extension. By doing so, we
can simulate scenarios where buffers are invalidated and observe the
resulting behavior in PostgreSQL.As part of this patch, a new function or mechanism has been introduced
to facilitate buffer invalidation. I would like to hear your thoughts
on whether this approach provides a good user interface for this
functionality. Additionally, I seek your evaluation of the buffer
locking protocol employed in the extension to ensure its correctness
and efficiency.Please note that I plan to add comprehensive documentation once the
details of this enhancement are agreed upon. This documentation will
serve as a valuable resource for users and contributors alike. I
believe that your expertise will help uncover any potential issues and
opportunities for further improvement.I have attached the patch file to this email for your convenience.
Your valuable time and consideration in reviewing this extension are
sincerely appreciated.Thank you for your continued support and guidance. I am looking
forward to your feedback and collaboration in enhancing the PostgreSQL
ecosystem.The working of the extension:
1. Creating the extension pg_buffercache and then call select query on
a table and note the buffer to be cleared.
pgbench=# create extension pg_buffercache;
CREATE EXTENSION
pgbench=# select count(*) from pgbench_accounts;
count
--------
100000
(1 row)pgbench=# SELECT *
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
bufferid | relfilenode | reltablespace | reldatabase | relforknumber
| relblocknumber | isdirty | usagecount | pinning_backends
----------+-------------+---------------+-------------+---------------+----------------+---------+------------+------------------
233 | 16397 | 1663 | 16384 | 0
| 0 | f | 1 | 0
234 | 16397 | 1663 | 16384 | 0
| 1 | f | 1 | 0
235 | 16397 | 1663 | 16384 | 0
| 2 | f | 1 | 0
236 | 16397 | 1663 | 16384 | 0
| 3 | f | 1 | 0
237 | 16397 | 1663 | 16384 | 0
| 4 | f | 1 | 02. Clearing a single buffer by entering the bufferid.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1660
(1 row)pgbench=# select pg_buffercache_invalidate(233);
pg_buffercache_invalidate
---------------------------
t
(1 row)pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)3. Clearing the entire buffer for a relation using the function.
pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)pgbench=# select count(pg_buffercache_invalidate(bufferid)) from
pg_buffercache where relfilenode =
pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
1659
(1 row)pgbench=# SELECT count(*)
FROM pg_buffercache
WHERE relfilenode = pg_relation_filenode('pgbench_accounts'::regclass);
count
-------
0
(1 row)Best regards,
Palak
Attachments:
v2-0001-Invalidate-Buffer-By-Bufnum.patchapplication/octet-stream; name=v2-0001-Invalidate-Buffer-By-Bufnum.patchDownload
From 80165aa62e3ba54ac04f6465c9a39380c2932c25 Mon Sep 17 00:00:00 2001
From: Palak <palakchaturvedi2843@gmail.com>
Date: Fri, 30 Jun 2023 08:21:06 +0000
Subject: [PATCH v2] Invalidate Buffer By Bufnum
---
contrib/pg_buffercache/Makefile | 2 +-
contrib/pg_buffercache/meson.build | 1 +
.../pg_buffercache--1.4--1.5.sql | 6 +
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 34 ++++
src/backend/storage/buffer/bufmgr.c | 71 +++++++
src/include/storage/bufmgr.h | 3 +
v1-0001-Invalidate-Buffer-By-Bufnum.patch | 191 ++++++++++++++++++
8 files changed, 308 insertions(+), 2 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
create mode 100644 v1-0001-Invalidate-Buffer-By-Bufnum.patch
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d6b58d4da9..eae65ead9e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -8,7 +8,7 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
- pg_buffercache--1.3--1.4.sql
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/meson.build b/contrib/pg_buffercache/meson.build
index c51edf37d1..748463bc19 100644
--- a/contrib/pg_buffercache/meson.build
+++ b/contrib/pg_buffercache/meson.build
@@ -22,6 +22,7 @@ install_data(
'pg_buffercache--1.2--1.3.sql',
'pg_buffercache--1.2.sql',
'pg_buffercache--1.3--1.4.sql',
+ 'pg_buffercache--1.4--1.5.sql',
'pg_buffercache.control',
kwargs: contrib_data_args,
)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
new file mode 100644
index 0000000000..c96848c77d
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
@@ -0,0 +1,6 @@
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
+
+CREATE FUNCTION pg_buffercache_invalidate(IN int, IN bool default true)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
+LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index a82ae5f9bb..5ee875f77d 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 3316732365..93ee4ef724 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -64,6 +64,40 @@ PG_FUNCTION_INFO_V1(pg_buffercache_pages);
PG_FUNCTION_INFO_V1(pg_buffercache_summary);
PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
+Datum
+pg_buffercache_invalidate(PG_FUNCTION_ARGS)
+{
+ Buffer bufnum;
+ bool result;
+ bool force;
+
+ if (PG_ARGISNULL(0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum cannot be NULL")));
+ }
+
+ bufnum = PG_GETARG_INT32(0);
+ if (bufnum <= 0 || bufnum > NBuffers)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum is not valid")));
+
+ }
+
+ /*
+ * Check whether to force invalidate the dirty buffer. The default value of force is true.
+ */
+
+ force = PG_GETARG_BOOL(1);
+
+ result = TryInvalidateBuffer(bufnum, force);
+ PG_RETURN_BOOL(result);
+}
+
Datum
pg_buffercache_pages(PG_FUNCTION_ARGS)
{
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 3c59bbd04e..5b874a83ae 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -5592,3 +5592,74 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+/*
+ * Try Invalidating a buffer using bufnum.
+ * If the buffer is invalid, the function returns false.
+ * The function checks for dirty buffer and flushes the dirty buffer before invalidating.
+ * If the buffer is still dirty it returns false.
+ */
+bool
+TryInvalidateBuffer(Buffer bufnum, bool force)
+{
+ BufferDesc *bufHdr = GetBufferDescriptor(bufnum - 1);
+ uint32 buf_state;
+ ReservePrivateRefCountEntry();
+
+ buf_state = LockBufHdr(bufHdr);
+ if ((buf_state & BM_VALID) == BM_VALID)
+ {
+ /*
+ * The buffer is pinned therefore cannot invalidate.
+ */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+ if ((buf_state & BM_DIRTY) == BM_DIRTY)
+ {
+ /*
+ * If the buffer is dirty and the user has not asked to clear the dirty buffer return false.
+ * Otherwise clear the dirty buffer.
+ */
+ if(!force){
+ return false;
+ }
+ /*
+ * Try once to flush the dirty buffer.
+ */
+ ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
+ PinBuffer_Locked(bufHdr);
+ LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
+ FlushBuffer(bufHdr, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
+ UnpinBuffer(bufHdr);
+ buf_state = LockBufHdr(bufHdr);
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ /*
+ * If its dirty again or not valid anymore give up.
+ */
+
+ if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID))
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ }
+
+ InvalidateBuffer(bufHdr);
+ return true;
+ }
+ else
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+}
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 0f5fb6be00..1fdb7ae3f0 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -252,6 +252,9 @@ extern bool BgBufferSync(struct WritebackContext *wb_context);
extern void TestForOldSnapshot_impl(Snapshot snapshot, Relation relation);
+
+extern bool TryInvalidateBuffer(Buffer bufnum, bool force);
+
/* in buf_init.c */
extern void InitBufferPool(void);
extern Size BufferShmemSize(void);
diff --git a/v1-0001-Invalidate-Buffer-By-Bufnum.patch b/v1-0001-Invalidate-Buffer-By-Bufnum.patch
new file mode 100644
index 0000000000..7ee605f88e
--- /dev/null
+++ b/v1-0001-Invalidate-Buffer-By-Bufnum.patch
@@ -0,0 +1,191 @@
+From 02ea352d84ed87e156617de8b8020811680cb412 Mon Sep 17 00:00:00 2001
+From: Palak <palakchaturvedi2843@gmail.com>
+Date: Fri, 30 Jun 2023 08:21:06 +0000
+Subject: [PATCH v1] Invalidate Buffer By Bufnum
+
+---
+ contrib/pg_buffercache/Makefile | 2 +-
+ contrib/pg_buffercache/meson.build | 1 +
+ .../pg_buffercache--1.4--1.5.sql | 6 ++
+ contrib/pg_buffercache/pg_buffercache.control | 2 +-
+ contrib/pg_buffercache/pg_buffercache_pages.c | 27 ++++++++
+ src/backend/storage/buffer/bufmgr.c | 64 +++++++++++++++++++
+ src/include/storage/bufmgr.h | 3 +
+ 7 files changed, 103 insertions(+), 2 deletions(-)
+ create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
+
+diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
+index d6b58d4da9..eae65ead9e 100644
+--- a/contrib/pg_buffercache/Makefile
++++ b/contrib/pg_buffercache/Makefile
+@@ -8,7 +8,7 @@ OBJS = \
+ EXTENSION = pg_buffercache
+ DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
+ pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
+- pg_buffercache--1.3--1.4.sql
++ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
+ PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
+
+ REGRESS = pg_buffercache
+diff --git a/contrib/pg_buffercache/meson.build b/contrib/pg_buffercache/meson.build
+index c51edf37d1..748463bc19 100644
+--- a/contrib/pg_buffercache/meson.build
++++ b/contrib/pg_buffercache/meson.build
+@@ -22,6 +22,7 @@ install_data(
+ 'pg_buffercache--1.2--1.3.sql',
+ 'pg_buffercache--1.2.sql',
+ 'pg_buffercache--1.3--1.4.sql',
++ 'pg_buffercache--1.4--1.5.sql',
+ 'pg_buffercache.control',
+ kwargs: contrib_data_args,
+ )
+diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
+new file mode 100644
+index 0000000000..c7e47456d7
+--- /dev/null
++++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
+@@ -0,0 +1,6 @@
++\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
++
++CREATE FUNCTION pg_buffercache_invalidate(IN int)
++RETURNS bool
++AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
++LANGUAGE C PARALLEL SAFE;
+diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
+index a82ae5f9bb..5ee875f77d 100644
+--- a/contrib/pg_buffercache/pg_buffercache.control
++++ b/contrib/pg_buffercache/pg_buffercache.control
+@@ -1,5 +1,5 @@
+ # pg_buffercache extension
+ comment = 'examine the shared buffer cache'
+-default_version = '1.4'
++default_version = '1.5'
+ module_pathname = '$libdir/pg_buffercache'
+ relocatable = true
+diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
+index 3316732365..6dd02fe6af 100644
+--- a/contrib/pg_buffercache/pg_buffercache_pages.c
++++ b/contrib/pg_buffercache/pg_buffercache_pages.c
+@@ -64,6 +64,33 @@ PG_FUNCTION_INFO_V1(pg_buffercache_pages);
+ PG_FUNCTION_INFO_V1(pg_buffercache_summary);
+ PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+
++PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
++Datum
++pg_buffercache_invalidate(PG_FUNCTION_ARGS)
++{
++ Buffer bufnum;
++ bool result;
++
++ if (PG_ARGISNULL(0))
++ {
++ ereport(ERROR,
++ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
++ errmsg("buffernum cannot be NULL")));
++ }
++
++ bufnum = PG_GETARG_INT32(0);
++ if (bufnum < 0 || bufnum > NBuffers)
++ {
++ ereport(ERROR,
++ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
++ errmsg("buffernum is not valid")));
++
++ }
++
++ result = TryInvalidateBuffer(bufnum);
++ PG_RETURN_BOOL(result);
++}
++
+ Datum
+ pg_buffercache_pages(PG_FUNCTION_ARGS)
+ {
+diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
+index 3c59bbd04e..376afcf996 100644
+--- a/src/backend/storage/buffer/bufmgr.c
++++ b/src/backend/storage/buffer/bufmgr.c
+@@ -5592,3 +5592,67 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
+ (errcode(ERRCODE_SNAPSHOT_TOO_OLD),
+ errmsg("snapshot too old")));
+ }
++
++/*
++Try Invalidating a buffer using bufnum.
++If the buffer is invalid, the function returns false.
++The function checks for dirty buffer and flushes the dirty buffer before invalidating.
++If the buffer is still dirty it returns false.
++*/
++bool
++TryInvalidateBuffer(Buffer bufnum)
++{
++ BufferDesc *bufHdr = GetBufferDescriptor(bufnum - 1);
++ uint32 buf_state;
++
++ ReservePrivateRefCountEntry();
++
++ buf_state = LockBufHdr(bufHdr);
++ if ((buf_state & BM_VALID) == BM_VALID)
++ {
++ /*
++ * The buffer is pinned therefore cannot invalidate.
++ */
++ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
++ {
++ UnlockBufHdr(bufHdr, buf_state);
++ return false;
++ }
++ if ((buf_state & BM_DIRTY) == BM_DIRTY)
++ {
++ /*
++ * Try once to flush the dirty buffer.
++ */
++ PinBuffer_Locked(bufHdr);
++ LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
++ FlushBuffer(bufHdr, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
++ LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
++ UnpinBuffer(bufHdr);
++ buf_state = LockBufHdr(bufHdr);
++ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
++ {
++ UnlockBufHdr(bufHdr, buf_state);
++ return false;
++ }
++
++ /*
++ * If its dirty again or not valid anymore give up.
++ */
++
++ if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID))
++ {
++ UnlockBufHdr(bufHdr, buf_state);
++ return false;
++ }
++
++ }
++
++ InvalidateBuffer(bufHdr);
++ return true;
++ }
++ else
++ {
++ UnlockBufHdr(bufHdr, buf_state);
++ return false;
++ }
++}
+diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
+index 0f5fb6be00..4ba3d9089b 100644
+--- a/src/include/storage/bufmgr.h
++++ b/src/include/storage/bufmgr.h
+@@ -252,6 +252,9 @@ extern bool BgBufferSync(struct WritebackContext *wb_context);
+
+ extern void TestForOldSnapshot_impl(Snapshot snapshot, Relation relation);
+
++
++extern bool TryInvalidateBuffer(Buffer bufnum);
++
+ /* in buf_init.c */
+ extern void InitBufferPool(void);
+ extern Size BufferShmemSize(void);
+--
+2.25.1
+
--
2.25.1
Hi,
I wanted this feature a couple times before...
On 2023-07-03 13:56:29 +0530, Palak Chaturvedi wrote:
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate); +Datum +pg_buffercache_invalidate(PG_FUNCTION_ARGS)
I don't think "invalidating" is the right terminology. Note that we already
have InvalidateBuffer() - but it's something we can't allow users to do, as it
throws away dirty buffer contents (it's used for things like dropping a
table).
How about using "discarding" for this functionality?
Using the buffer ID as the identifier doesn't seem great, because what that
buffer is used for, could have changed since the buffer ID has been acquired
(via the pg_buffercache view presumably)?
My suspicion is that the usual usecase for this would be to drop all buffers
that can be dropped?
+ if (bufnum < 0 || bufnum > NBuffers) + { + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("buffernum is not valid"))); + + } + + result = TryInvalidateBuffer(bufnum); + PG_RETURN_BOOL(result); +}
I think this should be restricted to superuser by default (by revoking
permissions from PUBLIC). We allow normal users to use pg_prewarm(...), true -
but we perform an ACL check on the relation, so it can only be used for
relations you have access too. This function could be used to affect
performance of other users quite substantially.
+/* +Try Invalidating a buffer using bufnum. +If the buffer is invalid, the function returns false. +The function checks for dirty buffer and flushes the dirty buffer before invalidating. +If the buffer is still dirty it returns false. +*/ +bool +TryInvalidateBuffer(Buffer bufnum) +{ + BufferDesc *bufHdr = GetBufferDescriptor(bufnum - 1); + uint32 buf_state; + + ReservePrivateRefCountEntry(); + + buf_state = LockBufHdr(bufHdr); + if ((buf_state & BM_VALID) == BM_VALID) + { + /* + * The buffer is pinned therefore cannot invalidate. + */ + if (BUF_STATE_GET_REFCOUNT(buf_state) > 0) + { + UnlockBufHdr(bufHdr, buf_state); + return false; + } + if ((buf_state & BM_DIRTY) == BM_DIRTY) + { + /* + * Try once to flush the dirty buffer. + */ + PinBuffer_Locked(bufHdr); + LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED); + FlushBuffer(bufHdr, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL); + LWLockRelease(BufferDescriptorGetContentLock(bufHdr)); + UnpinBuffer(bufHdr); + buf_state = LockBufHdr(bufHdr); + if (BUF_STATE_GET_REFCOUNT(buf_state) > 0) + { + UnlockBufHdr(bufHdr, buf_state); + return false; + } + + /* + * If its dirty again or not valid anymore give up. + */ + + if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID)) + { + UnlockBufHdr(bufHdr, buf_state); + return false; + } + + } + + InvalidateBuffer(bufHdr);
I'm wary of using InvalidateBuffer() here, it's typically used for different
purposes, including throwing valid contents away. That seems a bit scary.
I think you should be able to just use InvalidateVictimBuffer()?
Greetings,
Andres Freund
On Wed, Jul 19, 2023 at 12:45 PM Andres Freund <andres@anarazel.de> wrote:
I don't think "invalidating" is the right terminology. Note that we already
have InvalidateBuffer() - but it's something we can't allow users to do, as it
throws away dirty buffer contents (it's used for things like dropping a
table).How about using "discarding" for this functionality?
+1
Using the buffer ID as the identifier doesn't seem great, because what that
buffer is used for, could have changed since the buffer ID has been acquired
(via the pg_buffercache view presumably)?My suspicion is that the usual usecase for this would be to drop all buffers
that can be dropped?
Well the idea was to be able to drop less than everything. Instead of
having to bike-shed what the user interface should look like to
specify what subset of everything you want to drop, you can just write
SQL queries (mostly likely involving the pg_buffercache view, indeed).
It's true that buffer IDs can change underneath your feet between
SELECT and discard, but the whole concept is inherently racy like
that. Suppose we instead had pg_unwarm('my_table') or whatever
instead. Immediately after it runs and before it even returns, some
blocks of my_table can finish up coming back into the pool. It's also
interesting to be able to kick individual pages out when testing code
that caches buffers IDs for ReadRecentBuffer(), and other buffer-pool
work. Hence desire to not try to be clever at all here, and just come
up with the absolute bare minimum thing that can kick buffers out by
ID and leave the rest up to hackers/experts who are willing and able
to write queries to supply them. You can still drop everything that
can be dropped -- generate_series. Or whatever you want.
Hello
I had a look at the patch and tested it on CI bot, it compiles and tests fine both autoconf and meson. I noticed that the v2 patch contains the v1 patch file as well. Not sure if intended but put there my accident.
I don't think "invalidating" is the right terminology. Note that we already
have InvalidateBuffer() - but it's something we can't allow users to do, as it
throws away dirty buffer contents (it's used for things like dropping a
table).How about using "discarding" for this functionality?
I think "invalidating" is the right terminology here, it is exactly what the feature is doing, it tries to invalidate a buffer ID by calling InvalidateBuffer() routine inside buffer manager and calls FlushBuffer() before invalidating if marked dirty.
The problem here is that InvalidateBuffer() could be dangerous because it allows a user to invalidate buffer that may have data in other tables not owned by the current user,
I think it all comes down to the purpose of this feature. Based on the description in this email thread, I feel like this feature should be categorized as a developer-only feature, to be used by PG developer to experiment and observe some development works by invalidating one more more specific buffers..... If this is the case, it may be helpful to add a "DEVELOPER_OPTIONS" in GUC, which allows or disallows the TryInvalidateBuffer() to run or to return error if user does not have this developer option enabled.
If the purpose of this feature is for general users, then it would make sense to have something like pg_unwarm (exactly opposite of pg_prewarm) that takes table name (instead of buffer ID) and drop all buffers associated with that table name. There will be permission checks as well so a user cannot pg_unwarm a table owned by someone else. User in this case won't be able to invalidate a particular buffer, but he/she should not have to as a regular user anyway.
thanks!
Cary Huang
-------------
HighGo Software Inc. (Canada)
cary.huang@highgo.ca
www.highgo.ca
Hii,
Thanks for your feedback. We have decided to add a role for the
extension to solve that problem.
And concerning to pg_unwarm table I think we can create a new function
to do that but I think a general user would not require to clear a
table from buffercache.
We can use bufferid and where statements to do the same if a
superuser/(specific role) requests it.
Thanks.
Show quoted text
On Sat, 29 Jul 2023 at 02:55, Cary Huang <cary.huang@highgo.ca> wrote:
Hello
I had a look at the patch and tested it on CI bot, it compiles and tests fine both autoconf and meson. I noticed that the v2 patch contains the v1 patch file as well. Not sure if intended but put there my accident.
I don't think "invalidating" is the right terminology. Note that we already
have InvalidateBuffer() - but it's something we can't allow users to do, as it
throws away dirty buffer contents (it's used for things like dropping a
table).How about using "discarding" for this functionality?
I think "invalidating" is the right terminology here, it is exactly what the feature is doing, it tries to invalidate a buffer ID by calling InvalidateBuffer() routine inside buffer manager and calls FlushBuffer() before invalidating if marked dirty.
The problem here is that InvalidateBuffer() could be dangerous because it allows a user to invalidate buffer that may have data in other tables not owned by the current user,
I think it all comes down to the purpose of this feature. Based on the description in this email thread, I feel like this feature should be categorized as a developer-only feature, to be used by PG developer to experiment and observe some development works by invalidating one more more specific buffers..... If this is the case, it may be helpful to add a "DEVELOPER_OPTIONS" in GUC, which allows or disallows the TryInvalidateBuffer() to run or to return error if user does not have this developer option enabled.
If the purpose of this feature is for general users, then it would make sense to have something like pg_unwarm (exactly opposite of pg_prewarm) that takes table name (instead of buffer ID) and drop all buffers associated with that table name. There will be permission checks as well so a user cannot pg_unwarm a table owned by someone else. User in this case won't be able to invalidate a particular buffer, but he/she should not have to as a regular user anyway.
thanks!
Cary Huang
-------------
HighGo Software Inc. (Canada)
cary.huang@highgo.ca
www.highgo.ca
Le 01/07/2023 à 00:09, Thomas Munro a écrit :
On Fri, Jun 30, 2023 at 10:47 PM Palak Chaturvedi
<chaturvedipalak1911@gmail.com> wrote:
We also talked a bit about how one might control the kernel page cache
in more fine-grained ways for testing purposes, but it seems like the
pgfincore project has that covered with its pgfadvise_willneed() and
pgfadvise_dontneed(). IMHO that project could use more page-oriented
operations (instead of just counts and coarse grains operations) but
that's something that could be material for patches to send to the
extension maintainers. This work, in contrast, is more tangled up
with bufmgr.c internals, so it feels like this feature belongs in a
core contrib module.
Precisely what pgfincore is doing/offering already.
Happy to propose to postgresql tree if there are interest. Next step for
pgfincore is to add cachestat() syscall and evaluates benefits for
PostgreSQL cost estimators of this new call.
Here an example to achieve the warm/unwarm, each bit is a PostgreSQL
page, so here we warm cache with the first 3 and remove the last 3 from
cache (system cache, not shared buffers).
-- Loading and Unloading
cedric=# select * from pgfadvise_loader('pgbench_accounts', 0, true,
true, B'111000');
relpath | os_page_size | os_pages_free | pages_loaded |
pages_unloaded
------------------+--------------+---------------+--------------+----------------
base/11874/16447 | 4096 | 408376 | 3 |
3
---
Cédric Villemain +33 (0)6 20 30 22 52
https://Data-Bene.io
PostgreSQL Expertise, Support, Training, R&D
Hi Palak,
I did a quick review of the patch:
+CREATE FUNCTION pg_buffercache_invalidate(IN int, IN bool default true)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
+LANGUAGE C PARALLEL SAFE;
--> Not enforced anywhere, but you can also add a comment to the
function, for end users...
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
+Datum
+pg_buffercache_invalidate(PG_FUNCTION_ARGS)
+{
+ Buffer bufnum;
"Buffer blocknum" is not correct in this context I believe. Buffer is
when you have to manage Local buffer too (negative number).
Here uint32 is probably the good choice at the end, as used in
pg_buffercache in other places.
Also in this extension bufferid is used, not buffernum.
+ bufnum = PG_GETARG_INT32(0);
+ if (bufnum <= 0 || bufnum > NBuffers)
maybe have a look at pageinspect and its PG_GETARG_UINT32.
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum is not valid")));
https://www.postgresql.org/docs/16/error-style-guide.html let me think
that message like 'buffernum is not valid' can be enhanced: out of
range, cannot be negative or exceed number of shared buffers.... ? Maybe
add the value to the message.
+
+ }
+
+ /*
+ * Check whether to force invalidate the dirty buffer. The default
value of force is true.
+ */
+
+ force = PG_GETARG_BOOL(1);
I think you also need to test PG_ARGISNULL with force parameter.
+/*
+ * Try Invalidating a buffer using bufnum.
+ * If the buffer is invalid, the function returns false.
+ * The function checks for dirty buffer and flushes the dirty buffer
before invalidating.
+ * If the buffer is still dirty it returns false.
+ */
+bool
+TryInvalidateBuffer(Buffer bufnum, bool force)
+{
+ BufferDesc *bufHdr = GetBufferDescriptor(bufnum - 1);
this is not safe, GetBufferDescriptor() accepts uint, but can receive
negative here. Use uint32 and bufferid.
+ uint32 buf_state;
+ ReservePrivateRefCountEntry();
+
+ buf_state = LockBufHdr(bufHdr);
+ if ((buf_state & BM_VALID) == BM_VALID)
+ {
+ /*
+ * The buffer is pinned therefore cannot invalidate.
+ */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+ if ((buf_state & BM_DIRTY) == BM_DIRTY)
+ {
+ /*
+ * If the buffer is dirty and the user has not asked to
clear the dirty buffer return false.
+ * Otherwise clear the dirty buffer.
+ */
+ if(!force){
+ return false;
probably need to unlockbuffer here too.
+ }
+ /*
+ * Try once to flush the dirty buffer.
+ */
+ ResourceOwnerEnlargeBuffers(CurrentResourceOwner);
+ PinBuffer_Locked(bufHdr);
+ LWLockAcquire(BufferDescriptorGetContentLock(bufHdr),
LW_SHARED);
+ FlushBuffer(bufHdr, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
+ UnpinBuffer(bufHdr);
I am unsure of this area (the code is correct, but I wonder why there is
no static code for this part -from pin to unpin- in PostgreSQL), and
maybe better to go with FlushOneBuffer() ?
Also it is probably required to account for the shared buffer eviction
in some pg_stat* view or table.
Not sure how disk syncing is handled after this sequence nor if it's
important ?
+ buf_state = LockBufHdr(bufHdr);
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ /*
+ * If its dirty again or not valid anymore give up.
+ */
+
+ if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID))
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ }
+
+ InvalidateBuffer(bufHdr);
+ return true;
+ }
+ else
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
Maybe safe to remove the else {} ...
Maybe more tempting to start the big if with the following instead less
nested...):
+ if ((buf_state & BM_VALID) != BM_VALID)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
Doc and test are absent.
---
Cédric Villemain +33 (0)6 20 30 22 52
https://Data-Bene.io
PostgreSQL Expertise, Support, Training, R&D
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">On 1/3/24 10:25 AM, Cédric Villemain
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:8b8c709a-c7cc-4965-8296-64f549c27501@abcsql.com">Hi
Palak,
<br>
<br>
I did a quick review of the patch:
<br>
<br>
+CREATE FUNCTION pg_buffercache_invalidate(IN int, IN bool default
true)
<br>
+RETURNS bool
<br>
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
<br>
+LANGUAGE C PARALLEL SAFE;
<br>
<br>
--> Not enforced anywhere, but you can also add a comment to
the function, for end users...
<br>
</blockquote>
<p>The arguments should also have names...</p>
<blockquote type="cite"
cite="mid:8b8c709a-c7cc-4965-8296-64f549c27501@abcsql.com"><br>
+ force = PG_GETARG_BOOL(1);
<br>
<br>
I think you also need to test PG_ARGISNULL with force parameter.
<br>
</blockquote>
Actually, that's true for the first argument as well. Or, just mark
the function as STRICT.<br>
<pre class="moz-signature" cols="72">--
Jim Nasby, Data Architect, Austin TX</pre>
</body>
</html>
Hi Palak,
there is currently even more interest in your patch as it should help
building tests for on-going development around cache/read
management/effects.
Do you expect to be able to follow-up in the coming future ?
Thank you,
Cédric
On 04/01/2024 00:15, Jim Nasby wrote:
On 1/3/24 10:25 AM, Cédric Villemain wrote:
Hi Palak,
I did a quick review of the patch:
+CREATE FUNCTION pg_buffercache_invalidate(IN int, IN bool default true) +RETURNS bool +AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate' +LANGUAGE C PARALLEL SAFE;--> Not enforced anywhere, but you can also add a comment to the
function, for end users...The arguments should also have names...
+ force = PG_GETARG_BOOL(1);
I think you also need to test PG_ARGISNULL with force parameter.
Actually, that's true for the first argument as well. Or, just mark the
function as STRICT.--
Jim Nasby, Data Architect, Austin TX
--
---
Cédric Villemain +33 (0)6 20 30 22 52
https://Data-Bene.io
PostgreSQL Expertise, Support, Training, R&D
[Sorry to those who received this message twice -- the first time got
bounced by the list because of a defunct email address in the CC
list.]
Here is a rebase of Palak's v2 patch. I didn't change anything except
for the required resource manager API change, a pgindent run, and
removal of a stray file, and there is still some feedback to be
addressed before we can get this in, but I wanted to fix the bitrot
and re-open this CF item because this is very useful work. It's
essential for testing the prefetching-related stuff happening in
various other threads, where you want to be able to get the buffer
pool into various interesting states.
Attachments:
v3-0001-Invalidate-Buffer-By-Bufnum.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Invalidate-Buffer-By-Bufnum.patchDownload
From 911776e07a767e8775f9e43948f4145f72d1de65 Mon Sep 17 00:00:00 2001
From: Palak <palakchaturvedi2843@gmail.com>
Date: Fri, 30 Jun 2023 08:21:06 +0000
Subject: [PATCH v3] Invalidate Buffer By Bufnum
---
contrib/pg_buffercache/Makefile | 2 +-
contrib/pg_buffercache/meson.build | 1 +
.../pg_buffercache--1.4--1.5.sql | 6 ++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 35 +++++++++
src/backend/storage/buffer/bufmgr.c | 74 +++++++++++++++++++
src/include/storage/bufmgr.h | 2 +
7 files changed, 120 insertions(+), 2 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d6b58d4da94..eae65ead9e5 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -8,7 +8,7 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
- pg_buffercache--1.3--1.4.sql
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/meson.build b/contrib/pg_buffercache/meson.build
index c86e33cc958..1ca3452918b 100644
--- a/contrib/pg_buffercache/meson.build
+++ b/contrib/pg_buffercache/meson.build
@@ -22,6 +22,7 @@ install_data(
'pg_buffercache--1.2--1.3.sql',
'pg_buffercache--1.2.sql',
'pg_buffercache--1.3--1.4.sql',
+ 'pg_buffercache--1.4--1.5.sql',
'pg_buffercache.control',
kwargs: contrib_data_args,
)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
new file mode 100644
index 00000000000..c96848c77d8
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
@@ -0,0 +1,6 @@
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
+
+CREATE FUNCTION pg_buffercache_invalidate(IN int, IN bool default true)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
+LANGUAGE C PARALLEL SAFE;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index a82ae5f9bb5..5ee875f77dd 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 33167323653..485cd988dd4 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -64,6 +64,41 @@ PG_FUNCTION_INFO_V1(pg_buffercache_pages);
PG_FUNCTION_INFO_V1(pg_buffercache_summary);
PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
+Datum
+pg_buffercache_invalidate(PG_FUNCTION_ARGS)
+{
+ Buffer bufnum;
+ bool result;
+ bool force;
+
+ if (PG_ARGISNULL(0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum cannot be NULL")));
+ }
+
+ bufnum = PG_GETARG_INT32(0);
+ if (bufnum <= 0 || bufnum > NBuffers)
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("buffernum is not valid")));
+
+ }
+
+ /*
+ * Check whether to force invalidate the dirty buffer. The default value
+ * of force is true.
+ */
+
+ force = PG_GETARG_BOOL(1);
+
+ result = TryInvalidateBuffer(bufnum, force);
+ PG_RETURN_BOOL(result);
+}
+
Datum
pg_buffercache_pages(PG_FUNCTION_ARGS)
{
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index bdf89bbc4dc..1d0f185edbc 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -5705,3 +5705,77 @@ ResOwnerPrintBufferPin(Datum res)
{
return DebugPrintBufferRefcount(DatumGetInt32(res));
}
+
+/*
+ * Try Invalidating a buffer using bufnum.
+ * If the buffer is invalid, the function returns false.
+ * The function checks for dirty buffer and flushes the dirty buffer before invalidating.
+ * If the buffer is still dirty it returns false.
+ */
+bool
+TryInvalidateBuffer(Buffer bufnum, bool force)
+{
+ BufferDesc *bufHdr = GetBufferDescriptor(bufnum - 1);
+ uint32 buf_state;
+
+ ReservePrivateRefCountEntry();
+
+ buf_state = LockBufHdr(bufHdr);
+ if ((buf_state & BM_VALID) == BM_VALID)
+ {
+ /*
+ * The buffer is pinned therefore cannot invalidate.
+ */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+ if ((buf_state & BM_DIRTY) == BM_DIRTY)
+ {
+ /*
+ * If the buffer is dirty and the user has not asked to clear the
+ * dirty buffer return false. Otherwise clear the dirty buffer.
+ */
+ if (!force)
+ {
+ return false;
+ }
+
+ /*
+ * Try once to flush the dirty buffer.
+ */
+ ResourceOwnerEnlarge(CurrentResourceOwner);
+ PinBuffer_Locked(bufHdr);
+ LWLockAcquire(BufferDescriptorGetContentLock(bufHdr), LW_SHARED);
+ FlushBuffer(bufHdr, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(bufHdr));
+ UnpinBuffer(bufHdr);
+ buf_state = LockBufHdr(bufHdr);
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ /*
+ * If its dirty again or not valid anymore give up.
+ */
+
+ if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID))
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+
+ }
+
+ InvalidateBuffer(bufHdr);
+ return true;
+ }
+ else
+ {
+ UnlockBufHdr(bufHdr, buf_state);
+ return false;
+ }
+}
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index d51d46d3353..a7d2dd2de0a 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -250,6 +250,8 @@ extern bool HoldingBufferPinThatDelaysRecovery(void);
extern bool BgBufferSync(struct WritebackContext *wb_context);
+extern bool TryInvalidateBuffer(Buffer bufnum, bool force);
+
/* in buf_init.c */
extern void InitBufferPool(void);
extern Size BufferShmemSize(void);
--
2.39.2
Import Notes
Reply to msg id not found: CALDaNm20+pri5yvEF-4HZqCwRjHB+qjWHLcfanyEjnwgF96z6Q@mail.gmail.com
Quite an interesting patch, in my opinion. I've decided to work on it a
bit, did some refactoring (sorry) and add
basic tests. Also, I try to take into account as much as possible notes on
the patch, mentioned by Cédric Villemain.
and maybe better to go with FlushOneBuffer() ?
It's a good idea, but I'm not sure at the moment. I'll try to dig some
deeper into it. At least, FlushOneBuffer does
not work for a local buffers. So, we have to decide whatever
pg_buffercache_invalidate should or should not
work for local buffers. For now, I don't see why it should not. Maybe I
miss something?
--
Best regards,
Maxim Orlov.
Attachments:
v4-0001-Invalidate-Buffer-By-Bufnum.patchapplication/octet-stream; name=v4-0001-Invalidate-Buffer-By-Bufnum.patchDownload
From 9b7e523bd4960d42847b849cde226f7d97deb1b9 Mon Sep 17 00:00:00 2001
From: Palak <palakchaturvedi2843@gmail.com>
Date: Fri, 30 Jun 2023 08:21:06 +0000
Subject: [PATCH v4] Invalidate Buffer By Bufnum
---
contrib/pg_buffercache/Makefile | 2 +-
.../expected/pg_buffercache.out | 31 ++++++++
contrib/pg_buffercache/meson.build | 1 +
.../pg_buffercache--1.4--1.5.sql | 6 ++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 16 ++++
contrib/pg_buffercache/sql/pg_buffercache.sql | 9 +++
src/backend/storage/buffer/bufmgr.c | 74 +++++++++++++++++++
src/include/storage/bufmgr.h | 2 +
9 files changed, 141 insertions(+), 2 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d6b58d4da9..eae65ead9e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -8,7 +8,7 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
- pg_buffercache--1.3--1.4.sql
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/expected/pg_buffercache.out b/contrib/pg_buffercache/expected/pg_buffercache.out
index b745dc69ea..75d6f41164 100644
--- a/contrib/pg_buffercache/expected/pg_buffercache.out
+++ b/contrib/pg_buffercache/expected/pg_buffercache.out
@@ -55,3 +55,34 @@ SELECT count(*) > 0 FROM pg_buffercache_usage_counts();
t
(1 row)
+-- Check pg_buffercache_invalidate
+SELECT pg_buffercache_invalidate(NULL, NULL);
+ pg_buffercache_invalidate
+---------------------------
+
+(1 row)
+
+SELECT pg_buffercache_invalidate(1, NULL);
+ pg_buffercache_invalidate
+---------------------------
+
+(1 row)
+
+SELECT pg_buffercache_invalidate(NULL, false);
+ pg_buffercache_invalidate
+---------------------------
+
+(1 row)
+
+SELECT pg_buffercache_invalidate(NULL);
+ pg_buffercache_invalidate
+---------------------------
+
+(1 row)
+
+SELECT pg_buffercache_invalidate(1);
+ pg_buffercache_invalidate
+---------------------------
+ t
+(1 row)
+
diff --git a/contrib/pg_buffercache/meson.build b/contrib/pg_buffercache/meson.build
index c86e33cc95..1ca3452918 100644
--- a/contrib/pg_buffercache/meson.build
+++ b/contrib/pg_buffercache/meson.build
@@ -22,6 +22,7 @@ install_data(
'pg_buffercache--1.2--1.3.sql',
'pg_buffercache--1.2.sql',
'pg_buffercache--1.3--1.4.sql',
+ 'pg_buffercache--1.4--1.5.sql',
'pg_buffercache.control',
kwargs: contrib_data_args,
)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
new file mode 100644
index 0000000000..c582260f50
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
@@ -0,0 +1,6 @@
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
+
+CREATE FUNCTION pg_buffercache_invalidate(IN int, IN bool default true)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
+LANGUAGE C PARALLEL SAFE STRICT;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index a82ae5f9bb..5ee875f77d 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 3316732365..b010f13c98 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -63,6 +63,7 @@ typedef struct
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
PG_FUNCTION_INFO_V1(pg_buffercache_summary);
PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
Datum
pg_buffercache_pages(PG_FUNCTION_ARGS)
@@ -347,3 +348,18 @@ pg_buffercache_usage_counts(PG_FUNCTION_ARGS)
return (Datum) 0;
}
+
+/*
+ * Perform buffer invaidation.
+ */
+Datum
+pg_buffercache_invalidate(PG_FUNCTION_ARGS)
+{
+ Buffer buf = PG_GETARG_INT32(0);
+ bool force = PG_GETARG_BOOL(1);
+
+ if (buf > NBuffers || buf < -NLocBuffer || BufferIsInvalid(buf))
+ elog(ERROR, "bad buffer ID: %d", buf);
+
+ PG_RETURN_BOOL(TryInvalidateBuffer(buf, force));
+}
diff --git a/contrib/pg_buffercache/sql/pg_buffercache.sql b/contrib/pg_buffercache/sql/pg_buffercache.sql
index 944fbb1bea..f6671b96cc 100644
--- a/contrib/pg_buffercache/sql/pg_buffercache.sql
+++ b/contrib/pg_buffercache/sql/pg_buffercache.sql
@@ -26,3 +26,12 @@ SET ROLE pg_monitor;
SELECT count(*) > 0 FROM pg_buffercache;
SELECT buffers_used + buffers_unused > 0 FROM pg_buffercache_summary();
SELECT count(*) > 0 FROM pg_buffercache_usage_counts();
+
+-- Check pg_buffercache_invalidate
+SELECT pg_buffercache_invalidate(NULL, NULL);
+SELECT pg_buffercache_invalidate(1, NULL);
+SELECT pg_buffercache_invalidate(NULL, false);
+SELECT pg_buffercache_invalidate(NULL);
+
+
+SELECT pg_buffercache_invalidate(1);
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index f0f8d4259c..4db993d1a6 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -5704,3 +5704,77 @@ ResOwnerPrintBufferPin(Datum res)
{
return DebugPrintBufferRefcount(DatumGetInt32(res));
}
+
+/*
+ * Try Invalidating a shared buffer.
+ *
+ * If the buffer is invalid, the function returns false. The function checks
+ * for dirty buffer and flushes the dirty buffer before invalidating. If the
+ * buffer is still dirty it returns false.
+ */
+bool
+TryInvalidateBuffer(Buffer buf, bool force)
+{
+ BufferDesc *desc;
+ uint32 buf_state;
+
+ ReservePrivateRefCountEntry();
+
+ desc = BufferIsLocal(buf) ? GetLocalBufferDescriptor(-buf - 1) :
+ GetBufferDescriptor(buf - 1);
+
+ buf_state = LockBufHdr(desc);
+ if ((buf_state & BM_VALID) != BM_VALID)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ /* The buffer is pinned therefore cannot invalidate. */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ if ((buf_state & BM_DIRTY) == BM_DIRTY)
+ {
+ /*
+ * If the buffer is dirty and the user has not asked to clear the
+ * dirty buffer return false. Otherwise clear the dirty buffer.
+ */
+ if (!force)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ /* Try once to flush the dirty buffer. */
+ ResourceOwnerEnlarge(CurrentResourceOwner);
+ PinBuffer_Locked(desc);
+ LWLockAcquire(BufferDescriptorGetContentLock(desc), LW_SHARED);
+ FlushBuffer(desc, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(desc));
+
+ UnpinBuffer(desc);
+
+ /* If buffer is still used by someone, return false. */
+ buf_state = LockBufHdr(desc);
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ /* If its dirty again or not valid anymore give up. */
+ if ((buf_state & (BM_DIRTY | BM_VALID)) != (BM_VALID))
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ }
+
+ InvalidateBuffer(desc); /* releases spinlock */
+ return true;
+}
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index d51d46d335..a0069b0e0e 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -250,6 +250,8 @@ extern bool HoldingBufferPinThatDelaysRecovery(void);
extern bool BgBufferSync(struct WritebackContext *wb_context);
+extern bool TryInvalidateBuffer(Buffer buf, bool force);
+
/* in buf_init.c */
extern void InitBufferPool(void);
extern Size BufferShmemSize(void);
--
2.43.0
On Fri, Mar 8, 2024 at 6:20 AM Maxim Orlov <orlovmg@gmail.com> wrote:
Quite an interesting patch, in my opinion. I've decided to work on it a bit, did some refactoring (sorry) and add
basic tests. Also, I try to take into account as much as possible notes on the patch, mentioned by Cédric Villemain.
Thanks! Unfortunately I don't think it's possible to include a
regression test that looks at the output, because it'd be
non-deterministic. Any other backend could pin or dirty the buffer
you try to evict, changing the behaviour.
and maybe better to go with FlushOneBuffer() ?
It's a good idea, but I'm not sure at the moment. I'll try to dig some deeper into it. At least, FlushOneBuffer does
not work for a local buffers. So, we have to decide whatever pg_buffercache_invalidate should or should not
work for local buffers. For now, I don't see why it should not. Maybe I miss something?
I think it's OK to ignore local buffers for now. pg_buffercache
generally doesn't support/show them so I don't feel inclined to
support them for this. I removed a few traces of local support.
It didn't seem appropriate to use the pg_monitor role for this, so I
made it superuser-only. I don't think it makes much sense to use this
on any kind of production system so I don't think we need a new role
for it, and existing roles don't seem too appropriate. pageinspect et
al use the same approach.
I added a VOLATILE qualifier to the function.
I added some documentation.
I changed the name to pg_buffercache_evict().
I got rid of the 'force' flag which was used to say 'I only want to
evict this buffer it is clean'. I don't really see the point in that,
we might as well keep it simple. You could filter buffers on
"isdirty" if you want.
I added comments to scare anyone off using EvictBuffer() for anything
much, and marking it as something for developer convenience. (I am
aware of an experimental patch that uses this same function as part of
a buffer pool resizing operation, but that has other infrastructure to
make that safe and would adjust those remarks accordingly.)
I wondered whether it should really be testing for BM_TAG_VALID
rather than BM_VALID. Arguably, but it doesn't seem important for
now. The distinction would arise if someone had tried to read in a
buffer, got an I/O error and abandoned ship, leaving a buffer with a
valid tag but not valid contents. Anyone who tries to ReadBuffer() it
will then try to read it again, but in the meantime this function
won't be able to evict it (it'll just return false). Doesn't seem
that obvious to me that this obscure case needs to be handled. That
doesn't happen *during* a non-error case, because then it's pinned and
we already return false in this code for pins.
I contemplated whether InvalidateBuffer() or InvalidateVictimBuffer()
would be better here and realised that Andres's intuition was probably
right when he suggested the latter up-thread. It is designed with the
right sort of arbitrary concurrent activity in mind, where the former
assumes things about locking and dropping, which could get us into
trouble if not now maybe in the future.
I ran the following diabolical buffer blaster loop while repeatedly
running installcheck:
do
$$
begin
loop
perform pg_buffercache_evict(bufferid)
from pg_buffercache
where random() <= 0.25;
end loop;
End;
$$;
The only ill-effect was a hot laptop.
Thoughts, objections, etc?
Very simple example of use:
create or replace function uncache_relation(name text)
returns boolean
begin atomic;
select bool_and(pg_buffercache_evict(bufferid))
from pg_buffercache
where reldatabase = (select oid
from pg_database
where datname = current_database())
and relfilenode = pg_relation_filenode(name);
end;
More interesting for those of us hacking on streaming I/O stuff was
the ability to evict just parts of things and see how the I/O merging
and I/O depth react.
Attachments:
v5-0001-Add-pg_buffercache_evict-function-for-testing.patchapplication/octet-stream; name=v5-0001-Add-pg_buffercache_evict-function-for-testing.patchDownload
From 8d6f0abb79e5986a77d2ba90a8e8b9d7b2be36af Mon Sep 17 00:00:00 2001
From: Palak <palakchaturvedi2843@gmail.com>
Date: Fri, 30 Jun 2023 08:21:06 +0000
Subject: [PATCH v5] Add pg_buffercache_evict() function for testing.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When testing buffer pool logic, it is very useful to be able to evict
arbitrary pages. It's a pretty crude tool but can be used in SQL
queries on the pg_buffercache view to express a wide range of buffer
invalidation scenarios. Superuser-only.
Author: Palak Chaturvedi <chaturvedipalak1911@gmail.com>
Author: Thomas Munro <thomas.munro@gmail.com> (small adjustments)
Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Cary Huang <cary.huang@highgo.ca>
Reviewed-by: Cédric Villemain <cedric.villemain+pgsql@abcsql.com>
Reviewed-by: Jim Nasby <jim.nasby@gmail.com>
Reviewed-by: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CALfch19pW48ZwWzUoRSpsaV9hqt0UPyaBPC4bOZ4W+c7FF566A@mail.gmail.com
---
contrib/pg_buffercache/Makefile | 2 +-
contrib/pg_buffercache/meson.build | 1 +
.../pg_buffercache--1.4--1.5.sql | 6 ++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 20 +++++++
doc/src/sgml/pgbuffercache.sgml | 37 ++++++++++--
src/backend/storage/buffer/bufmgr.c | 57 +++++++++++++++++++
src/include/storage/bufmgr.h | 2 +
8 files changed, 119 insertions(+), 8 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d6b58d4da9..eae65ead9e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -8,7 +8,7 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
- pg_buffercache--1.3--1.4.sql
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/meson.build b/contrib/pg_buffercache/meson.build
index c86e33cc95..1ca3452918 100644
--- a/contrib/pg_buffercache/meson.build
+++ b/contrib/pg_buffercache/meson.build
@@ -22,6 +22,7 @@ install_data(
'pg_buffercache--1.2--1.3.sql',
'pg_buffercache--1.2.sql',
'pg_buffercache--1.3--1.4.sql',
+ 'pg_buffercache--1.4--1.5.sql',
'pg_buffercache.control',
kwargs: contrib_data_args,
)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
new file mode 100644
index 0000000000..0fb18ff786
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
@@ -0,0 +1,6 @@
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
+
+CREATE FUNCTION pg_buffercache_evict(IN int)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_evict'
+LANGUAGE C PARALLEL SAFE VOLATILE STRICT;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index a82ae5f9bb..5ee875f77d 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 3316732365..fc33dffeeb 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -63,6 +63,7 @@ typedef struct
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
PG_FUNCTION_INFO_V1(pg_buffercache_summary);
PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+PG_FUNCTION_INFO_V1(pg_buffercache_evict);
Datum
pg_buffercache_pages(PG_FUNCTION_ARGS)
@@ -347,3 +348,22 @@ pg_buffercache_usage_counts(PG_FUNCTION_ARGS)
return (Datum) 0;
}
+
+/*
+ * Try to evict a block from a shared buffer.
+ */
+Datum
+pg_buffercache_evict(PG_FUNCTION_ARGS)
+{
+ Buffer buf = PG_GETARG_INT32(0);
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to use pg_buffercache_evict function")));
+
+ if (buf > NBuffers || buf < -NLocBuffer || BufferIsInvalid(buf))
+ elog(ERROR, "bad buffer ID: %d", buf);
+
+ PG_RETURN_BOOL(EvictBuffer(buf));
+}
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index afe2d97834..0aaffd5786 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -21,11 +21,16 @@
<primary>pg_buffercache_summary</primary>
</indexterm>
+ <indexterm>
+ <primary>pg_buffercache_evict</primary>
+ </indexterm>
+
<para>
This module provides the <function>pg_buffercache_pages()</function>
function (wrapped in the <structname>pg_buffercache</structname> view),
- the <function>pg_buffercache_summary()</function> function, and the
- <function>pg_buffercache_usage_counts()</function> function.
+ the <function>pg_buffercache_summary()</function> function, the
+ <function>pg_buffercache_usage_counts()</function> function and
+ the <function>pg_buffercache_evict()</function> function.
</para>
<para>
@@ -47,9 +52,15 @@
</para>
<para>
- By default, use is restricted to superusers and roles with privileges of the
- <literal>pg_monitor</literal> role. Access may be granted to others
- using <command>GRANT</command>.
+ By default, use of the above functions is restricted to superusers and roles
+ with privileges of the <literal>pg_monitor</literal> role. Access may be
+ granted to others using <command>GRANT</command>.
+ </para>
+
+ <para>
+ The <function>pg_buffercache_evict()</function> function allows a block to
+ be evicted from the buffer pool given a buffer identifier. Use of this
+ function is restricted to superusers only.
</para>
<sect2 id="pgbuffercache-pg-buffercache">
@@ -351,7 +362,21 @@
</para>
</sect2>
- <sect2 id="pgbuffercache-sample-output">
+ <sect2 id="pgbuffercache-pg-buffercache-evict">
+ <title>The <structname>pg_buffercache_evict</structname> Function</title>
+ <para>
+ The <function>pg_buffercache_evict()</function> takes a buffer identifier
+ from the <structfield>bufferid</structfield> of the
+ <structname>pg_buffercache</structname> view. It returns true on success, and
+ false if the buffer wasn't valid, couldn't be evicted because it was pinned, or
+ became dirty again after an attempt to write it out. The result is
+ immediately out of date upon return, as the buffer might become valid again at
+ any time due to concurrent activity. The function is intended for developer
+ testing.
+ </para>
+ </sect2>
+
+<sect2 id="pgbuffercache-sample-output">
<title>Sample Output</title>
<screen>
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 06e9ffd2b0..673d596b26 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -6003,3 +6003,60 @@ ResOwnerPrintBufferPin(Datum res)
{
return DebugPrintBufferRefcount(DatumGetInt32(res));
}
+
+/*
+ * Try to evict a shared a buffer.
+ *
+ * This function is intended for testing/development use only. By the time it
+ * returns, the buffer may again be occupied, perhaps even by the same block.
+ *
+ * Returns true if the buffer was valid and it has now been made invalid.
+ * Returns false if the wasn't valid, or it couldn't be evicted due to a pin,
+ * or if the buffer becomes dirty again while we're trying to write it out.
+ */
+bool
+EvictBuffer(Buffer buf)
+{
+ BufferDesc *desc;
+ uint32 buf_state;
+ bool result;
+
+ /* Make sure we can pin the buffer if it turned out require writing. */
+ ReservePrivateRefCountEntry();
+
+ Assert(!BufferIsLocal(buf));
+ desc = GetBufferDescriptor(buf - 1);
+
+ /* Lock the header and check if it's valid. */
+ buf_state = LockBufHdr(desc);
+ if ((buf_state & BM_VALID) == 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ /* Check that it's not pinned by someone else. */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ PinBuffer_Locked(desc); /* releases spinlock */
+
+ /* If it was dirty, try to clean it once. */
+ if (buf_state & BM_DIRTY)
+ {
+ ResourceOwnerEnlarge(CurrentResourceOwner);
+ LWLockAcquire(BufferDescriptorGetContentLock(desc), LW_SHARED);
+ FlushBuffer(desc, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(desc));
+ }
+
+ /* This will return false if it becomes dirty or someone else pins it. */
+ result = InvalidateVictimBuffer(desc);
+
+ UnpinBuffer(desc);
+
+ return result;
+}
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index f380f9d9a6..f278a697c3 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -305,6 +305,8 @@ extern bool BgBufferSync(struct WritebackContext *wb_context);
extern void LimitAdditionalPins(uint32 *additional_pins);
extern void LimitAdditionalLocalPins(uint32 *additional_pins);
+extern bool EvictBuffer(Buffer buf);
+
/* in buf_init.c */
extern void InitBufferPool(void);
extern Size BufferShmemSize(void);
--
2.39.3 (Apple Git-146)
On second thoughts, I think the original "invalidate" terminology was
fine, no need to invent a new term.
I thought of a better name for the bufmgr.c function though:
InvalidateUnpinnedBuffer(). That name seemed better to me after I
festooned it with warnings about why exactly it's inherently racy and
only for testing use.
I suppose someone could propose an additional function
pg_buffercache_invalidate(db, tbspc, rel, fork, blocknum) that would
be slightly better in the sense that it couldn't accidentally evict
some innocent block that happened to replace the real target just
before it runs, but I don't think it matters much for this purpose and
it would still be racy on return (vacuum decides to load your block
back in) so I don't think it's worth bothering with.
So this is the version I plan to commit.
Attachments:
v6-0001-Add-pg_buffercache_invalidate-function-for-testin.patchtext/x-patch; charset=UTF-8; name=v6-0001-Add-pg_buffercache_invalidate-function-for-testin.patchDownload
From 6a13349b788c8539b2d349f9553706d7c563f8f8 Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Sun, 7 Apr 2024 09:13:17 +1200
Subject: [PATCH v6] Add pg_buffercache_invalidate() function for testing.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When testing buffer pool logic, it is useful to be able to evict
arbitrary blocks. This function can be used in SQL queries over the
pg_buffercache view to set up a wide range of buffer pool states. Of
course, buffer mappings might change concurrently so you might evict a
block other than the one you had in mind, and another session might
bring it back in at any time. That's OK for the intended purpose of
setting up developer testing scenarios, and more complicated interlocking
schemes to give stronger guararantees about that would likely be less
flexible for actual testing work anyway. Superuser-only.
Author: Palak Chaturvedi <chaturvedipalak1911@gmail.com>
Author: Thomas Munro <thomas.munro@gmail.com> (docs, small tweaks)
Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Cary Huang <cary.huang@highgo.ca>
Reviewed-by: Cédric Villemain <cedric.villemain+pgsql@abcsql.com>
Reviewed-by: Jim Nasby <jim.nasby@gmail.com>
Reviewed-by: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CALfch19pW48ZwWzUoRSpsaV9hqt0UPyaBPC4bOZ4W+c7FF566A@mail.gmail.com
---
contrib/pg_buffercache/Makefile | 2 +-
contrib/pg_buffercache/meson.build | 1 +
.../pg_buffercache--1.4--1.5.sql | 6 ++
contrib/pg_buffercache/pg_buffercache.control | 2 +-
contrib/pg_buffercache/pg_buffercache_pages.c | 20 ++++++
doc/src/sgml/pgbuffercache.sgml | 37 +++++++++--
src/backend/storage/buffer/bufmgr.c | 63 +++++++++++++++++++
src/include/storage/bufmgr.h | 2 +
8 files changed, 125 insertions(+), 8 deletions(-)
create mode 100644 contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
diff --git a/contrib/pg_buffercache/Makefile b/contrib/pg_buffercache/Makefile
index d6b58d4da9..eae65ead9e 100644
--- a/contrib/pg_buffercache/Makefile
+++ b/contrib/pg_buffercache/Makefile
@@ -8,7 +8,7 @@ OBJS = \
EXTENSION = pg_buffercache
DATA = pg_buffercache--1.2.sql pg_buffercache--1.2--1.3.sql \
pg_buffercache--1.1--1.2.sql pg_buffercache--1.0--1.1.sql \
- pg_buffercache--1.3--1.4.sql
+ pg_buffercache--1.3--1.4.sql pg_buffercache--1.4--1.5.sql
PGFILEDESC = "pg_buffercache - monitoring of shared buffer cache in real-time"
REGRESS = pg_buffercache
diff --git a/contrib/pg_buffercache/meson.build b/contrib/pg_buffercache/meson.build
index c86e33cc95..1ca3452918 100644
--- a/contrib/pg_buffercache/meson.build
+++ b/contrib/pg_buffercache/meson.build
@@ -22,6 +22,7 @@ install_data(
'pg_buffercache--1.2--1.3.sql',
'pg_buffercache--1.2.sql',
'pg_buffercache--1.3--1.4.sql',
+ 'pg_buffercache--1.4--1.5.sql',
'pg_buffercache.control',
kwargs: contrib_data_args,
)
diff --git a/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
new file mode 100644
index 0000000000..92c530bc19
--- /dev/null
+++ b/contrib/pg_buffercache/pg_buffercache--1.4--1.5.sql
@@ -0,0 +1,6 @@
+\echo Use "ALTER EXTENSION pg_buffercache UPDATE TO '1.5'" to load this file. \quit
+
+CREATE FUNCTION pg_buffercache_invalidate(IN int)
+RETURNS bool
+AS 'MODULE_PATHNAME', 'pg_buffercache_invalidate'
+LANGUAGE C PARALLEL SAFE VOLATILE STRICT;
diff --git a/contrib/pg_buffercache/pg_buffercache.control b/contrib/pg_buffercache/pg_buffercache.control
index a82ae5f9bb..5ee875f77d 100644
--- a/contrib/pg_buffercache/pg_buffercache.control
+++ b/contrib/pg_buffercache/pg_buffercache.control
@@ -1,5 +1,5 @@
# pg_buffercache extension
comment = 'examine the shared buffer cache'
-default_version = '1.4'
+default_version = '1.5'
module_pathname = '$libdir/pg_buffercache'
relocatable = true
diff --git a/contrib/pg_buffercache/pg_buffercache_pages.c b/contrib/pg_buffercache/pg_buffercache_pages.c
index 3316732365..9617bfa47b 100644
--- a/contrib/pg_buffercache/pg_buffercache_pages.c
+++ b/contrib/pg_buffercache/pg_buffercache_pages.c
@@ -63,6 +63,7 @@ typedef struct
PG_FUNCTION_INFO_V1(pg_buffercache_pages);
PG_FUNCTION_INFO_V1(pg_buffercache_summary);
PG_FUNCTION_INFO_V1(pg_buffercache_usage_counts);
+PG_FUNCTION_INFO_V1(pg_buffercache_invalidate);
Datum
pg_buffercache_pages(PG_FUNCTION_ARGS)
@@ -347,3 +348,22 @@ pg_buffercache_usage_counts(PG_FUNCTION_ARGS)
return (Datum) 0;
}
+
+/*
+ * Try to invalidate a shared buffer.
+ */
+Datum
+pg_buffercache_invalidate(PG_FUNCTION_ARGS)
+{
+ Buffer buf = PG_GETARG_INT32(0);
+
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to use pg_buffercache_invalidate function")));
+
+ if (buf < 1 || buf > NBuffers)
+ elog(ERROR, "bad buffer ID: %d", buf);
+
+ PG_RETURN_BOOL(InvalidateUnpinnedBuffer(buf));
+}
diff --git a/doc/src/sgml/pgbuffercache.sgml b/doc/src/sgml/pgbuffercache.sgml
index afe2d97834..ec6e9e900f 100644
--- a/doc/src/sgml/pgbuffercache.sgml
+++ b/doc/src/sgml/pgbuffercache.sgml
@@ -21,11 +21,16 @@
<primary>pg_buffercache_summary</primary>
</indexterm>
+ <indexterm>
+ <primary>pg_buffercache_invalidate</primary>
+ </indexterm>
+
<para>
This module provides the <function>pg_buffercache_pages()</function>
function (wrapped in the <structname>pg_buffercache</structname> view),
- the <function>pg_buffercache_summary()</function> function, and the
- <function>pg_buffercache_usage_counts()</function> function.
+ the <function>pg_buffercache_summary()</function> function, the
+ <function>pg_buffercache_usage_counts()</function> function and
+ the <function>pg_buffercache_invalidate()</function> function.
</para>
<para>
@@ -47,9 +52,15 @@
</para>
<para>
- By default, use is restricted to superusers and roles with privileges of the
- <literal>pg_monitor</literal> role. Access may be granted to others
- using <command>GRANT</command>.
+ By default, use of the above functions is restricted to superusers and roles
+ with privileges of the <literal>pg_monitor</literal> role. Access may be
+ granted to others using <command>GRANT</command>.
+ </para>
+
+ <para>
+ The <function>pg_buffercache_invalidate()</function> function allows a block to
+ be evicted from the buffer pool given a buffer identifier. Use of this
+ function is restricted to superusers only.
</para>
<sect2 id="pgbuffercache-pg-buffercache">
@@ -351,7 +362,21 @@
</para>
</sect2>
- <sect2 id="pgbuffercache-sample-output">
+ <sect2 id="pgbuffercache-pg-buffercache-invalidate">
+ <title>The <structname>pg_buffercache_invalidate</structname> Function</title>
+ <para>
+ The <function>pg_buffercache_invalidate()</function> function takes a buffer
+ identifier, as shown in the <structfield>bufferid</structfield> column of
+ the <structname>pg_buffercache</structname> view. It returns true on success,
+ and false if the buffer wasn't valid, couldn't be evicted because it was
+ pinned, or became dirty again after an attempt to write it out. The result
+ is immediately out of date upon return, as the buffer might become valid
+ again at any time due to concurrent activity. The function is intended for
+ developer testing only.
+ </para>
+ </sect2>
+
+<sect2 id="pgbuffercache-sample-output">
<title>Sample Output</title>
<screen>
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 06e9ffd2b0..7643ea6b2c 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -6003,3 +6003,66 @@ ResOwnerPrintBufferPin(Datum res)
{
return DebugPrintBufferRefcount(DatumGetInt32(res));
}
+
+/*
+ * Try to invalidate a shared buffer.
+ *
+ * This function is intended for testing/development use only!
+ *
+ * To succeed, the buffer should not be pinned on entry, so if the caller had a
+ * particular block in mind, it might already have been replaced by some other
+ * block by the time this function runs. It's also unpinned on return, so the
+ * buffer might be occupied again by the time control is returned, potentially
+ * even by the same block. This inherent raciness without other interlocking
+ * makes the function unsuitable for non-testing usage.
+ *
+ * Returns true if the buffer was valid and it has now been made invalid.
+ * Returns false if the wasn't valid, or it couldn't be evicted due to a pin,
+ * or if the buffer becomes dirty again while we're trying to write it out.
+ */
+bool
+InvalidateUnpinnedBuffer(Buffer buf)
+{
+ BufferDesc *desc;
+ uint32 buf_state;
+ bool result;
+
+ /* Make sure we can pin the buffer. */
+ ResourceOwnerEnlarge(CurrentResourceOwner);
+ ReservePrivateRefCountEntry();
+
+ Assert(!BufferIsLocal(buf));
+ desc = GetBufferDescriptor(buf - 1);
+
+ /* Lock the header and check if it's valid. */
+ buf_state = LockBufHdr(desc);
+ if ((buf_state & BM_VALID) == 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ /* Check that it's not pinned already. */
+ if (BUF_STATE_GET_REFCOUNT(buf_state) > 0)
+ {
+ UnlockBufHdr(desc, buf_state);
+ return false;
+ }
+
+ PinBuffer_Locked(desc); /* releases spinlock */
+
+ /* If it was dirty, try to clean it once. */
+ if (buf_state & BM_DIRTY)
+ {
+ LWLockAcquire(BufferDescriptorGetContentLock(desc), LW_SHARED);
+ FlushBuffer(desc, NULL, IOOBJECT_RELATION, IOCONTEXT_NORMAL);
+ LWLockRelease(BufferDescriptorGetContentLock(desc));
+ }
+
+ /* This will return false if it becomes dirty or someone else pins it. */
+ result = InvalidateVictimBuffer(desc);
+
+ UnpinBuffer(desc);
+
+ return result;
+}
diff --git a/src/include/storage/bufmgr.h b/src/include/storage/bufmgr.h
index 07ba1a6050..cce8eddbdd 100644
--- a/src/include/storage/bufmgr.h
+++ b/src/include/storage/bufmgr.h
@@ -305,6 +305,8 @@ extern bool BgBufferSync(struct WritebackContext *wb_context);
extern void LimitAdditionalPins(uint32 *additional_pins);
extern void LimitAdditionalLocalPins(uint32 *additional_pins);
+extern bool InvalidateUnpinnedBuffer(Buffer buf);
+
/* in buf_init.c */
extern void InitBufferPool(void);
extern Size BufferShmemSize(void);
--
2.44.0
On Sat, Apr 6, 2024 at 7:08 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On second thoughts, I think the original "invalidate" terminology was
fine, no need to invent a new term.I thought of a better name for the bufmgr.c function though:
InvalidateUnpinnedBuffer(). That name seemed better to me after I
festooned it with warnings about why exactly it's inherently racy and
only for testing use.I suppose someone could propose an additional function
pg_buffercache_invalidate(db, tbspc, rel, fork, blocknum) that would
be slightly better in the sense that it couldn't accidentally evict
some innocent block that happened to replace the real target just
before it runs, but I don't think it matters much for this purpose and
it would still be racy on return (vacuum decides to load your block
back in) so I don't think it's worth bothering with.So this is the version I plan to commit.
I've reviewed v6. I think you should mention in the docs that it only
works for shared buffers -- so specifically not buffers containing
blocks of temp tables.
In the function pg_buffercache_invalidate(), why not use the
BufferIsValid() function?
- if (buf < 1 || buf > NBuffers)
+ if (!BufferIsValid(buf) || buf > NBuffers)
I thought the below would be more clear for the comment above
InvalidateUnpinnedBuffer().
- * Returns true if the buffer was valid and it has now been made invalid.
- * Returns false if the wasn't valid, or it couldn't be evicted due to a pin,
- * or if the buffer becomes dirty again while we're trying to write it out.
+ * Returns true if the buffer was valid and has now been made invalid. Returns
+ * false if it wasn't valid, if it couldn't be evicted due to a pin, or if the
+ * buffer becomes dirty again while we're trying to write it out.
Some of that probably applies for the docs too (i.e. you have some
similar wording in the docs). There is actually one typo in your
version, so even if you don't adopt my suggestion, you should fix that
typo.
I didn't notice anything else out of place. I tried it and it worked
as expected. I'm excited to have this feature!
I didn't read through this whole thread, but was there any talk of
adding other functions to let me invalidate a bunch of buffers at once
or even some options -- like invalidate every 3rd buffer or whatever?
(Not the concern of this patch, but just wondering because that would
be a useful future enhancement IMO).
- Melanie
Hi,
On 2024-04-07 11:07:58 +1200, Thomas Munro wrote:
I thought of a better name for the bufmgr.c function though:
InvalidateUnpinnedBuffer(). That name seemed better to me after I
festooned it with warnings about why exactly it's inherently racy and
only for testing use.
I still dislike that, fwiw, due to the naming similarity to
InvalidateBuffer(), which throws away dirty buffer contents too. Which
obviously isn't acceptable from "userspace". I'd just name it
pg_buffercache_evict() - given that the commit message's first paragraph uses
"it is useful to be able to evict arbitrary blocks" that seems to describe
things at least as well as "invalidate"?
Greetings,
Andres Freund
On Mon, Apr 8, 2024 at 12:10 PM Andres Freund <andres@anarazel.de> wrote:
On 2024-04-07 11:07:58 +1200, Thomas Munro wrote:
I thought of a better name for the bufmgr.c function though:
InvalidateUnpinnedBuffer(). That name seemed better to me after I
festooned it with warnings about why exactly it's inherently racy and
only for testing use.I still dislike that, fwiw, due to the naming similarity to
InvalidateBuffer(), which throws away dirty buffer contents too. Which
obviously isn't acceptable from "userspace". I'd just name it
pg_buffercache_evict() - given that the commit message's first paragraph uses
"it is useful to be able to evict arbitrary blocks" that seems to describe
things at least as well as "invalidate"?
Alright, sold. I'll go with EvictUnpinnedBuffer() in bufmgr.c and
pg_buffercache_evict() in the contrib module.
On Mon, Apr 8, 2024 at 11:53 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
I've reviewed v6. I think you should mention in the docs that it only
works for shared buffers -- so specifically not buffers containing
blocks of temp tables.
Thanks for looking! The whole pg_buffercache extension is for working
with shared buffers only, as mentioned at the top. I have tried to
improve that paragraph though, as it only mentioned examining them.
In the function pg_buffercache_invalidate(), why not use the
BufferIsValid() function?- if (buf < 1 || buf > NBuffers) + if (!BufferIsValid(buf) || buf > NBuffers)
It doesn't check the range (it has assertions, not errors).
I thought the below would be more clear for the comment above
InvalidateUnpinnedBuffer().- * Returns true if the buffer was valid and it has now been made invalid. - * Returns false if the wasn't valid, or it couldn't be evicted due to a pin, - * or if the buffer becomes dirty again while we're trying to write it out. + * Returns true if the buffer was valid and has now been made invalid. Returns + * false if it wasn't valid, if it couldn't be evicted due to a pin, or if the + * buffer becomes dirty again while we're trying to write it out.
Fixed.
Some of that probably applies for the docs too (i.e. you have some
similar wording in the docs). There is actually one typo in your
version, so even if you don't adopt my suggestion, you should fix that
typo.
Yeah, thanks, improved similarly there.
I didn't notice anything else out of place. I tried it and it worked
as expected. I'm excited to have this feature!
Thanks!
I didn't read through this whole thread, but was there any talk of
adding other functions to let me invalidate a bunch of buffers at once
or even some options -- like invalidate every 3rd buffer or whatever?
(Not the concern of this patch, but just wondering because that would
be a useful future enhancement IMO).
TBH I tried to resist people steering in that direction because you
can also just define a SQL function to do that built on this, and if
you had specialised functions they'd never be quite right. IMHO we
succeeded in minimising the engineering and maximising flexibility,
'cause it's for hackers. Crude, but already able to express a wide
range of stuff by punting the problem to SQL.
Thanks to Palak for the patch. Pushed.
On 07.04.2024 02:07, Thomas Munro wrote:
So this is the version I plan to commit.
+bool +EvictUnpinnedBuffer(Buffer buf) +{ ... + /* This will return false if it becomes dirty or someone else pins it. */ + result = InvalidateVictimBuffer(desc); + + UnpinBuffer(desc); + + return result; +}
Hi, Thomas!
Should not we call at the end the StrategyFreeBuffer() function to add
target buffer to freelist and not miss it after invalidation?
--
Best regards,
Maksim Milyutin
On 14.04.2024 21:16, Maksim Milyutin wrote:
On 07.04.2024 02:07, Thomas Munro wrote:
So this is the version I plan to commit.
+bool +EvictUnpinnedBuffer(Buffer buf) +{ ... + /* This will return false if it becomes dirty or someone else pins it. */ + result = InvalidateVictimBuffer(desc); + + UnpinBuffer(desc); + + return result; +}Hi, Thomas!
Should not we call at the end the StrategyFreeBuffer() function to add
target buffer to freelist and not miss it after invalidation?
Hello everyone!
Please take a look at this issue, current implementation of
EvictUnpinnedBuffer() IMO is erroneous - evicted buffers are lost
permanently and will not be reused again
--
Best regards,
Maksim Milyutin
On Tue, Apr 30, 2024 at 6:47 AM Maksim Milyutin <milyutinma@gmail.com> wrote:
Should not we call at the end the StrategyFreeBuffer() function to add target buffer to freelist and not miss it after invalidation?
Please take a look at this issue, current implementation of EvictUnpinnedBuffer() IMO is erroneous - evicted buffers are lost permanently and will not be reused again
Hi Maksim,
Oops, thanks, will fix.
On Tue, Apr 30, 2024 at 7:17 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Apr 30, 2024 at 6:47 AM Maksim Milyutin <milyutinma@gmail.com> wrote:
Should not we call at the end the StrategyFreeBuffer() function to add target buffer to freelist and not miss it after invalidation?
Please take a look at this issue, current implementation of EvictUnpinnedBuffer() IMO is erroneous - evicted buffers are lost permanently and will not be reused again
I don't think that's true: it is not lost permanently, it'll be found
by the regular clock hand. Perhaps it should be put on the freelist
so it can be found again quickly, but I'm not sure that's a bug, is
it? If it were true, even basic testing eg select
count(pg_buffercache_evict(bufferid)) from pg_buffercache would leave
the system non-functional, but it doesn't, the usual CLOCK algorithm
just does its thing.
On 29.04.2024 23:59, Thomas Munro wrote:
On Tue, Apr 30, 2024 at 7:17 AM Thomas Munro<thomas.munro@gmail.com> wrote:
On Tue, Apr 30, 2024 at 6:47 AM Maksim Milyutin<milyutinma@gmail.com> wrote:
Should not we call at the end the StrategyFreeBuffer() function to add target buffer to freelist and not miss it after invalidation?
Please take a look at this issue, current implementation of EvictUnpinnedBuffer() IMO is erroneous - evicted buffers are lost permanently and will not be reused again
I don't think that's true: it is not lost permanently, it'll be found
by the regular clock hand. Perhaps it should be put on the freelist
so it can be found again quickly, but I'm not sure that's a bug, is
it?
Yeah, you are right. Thanks for clarification.
CLOCK algorithm will reuse it eventually but being of evicted cleared
buffer in freelist might greatly restrict the time of buffer allocation
when all others buffers were in use.
--
Best regards,
Maksim Milyutin