Scan buffercache for a table

Started by Aminalmost 3 years ago4 messages
#1Amin
amin.fallahi@gmail.com

Hi,

I am looking for function calls to scan the buffer cache for a table and
find the cached pages. I want to find out which pages are cached and which
of them are dirty. Having the relation id, how can I do that? I have gone
through bufmgr.c and relcache.c, but could not find a way to get
relation-specific pages from the buffer cache.

Thank you!

#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Amin (#1)
Re: Scan buffercache for a table

On Mon, Jan 30, 2023 at 06:01:08PM -0800, Amin wrote:

Hi,

I am looking for function calls to scan the buffer cache for a table and
find the cached pages. I want to find out which pages are cached and which
of them are dirty. Having the relation id, how can I do that? I have gone
through bufmgr.c and relcache.c, but could not find a way to get
relation-specific pages from the buffer cache.

This looks like a re-post of the question you asked on Jan 13:
CAF-KA8_axSMpQW1scOTnAQx8NFHgmJc6L87QzAo3JezLiBU1HQ@mail.gmail.com
It'd be better not to start a new thread (or if you do that, it'd be
better to mention the old one and include its participants).

On Fri, Jan 13, 2023 at 05:28:31PM -0800, Amin wrote:

Hi,

Before scanning a relation, in the planner stage, I want to make a
call to
retrieve information about how many pages will be a hit for a specific
relation. The module pg_buffercache seems to be doing a similar thing.

The planner is a *model* which (among other things) tries to guess how
many pages will be read/hit. It's not expected to be anywhere near
accurate.

pg_buffercache only checks for pages within postgres' own buffer cache.
It doesn't look for pages which are in the OS page cache, which require
a system call to access (but don't require device I/O).

Read about pgfincore for introinspection of the OS page cache.

Also, pg_statio_all_tables seems to be having that information, but it
is updated after execution. However, I want the information before
execution. Also not sure how pg_statio_all_tables is created and how
I can access it in the code.

But the view isn't omnicient. When you execute a plan, you don't know
how it's going to end. If you did, you wouldn't need to run it - you
could just print the answer.

Note that planning and execution are separate and independant. It's
possible to plan a query without ever running it, or to plan it once and
run it multiple times. The view reflects I/O requested by postgres; the
I/O normally comes primarily from execution.

You can look at how the view is defined:
\sv pg_statio_all_tables

And then you can look at how the functions that it calls are implemented
(\df+). Same for pg_buffercache. It seems like you'll want to learn
how to navigate the source code to find how things are connected.

--
Justin

#3Amin
amin.fallahi@gmail.com
In reply to: Justin Pryzby (#2)
Re: Scan buffercache for a table

Thank you Justin. I started a new thread because the context is a little
bit different. I am no longer interested in statistics anymore. I want to
find exact individual pages of a table which are cached and are/aren't
dirty. pg_buffercache implements the loop, but it goes over all the
buffers. However, I want to scan a specific table cache pages.

On Mon, Jan 30, 2023 at 6:43 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

Show quoted text

On Mon, Jan 30, 2023 at 06:01:08PM -0800, Amin wrote:

Hi,

I am looking for function calls to scan the buffer cache for a table and
find the cached pages. I want to find out which pages are cached and

which

of them are dirty. Having the relation id, how can I do that? I have gone
through bufmgr.c and relcache.c, but could not find a way to get
relation-specific pages from the buffer cache.

This looks like a re-post of the question you asked on Jan 13:
CAF-KA8_axSMpQW1scOTnAQx8NFHgmJc6L87QzAo3JezLiBU1HQ@mail.gmail.com
It'd be better not to start a new thread (or if you do that, it'd be
better to mention the old one and include its participants).

On Fri, Jan 13, 2023 at 05:28:31PM -0800, Amin wrote:

Hi,

Before scanning a relation, in the planner stage, I want to make a
call to
retrieve information about how many pages will be a hit for a specific
relation. The module pg_buffercache seems to be doing a similar thing.

The planner is a *model* which (among other things) tries to guess how
many pages will be read/hit. It's not expected to be anywhere near
accurate.

pg_buffercache only checks for pages within postgres' own buffer cache.
It doesn't look for pages which are in the OS page cache, which require
a system call to access (but don't require device I/O).

Read about pgfincore for introinspection of the OS page cache.

Also, pg_statio_all_tables seems to be having that information, but it
is updated after execution. However, I want the information before
execution. Also not sure how pg_statio_all_tables is created and how
I can access it in the code.

But the view isn't omnicient. When you execute a plan, you don't know
how it's going to end. If you did, you wouldn't need to run it - you
could just print the answer.

Note that planning and execution are separate and independant. It's
possible to plan a query without ever running it, or to plan it once and
run it multiple times. The view reflects I/O requested by postgres; the
I/O normally comes primarily from execution.

You can look at how the view is defined:
\sv pg_statio_all_tables

And then you can look at how the functions that it calls are implemented
(\df+). Same for pg_buffercache. It seems like you'll want to learn
how to navigate the source code to find how things are connected.

--
Justin

#4Justin Pryzby
pryzby@telsasoft.com
In reply to: Amin (#3)
Re: Scan buffercache for a table

On Mon, Jan 30, 2023 at 08:11:30PM -0800, Amin wrote:

Thank you Justin. I started a new thread because the context is a little
bit different. I am no longer interested in statistics anymore. I want to
find exact individual pages of a table which are cached and are/aren't
dirty. pg_buffercache implements the loop, but it goes over all the
buffers. However, I want to scan a specific table cache pages.

Check ReadBuffer*(), BufTableLookup() or loops around it like
FindAndDropRelationBuffers(), which is in the file you referenced.

Show quoted text

On Mon, Jan 30, 2023 at 06:01:08PM -0800, Amin wrote:

Hi,

I am looking for function calls to scan the buffer cache for a table and
find the cached pages. I want to find out which pages are cached and which
of them are dirty. Having the relation id, how can I do that? I have gone
through bufmgr.c and relcache.c, but could not find a way to get
relation-specific pages from the buffer cache.