pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Started by Andres Freund · about 6 years ago · 173 messages · pgsql-hackers
#1Andres Freund
andres@anarazel.de

Hi,

Currently pg_stat_bgwriter.buffers_backend is pretty useless for gauging
whether backends are doing writes they shouldn't do. That's because it
counts things that either cannot be, or are unlikely to be, done by other
parts of the system (checkpointer, bgwriter).

In particular, extending a file cannot currently be done by any
other type of process, yet is counted. When using a buffer access
strategy it is also very likely that writes have to be done by the
'dirtying' backend itself, as the buffer will be reused soon after (when
not previously in s_b, that is).

Additionally pg_stat_bgwriter.buffers_backend also counts writes done by
autovacuum et al.

I think it'd make sense to at least split buffers_backend into
buffers_backend_extend,
buffers_backend_write,
buffers_backend_write_strat

but it could also be worthwhile to expand it into
buffers_backend_extend,
buffers_{backend,checkpoint,bgwriter,autovacuum}_write
buffers_{backend,autovacuum}_write_strat

Internally, in contrast to the SQL level, this could possibly just be
counter arrays indexed by backend type.

It's also noteworthy that buffers_backend is accounted in an absurd
manner. One might think that writes are accounted from backend -> shared
memory or such. But instead it works like this:

1) The backend flushes a buffer in bufmgr.c, which accounts for backend *write time*
2) mdwrite() writes the block and registers a sync request, which is forwarded to the checkpointer
3) ForwardSyncRequest(), when not called by the bgwriter, increments CheckpointerShmem->num_backend_writes
4) The checkpointer, whenever doing AbsorbSyncRequests(), moves
CheckpointerShmem->num_backend_writes to
BgWriterStats.m_buf_written_backend (local memory!)
5) Occasionally it calls pgstat_send_bgwriter(), which sends the data to
pgstat (as the bgwriter also does)
6) pgstat then updates the shared memory used by the display functions

Worthwhile to note that backend buffer read/write *time* is accounted
differently. That's done via pgstat_send_tabstat().

I think there's very little excuse for the indirection via checkpointer;
besides being architecturally weird, it actually requires that we
continue to wake up the checkpointer over and over instead of optimizing
how and when we submit fsync requests.

As far as I can tell we're also simply not accounting at all for writes
done outside of shared buffers. All writes done directly through
smgrwrite()/extend() aren't accounted anywhere as far as I can tell.

I think we also count things as writes that aren't writes: mdtruncate()
is AFAICT counted as one backend write for each segment. Which seems
weird to me.

Lastly, I don't understand the point of sending fixed-size stats,
like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
I don't like its architecture, we obviously need something like pgstat
to handle variable amounts of stats (database-, table-level etc.
stats). But that doesn't at all apply to these types of global stats.

Greetings,

Andres Freund

#2Magnus Hagander
magnus@hagander.net
In reply to: Andres Freund (#1)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Fri, Jan 24, 2020 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

Currently pg_stat_bgwriter.buffers_backend is pretty useless for gauging
whether backends are doing writes they shouldn't do. That's because it
counts things that either cannot be, or are unlikely to be, done by other
parts of the system (checkpointer, bgwriter).
In particular, extending a file cannot currently be done by any
other type of process, yet is counted. When using a buffer access
strategy it is also very likely that writes have to be done by the
'dirtying' backend itself, as the buffer will be reused soon after (when
not previously in s_b, that is).

Yeah. That's quite annoying.

Additionally pg_stat_bgwriter.buffers_backend also counts writes done by
autovacuum et al.

I think it'd make sense to at least split buffers_backend into
buffers_backend_extend,
buffers_backend_write,
buffers_backend_write_strat

but it could also be worthwhile to expand it into
buffers_backend_extend,
buffers_{backend,checkpoint,bgwriter,autovacuum}_write
buffers_{backend,autovacuum}_write_strat

Given that these are individual global counters, I don't really see
any reason not to expand it to the bigger set of counters. It's easy
enough to add them up together later if needed.

Internally, in contrast to the SQL level, this could possibly just be
counter arrays indexed by backend type.

It's also noteworthy that buffers_backend is accounted in an absurd
manner. One might think that writes are accounted from backend -> shared
memory or such. But instead it works like this:

1) The backend flushes a buffer in bufmgr.c, which accounts for backend *write time*
2) mdwrite() writes the block and registers a sync request, which is forwarded to the checkpointer
3) ForwardSyncRequest(), when not called by the bgwriter, increments CheckpointerShmem->num_backend_writes
4) The checkpointer, whenever doing AbsorbSyncRequests(), moves
CheckpointerShmem->num_backend_writes to
BgWriterStats.m_buf_written_backend (local memory!)
5) Occasionally it calls pgstat_send_bgwriter(), which sends the data to
pgstat (as the bgwriter also does)
6) pgstat then updates the shared memory used by the display functions

Worthwhile to note that backend buffer read/write *time* is accounted
differently. That's done via pgstat_send_tabstat().

I think there's very little excuse for the indirection via checkpointer;
besides being architecturally weird, it actually requires that we
continue to wake up the checkpointer over and over instead of optimizing
how and when we submit fsync requests.

As far as I can tell we're also simply not accounting at all for writes
done outside of shared buffers. All writes done directly through
smgrwrite()/extend() aren't accounted anywhere as far as I can tell.

I think we also count things as writes that aren't writes: mdtruncate()
is AFAICT counted as one backend write for each segment. Which seems
weird to me.

It's at least slightly weird :) Might it be worth counting truncate
events separately?

Lastly, I don't understand the point of sending fixed-size stats,
like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
I don't like its architecture, we obviously need something like pgstat
to handle variable amounts of stats (database-, table-level etc.
stats). But that doesn't at all apply to these types of global stats.

That part has annoyed me as well a few times. +1 for just moving that
into global shared memory. Given that we don't really care about
things being in sync between those different counters *or* if we lose
a bit of data (which the stats collector is designed to do), we could
even do that without a lock?

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#3Andres Freund
andres@anarazel.de
In reply to: Magnus Hagander (#2)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hi,

On 2020-01-25 15:43:41 +0100, Magnus Hagander wrote:

On Fri, Jan 24, 2020 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:

Additionally pg_stat_bgwriter.buffers_backend also counts writes done by
autovacuum et al.

I think it'd make sense to at least split buffers_backend into
buffers_backend_extend,
buffers_backend_write,
buffers_backend_write_strat

but it could also be worthwhile to expand it into
buffers_backend_extend,
buffers_{backend,checkpoint,bgwriter,autovacuum}_write
buffers_{backend,autovacuum}_write_strat

Given that these are individual global counters, I don't really see
any reason not to expand it to the bigger set of counters. It's easy
enough to add them up together later if needed.

Are you agreeing to
buffers_{backend,checkpoint,bgwriter,autovacuum}_write
or are you suggesting further ones?

I think we also count things as writes that aren't writes: mdtruncate()
is AFAICT counted as one backend write for each segment. Which seems
weird to me.

It's at least slightly weird :) Might it be worth counting truncate
events separately?

Is that really something interesting? Feels like it'd have to be done at
a higher level to be useful. E.g. the truncate done by TRUNCATE (when in
same xact as creation) and VACUUM are quite different. I think it'd be
better to just not include it.

Lastly, I don't understand the point of sending fixed-size stats,
like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
I don't like its architecture, we obviously need something like pgstat
to handle variable amounts of stats (database-, table-level etc.
stats). But that doesn't at all apply to these types of global stats.

That part has annoyed me as well a few times. +1 for just moving that
into global shared memory. Given that we don't really care about
things being in sync between those different counters *or* if we lose
a bit of data (which the stats collector is designed to do), we could
even do that without a lock?

I don't think we'd quite want to do it without any (single counter)
synchronization - high concurrency setups would be pretty likely to
lose values that way. I suspect the best would be to have a struct in
shared memory that contains the potential counters for each potential
process, and then sum them up when actually wanting the concrete
value. That way we avoid unnecessary contention, in contrast to having a
single shared memory value for each (which would just ping-pong between
different sockets and store buffers). There's a few details like how
exactly to implement resetting the counters, but ...

Thanks,

Andres Freund

#4Magnus Hagander
magnus@hagander.net
In reply to: Andres Freund (#3)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Sun, Jan 26, 2020 at 1:44 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2020-01-25 15:43:41 +0100, Magnus Hagander wrote:

On Fri, Jan 24, 2020 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:

Additionally pg_stat_bgwriter.buffers_backend also counts writes done by
autovacuum et al.

I think it'd make sense to at least split buffers_backend into
buffers_backend_extend,
buffers_backend_write,
buffers_backend_write_strat

but it could also be worthwhile to expand it into
buffers_backend_extend,
buffers_{backend,checkpoint,bgwriter,autovacuum}_write
buffers_{backend,autovacuum}_write_strat

Given that these are individual global counters, I don't really see
any reason not to expand it to the bigger set of counters. It's easy
enough to add them up together later if needed.

Are you agreeing to
buffers_{backend,checkpoint,bgwriter,autovacuum}_write
or are you suggesting further ones?

The former.

I think we also count things as writes that aren't writes: mdtruncate()
is AFAICT counted as one backend write for each segment. Which seems
weird to me.

It's at least slightly weird :) Might it be worth counting truncate
events separately?

Is that really something interesting? Feels like it'd have to be done at
a higher level to be useful. E.g. the truncate done by TRUNCATE (when in
same xact as creation) and VACUUM are quite different. I think it'd be
better to just not include it.

Yeah, you're probably right. It certainly makes very little sense
where it is now.

Lastly, I don't understand the point of sending fixed-size stats,
like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
I don't like its architecture, we obviously need something like pgstat
to handle variable amounts of stats (database-, table-level etc.
stats). But that doesn't at all apply to these types of global stats.

That part has annoyed me as well a few times. +1 for just moving that
into global shared memory. Given that we don't really care about
things being in sync between those different counters *or* if we lose
a bit of data (which the stats collector is designed to do), we could
even do that without a lock?

I don't think we'd quite want to do it without any (single counter)
synchronization - high concurrency setups would be pretty likely to
lose values that way. I suspect the best would be to have a struct in
shared memory that contains the potential counters for each potential
process, and then sum them up when actually wanting the concrete
value. That way we avoid unnecessary contention, in contrast to having a
single shared memory value for each (which would just ping-pong between
different sockets and store buffers). There's a few details like how
exactly to implement resetting the counters, but ...

Right. Each process gets to do their own write, but still in shared
memory. But do you need to lock them when reading them (for the
summary)? That's the part where I figured you could just read and
summarize them, and accept the possible loss.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#5Andres Freund
andres@anarazel.de
In reply to: Magnus Hagander (#4)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hi,

On 2020-01-26 16:20:03 +0100, Magnus Hagander wrote:

On Sun, Jan 26, 2020 at 1:44 AM Andres Freund <andres@anarazel.de> wrote:

On 2020-01-25 15:43:41 +0100, Magnus Hagander wrote:

On Fri, Jan 24, 2020 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:

Lastly, I don't understand the point of sending fixed-size stats,
like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
I don't like its architecture, we obviously need something like pgstat
to handle variable amounts of stats (database-, table-level etc.
stats). But that doesn't at all apply to these types of global stats.

That part has annoyed me as well a few times. +1 for just moving that
into global shared memory. Given that we don't really care about
things being in sync between those different counters *or* if we lose
a bit of data (which the stats collector is designed to do), we could
even do that without a lock?

I don't think we'd quite want to do it without any (single counter)
synchronization - high concurrency setups would be pretty likely to
lose values that way. I suspect the best would be to have a struct in
shared memory that contains the potential counters for each potential
process, and then sum them up when actually wanting the concrete
value. That way we avoid unnecessary contention, in contrast to having a
single shared memory value for each (which would just ping-pong between
different sockets and store buffers). There's a few details like how
exactly to implement resetting the counters, but ...

Right. Each process gets to do their own write, but still in shared
memory. But do you need to lock them when reading them (for the
summary)? That's the part where I figured you could just read and
summarize them, and accept the possible loss.

Oh, yea, I'd not lock for that. On nearly all machines aligned 64bit
integers can be read / written without a danger of torn values, and I
don't think we need perfect cross counter accuracy. To deal with the few
platforms without 64bit "single copy atomicity", we can just use
pg_atomic_read/write_u64. These days (e8fdbd58fe) they automatically
fall back to using locked operations for those platforms. So I don't
think there's actually a danger of loss.

Obviously we could also use atomic ops to increment the value, but I'd
rather not add all those atomic operations, even if it's on uncontended
cachelines. It'd allow us to reset the backend values more easily by
just swapping in a 0, which we can't do if the backend increments
non-atomically. But I think we could instead just have one global "bias"
value to implement resets (by subtracting that from the summarized
value, and storing the current sum when resetting). Or use the new
global barrier to trigger a reset. Or something similar.

Greetings,

Andres Freund

#6Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Andres Freund (#5)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hello.

At Sun, 26 Jan 2020 12:22:03 -0800, Andres Freund <andres@anarazel.de> wrote in

Hi,

I feel the same about the specific issues brought up upthread.

On 2020-01-26 16:20:03 +0100, Magnus Hagander wrote:

On Sun, Jan 26, 2020 at 1:44 AM Andres Freund <andres@anarazel.de> wrote:

On 2020-01-25 15:43:41 +0100, Magnus Hagander wrote:

On Fri, Jan 24, 2020 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:

Lastly, I don't understand the point of sending fixed-size stats,
like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
I don't like its architecture, we obviously need something like pgstat
to handle variable amounts of stats (database-, table-level etc.
stats). But that doesn't at all apply to these types of global stats.

That part has annoyed me as well a few times. +1 for just moving that
into global shared memory. Given that we don't really care about
things being in sync between those different counters *or* if we lose
a bit of data (which the stats collector is designed to do), we could
even do that without a lock?

I don't think we'd quite want to do it without any (single counter)
synchronization - high concurrency setups would be pretty likely to
lose values that way. I suspect the best would be to have a struct in
shared memory that contains the potential counters for each potential
process, and then sum them up when actually wanting the concrete
value. That way we avoid unnecessary contention, in contrast to having a
single shared memory value for each (which would just ping-pong between
different sockets and store buffers). There's a few details like how
exactly to implement resetting the counters, but ...

Right. Each process gets to do their own write, but still in shared
memory. But do you need to lock them when reading them (for the
summary)? That's the part where I figured you could just read and
summarize them, and accept the possible loss.

Oh, yea, I'd not lock for that. On nearly all machines aligned 64bit
integers can be read / written without a danger of torn values, and I
don't think we need perfect cross counter accuracy. To deal with the few
platforms without 64bit "single copy atomicity", we can just use
pg_atomic_read/write_u64. These days (e8fdbd58fe) they automatically
fall back to using locked operations for those platforms. So I don't
think there's actually a danger of loss.

Obviously we could also use atomic ops to increment the value, but I'd
rather not add all those atomic operations, even if it's on uncontended
cachelines. It'd allow us to reset the backend values more easily by
just swapping in a 0, which we can't do if the backend increments
non-atomically. But I think we could instead just have one global "bias"
value to implement resets (by subtracting that from the summarized
value, and storing the current sum when resetting). Or use the new
global barrier to trigger a reset. Or something similar.

Fixed or global stats are a suitable starter for the shared-memory
stats collector. In the case of buffers_*_write, the global stats
entry for each process needs just 8 bytes, plus maybe an extra 8 bytes
for the bias value. I'm not sure how many counters like this there are,
but is such a footprint acceptable? (Each backend already uses
the same amount of local memory for pgstat use, though.)

Anyway, I will do something like that as a trial, maybe by adding a
member in PgBackendStatus and one global shared variable for the bias value.

int64 st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
+ PgBackendStatsCounters counters;
} PgBackendStatus;

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#7Melanie Plageman
melanieplageman@gmail.com
In reply to: Kyotaro Horiguchi (#6)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Sun, Jan 26, 2020 at 11:21 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Sun, 26 Jan 2020 12:22:03 -0800, Andres Freund <andres@anarazel.de> wrote in

On 2020-01-26 16:20:03 +0100, Magnus Hagander wrote:

On Sun, Jan 26, 2020 at 1:44 AM Andres Freund <andres@anarazel.de> wrote:

On 2020-01-25 15:43:41 +0100, Magnus Hagander wrote:

On Fri, Jan 24, 2020 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:

Lastly, I don't understand the point of sending fixed-size stats,
like the stuff underlying pg_stat_bgwriter, through pgstats IPC. While
I don't like its architecture, we obviously need something like pgstat
to handle variable amounts of stats (database-, table-level etc.
stats). But that doesn't at all apply to these types of global stats.

That part has annoyed me as well a few times. +1 for just moving that
into global shared memory. Given that we don't really care about
things being in sync between those different counters *or* if we lose
a bit of data (which the stats collector is designed to do), we could
even do that without a lock?

I don't think we'd quite want to do it without any (single counter)
synchronization - high concurrency setups would be pretty likely to
lose values that way. I suspect the best would be to have a struct in
shared memory that contains the potential counters for each potential
process, and then sum them up when actually wanting the concrete
value. That way we avoid unnecessary contention, in contrast to having a
single shared memory value for each (which would just ping-pong between
different sockets and store buffers). There's a few details like how
exactly to implement resetting the counters, but ...

Right. Each process gets to do their own write, but still in shared
memory. But do you need to lock them when reading them (for the
summary)? That's the part where I figured you could just read and
summarize them, and accept the possible loss.

Oh, yea, I'd not lock for that. On nearly all machines aligned 64bit
integers can be read / written without a danger of torn values, and I
don't think we need perfect cross counter accuracy. To deal with the few
platforms without 64bit "single copy atomicity", we can just use
pg_atomic_read/write_u64. These days (e8fdbd58fe) they automatically
fall back to using locked operations for those platforms. So I don't
think there's actually a danger of loss.

Obviously we could also use atomic ops to increment the value, but I'd
rather not add all those atomic operations, even if it's on uncontended
cachelines. It'd allow us to reset the backend values more easily by
just swapping in a 0, which we can't do if the backend increments
non-atomically. But I think we could instead just have one global "bias"
value to implement resets (by subtracting that from the summarized
value, and storing the current sum when resetting). Or use the new
global barrier to trigger a reset. Or something similar.

Fixed or global stats are a suitable starter for the shared-memory
stats collector. In the case of buffers_*_write, the global stats
entry for each process needs just 8 bytes, plus maybe an extra 8 bytes
for the bias value. I'm not sure how many counters like this there are,
but is such a footprint acceptable? (Each backend already uses
the same amount of local memory for pgstat use, though.)

Anyway, I will do something like that as a trial, maybe by adding a
member in PgBackendStatus and one global shared variable for the bias value.

int64 st_progress_param[PGSTAT_NUM_PROGRESS_PARAM];
+ PgBackendStatsCounters counters;
} PgBackendStatus;

So, I took a stab at implementing this in PgBackendStatus. The attached
patch is not quite on top of current master, so, alas, don't try and
apply it. I went to rebase today and realized I needed to make some
changes in light of e1025044cd4, however, I wanted to share this WIP so
that I could pose a few questions that I imagine will still be relevant
after I rewrite the patch.

I removed buffers_backend and buffers_backend_fsync from
pg_stat_bgwriter and have created a new view which tracks
- number of shared buffers the checkpointer and bgwriter write out
- number of shared buffers a regular backend is forced to flush
- number of extends done by a regular backend through shared buffers
- number of buffers flushed by a backend or autovacuum using a
BufferAccessStrategy which, were they not to use this strategy,
could perhaps have been avoided if a clean shared buffer was
available
- number of fsyncs done by a backend which could have been done by
checkpointer if sync queue had not been full

This view currently only tracks writes and extends that go through
shared buffers, and fsyncs of shared buffers (which, AFAIK, are the only
things fsync'd through the SyncRequest machinery currently).

BufferAlloc() and SyncOneBuffer() are the main points at which the
tracking is done. I can definitely expand this, but, I want to make sure
that we are tracking the right kind of information.

num_backend_writes and num_backend_fsync were intended (though they were
not accurate) to count buffers that backends had to end up writing
themselves and fsyncs that backends had to end up doing themselves which
could have been avoided with a different configuration (or, I suppose, a
different workload/different data, etc). That is, they were meant to
tell you if checkpointer and bgwriter were keeping up and/or if the
size of shared buffers was adequate.

In implementing this counting per backend, it is easy for all types of
backends to keep track of the number of writes, extends, fsyncs, and
strategy writes they are doing. So, as recommended upthread, I have
added columns in the view for the number of writes for checkpointer and
bgwriter and others. Thus, this view becomes more than just stats on
"avoidable I/O done by backends".

So, my question is, does it make sense to track all extends -- those done
to extend the fsm and visimap, and those done when making a new relation
or index? Is that information useful? If so, is it different from the
extends done through shared buffers? Should it be tracked separately?

Also, if we care about all of the extends, then it seems a bit annoying
to pepper the counting all over the place when it really just needs to
be done where smgrextend() is called (even though maybe a stats function
doesn't belong in that API).

Another question I have is, should the number of extends be for every
single block extended or should we try to track the initiation of a set
of extends (all of those added in RelationAddExtraBlocks(), in this
case)?

When it comes to fsync counting, I only count the fsyncs counted by the
previous code, that is, fsyncs done by backends themselves when the
checkpointer sync request queue was full.
I did the counting in the same place in the checkpointer code -- in
ForwardSyncRequest() -- partially because there did not seem to be
another good place to do it. register_dirty_segment() returns void
(I thought about having it return a bool to indicate whether it fsync'd
or registered the fsync, which seemed alright, but mdextend(),
mdwrite(), etc. also return void), so there is no way to propagate the
information back up to the bufmgr that the process had to do its own
fsync without mucking with the md.c API. And,
since the checkpointer is the one processing these sync requests anyway,
it actually seems okay to do it in the checkpointer code.

I'm not counting fsyncs that are "unavoidable" in the sense that they
couldn't be avoided by changing settings/workload etc -- like those done
when building an index, creating a table/rewriting a table/copying a
table -- is it useful to count these? It seems like it makes the number
of "avoidable fsyncs by backends" less useful if we count the others.
Also, should we count how many fsyncs checkpointer has done (have to
check if there is already a stat for that)? Is that useful in this
context?

Of course, this view, when grown, will begin to overlap with pg_statio,
which is another consideration. What is its identity? I would find
"avoidable I/O" -- either avoidable entirely or avoidable for that
particular type of process -- to be useful.

Or maybe, it should have a more expansive mandate. Maybe it would be
useful to aggregate some of the info from pg_stat_statements at a higher
level -- like maybe shared_blks_read counted across many statements for
a period of time/context in which we expected the relation in shared
buffers becomes potentially interesting.

As for the way I have recorded strategy writes -- it is quite inelegant,
but, I wanted to make sure that I only counted a strategy write as one
in which the backend wrote out the dirty buffer from its strategy ring
but did not check if there was any clean buffer in shared buffers more
generally (so, it is *potentially* an avoidable write). I'm not sure if
this distinction is useful to anyone. I haven't done enough with
BufferAccessStrategies to know what I'd want to know about them when
developing or using Postgres. However, if I don't need to be so careful,
it will make the code much simpler (though, I'm sure I can improve the
code regardless).

As for the implementation of the counters themselves, I appreciate that
it isn't very nice to have a bunch of random members in PgBackendStatus
to count all of these write, extends, fsyncs. I considered if I could
add params that were used for all command types to st_progress_param but
I haven't looked into it yet. Alternatively, I could create an array
just for these kind of stats in PgBackendStatus. Though, I imagine that
I should take a look at the changes that have been made recently to this
area and at the shared memory stats patch.

Oh, also, there should be a way to reset the stats, especially if we add
more extends and fsyncs that happen at the time of relation/index
creation. I, at least, would find it useful to see these numbers once
the database is at some kind of steady state.

Oh and src/test/regress/sql/stats.sql will fail and, of course, I don't
intend to add that SELECT from the view to regress, it was just for
testing purposes to make sure the view was working.

-- Melanie

Attachments:

v1-0001-Add-system-view-tracking-shared-buffers-written.patch (application/octet-stream, +453/-53)
#8Andres Freund
andres@anarazel.de
In reply to: Melanie Plageman (#7)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hi,

On 2021-04-12 19:49:36 -0700, Melanie Plageman wrote:

So, I took a stab at implementing this in PgBackendStatus.

Cool!

The attached patch is not quite on top of current master, so, alas,
don't try and apply it. I went to rebase today and realized I needed
to make some changes in light of e1025044cd4, however, I wanted to
share this WIP so that I could pose a few questions that I imagine
will still be relevant after I rewrite the patch.

I removed buffers_backend and buffers_backend_fsync from
pg_stat_bgwriter and have created a new view which tracks
- number of shared buffers the checkpointer and bgwriter write out
- number of shared buffers a regular backend is forced to flush
- number of extends done by a regular backend through shared buffers
- number of buffers flushed by a backend or autovacuum using a
BufferAccessStrategy which, were they not to use this strategy,
could perhaps have been avoided if a clean shared buffer was
available
- number of fsyncs done by a backend which could have been done by
checkpointer if sync queue had not been full

I wonder if leaving buffers_alloc in pg_stat_bgwriter makes sense after
this? I'm tempted to move that to pg_stat_buffers or such...

I'm not quite convinced by having separate columns for checkpointer,
bgwriter, etc. That doesn't seem to scale all that well. What if we
instead made it a view that has one row for each BackendType?

In implementing this counting per backend, it is easy for all types of
backends to keep track of the number of writes, extends, fsyncs, and
strategy writes they are doing. So, as recommended upthread, I have
added columns in the view for the number of writes for checkpointer and
bgwriter and others. Thus, this view becomes more than just stats on
"avoidable I/O done by backends".

So, my question is, does it make sense to track all extends -- those to
extend the fsm and visimap and when making a new relation or index? Is
that information useful? If so, is it different than the extends done
through shared buffers? Should it be tracked separately?

I don't fully understand what you mean with "extends done through shared
buffers"?

Another question I have is, should the number of extends be for every
single block extended or should we try to track the initiation of a set
of extends (all of those added in RelationAddExtraBlocks(), in this
case)?

I think it should be 8k blocks, i.e. RelationAddExtraBlocks() should be
tracked as many individual extends. It's implemented that way, but more
importantly, it should be in BLCKSZ units. If we later add some actually
batched operations, we can have separate stats for that.

Of course, this view, when grown, will begin to overlap with pg_statio,
which is another consideration. What is its identity? I would find
"avoidable I/O", whether avoidable entirely or avoidable for that
particular type of process, to be useful.

I think it's fine to overlap with pg_statio_* - those are for individual
objects, so it seems to be expected to overlap with coarser stats.

Or maybe, it should have a more expansive mandate. Maybe it would be
useful to aggregate some of the info from pg_stat_statements at a higher
level -- like maybe shared_blks_read counted across many statements for
a period of time/context in which we expected the relation in shared
buffers becomes potentially interesting.

Let's do something more basic first...

Greetings,

Andres Freund

#9Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#8)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Thu, Apr 15, 2021 at 7:59 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-04-12 19:49:36 -0700, Melanie Plageman wrote:

So, I took a stab at implementing this in PgBackendStatus.

Cool!

Just a note on v2 of the patch -- the diff for the changes I made to
pgstatfuncs.c is pretty atrocious and hard to read. I tried using a
different diff algorithm, to no avail.

The attached patch is not quite on top of current master, so, alas,
don't try and apply it. I went to rebase today and realized I needed
to make some changes in light of e1025044cd4, however, I wanted to
share this WIP so that I could pose a few questions that I imagine
will still be relevant after I rewrite the patch.

Regarding the refactor done in e1025044cd4:
Most of the functions I've added access variables in PgBackendStatus, so
I put most of them in backend_status.h/c. However, technically, these
are stats which are aggregated over time, which e1025044cd4 says should
go in pgstat.c/h. I could move some of it, but I hadn't tried to do so,
as it made a few things inconvenient, and, I wasn't sure if it was the
right thing to do anyway.

I removed buffers_backend and buffers_backend_fsync from
pg_stat_bgwriter and have created a new view which tracks
- number of shared buffers the checkpointer and bgwriter write out
- number of shared buffers a regular backend is forced to flush
- number of extends done by a regular backend through shared buffers
- number of buffers flushed by a backend or autovacuum using a
BufferAccessStrategy which, were they not to use this strategy,
could perhaps have been avoided if a clean shared buffer was
available
- number of fsyncs done by a backend which could have been done by the
checkpointer if the sync queue had not been full

I wonder if leaving buffers_alloc in pg_stat_bgwriter makes sense after
this? I'm tempted to move that to pg_stat_buffers or such...

I've gone ahead and moved buffers_alloc out of pg_stat_bgwriter and into
pg_stat_buffer_actions (I've renamed it from pg_stat_buffers_written).

I'm not quite convinced by having separate columns for checkpointer,
bgwriter, etc. That doesn't seem to scale all that well. What if we
instead made it a view that has one row for each BackendType?

I've changed the view to have one row for each backend type for which we
would like to report stats and one column for each buffer action type.

To make the code easier to write, I record buffer actions for all
backend types -- even if we don't have any buffer actions we care about
for that backend type. I thought it was okay because when I actually
aggregate the counters across backends, I only do so for the backend
types we care about -- thus there shouldn't be much accessing of shared
memory by multiple different processes.

Also, I copy-pasted most of the code in pg_stat_get_buffer_actions() to
set up the result tuplestore from pg_stat_get_activity() without totally
understanding all the parts of it, so I'm not sure if all of it is
required here.

In implementing this counting per backend, it is easy for all types of
backends to keep track of the number of writes, extends, fsyncs, and
strategy writes they are doing. So, as recommended upthread, I have
added columns in the view for the number of writes for checkpointer and
bgwriter and others. Thus, this view becomes more than just stats on
"avoidable I/O done by backends".

So, my question is, does it make sense to track all extends -- those to
extend the fsm and visimap and when making a new relation or index? Is
that information useful? If so, is it different than the extends done
through shared buffers? Should it be tracked separately?

I don't fully understand what you mean with "extends done through shared
buffers"?

By "extends done through shared buffers", I just mean when an extend of
a relation is done and the data that will be written to the new block is
written into a shared buffer (as opposed to a local one or local memory
or a strategy buffer).

Random note:
I added a length member to the BackendType enum (BACKEND_NUM_TYPES),
which led to this compiler warning:

miscinit.c: In function ‘GetBackendTypeDesc’:
miscinit.c:236:2: warning: enumeration value ‘BACKEND_NUM_TYPES’ not
handled in switch [-Wswitch]
236 | switch (backendType)
| ^~~~~~

I tried using pg_attribute_unused() for BACKEND_NUM_TYPES, but it
didn't seem to have the desired effect. As such, I just threw a case
into GetBackendTypeDesc() which does nothing (as opposed to erroring
out); since backendDesc is already initialized to "unknown process
type", erroring out doesn't seem to be expected.

- Melanie

Attachments:

v2-0001-Add-system-view-tracking-shared-buffer-actions.patch (+341 −84)
#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Melanie Plageman (#7)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On 2021-Apr-12, Melanie Plageman wrote:

As for the way I have recorded strategy writes -- it is quite inelegant,
but, I wanted to make sure that I only counted a strategy write as one
in which the backend wrote out the dirty buffer from its strategy ring
but did not check if there was any clean buffer in shared buffers more
generally (so, it is *potentially* an avoidable write). I'm not sure if
this distinction is useful to anyone. I haven't done enough with
BufferAccessStrategies to know what I'd want to know about them when
developing or using Postgres. However, if I don't need to be so careful,
it will make the code much simpler (though, I'm sure I can improve the
code regardless).

I was bitten last year by REFRESH MATERIALIZED VIEW counting its writes
via buffers_backend, and I was very surprised/confused about it. So it
seems definitely worthwhile to count writes via strategy separately.
For a DBA tuning the server configuration it is very useful.

The main thing is to *not* let these writes end up in regular
buffers_backend (or whatever you call these now). I didn't read your
patch, but the way you have described it seems okay to me.

--
Álvaro Herrera 39°49'30"S 73°17'W

#11Melanie Plageman
melanieplageman@gmail.com
In reply to: Alvaro Herrera (#10)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Fri, Jun 4, 2021 at 5:52 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2021-Apr-12, Melanie Plageman wrote:

As for the way I have recorded strategy writes -- it is quite inelegant,
but, I wanted to make sure that I only counted a strategy write as one
in which the backend wrote out the dirty buffer from its strategy ring
but did not check if there was any clean buffer in shared buffers more
generally (so, it is *potentially* an avoidable write). I'm not sure if
this distinction is useful to anyone. I haven't done enough with
BufferAccessStrategies to know what I'd want to know about them when
developing or using Postgres. However, if I don't need to be so careful,
it will make the code much simpler (though, I'm sure I can improve the
code regardless).

I was bitten last year by REFRESH MATERIALIZED VIEW counting its writes
via buffers_backend, and I was very surprised/confused about it. So it
seems definitely worthwhile to count writes via strategy separately.
For a DBA tuning the server configuration it is very useful.

The main thing is to *not* let these writes end up in regular
buffers_backend (or whatever you call these now). I didn't read your
patch, but the way you have described it seems okay to me.

Thanks for the feedback!

I agree it makes sense to count strategy writes separately.

I thought about this some more, and I don't know if it makes sense to
only count "avoidable" strategy writes.

This would mean that a backend writing out a buffer from the strategy
ring when no clean shared buffers (as well as no clean strategy buffers)
are available would not count that write as a strategy write (even
though it is writing out a buffer from its strategy ring). But, it
obviously doesn't make sense to count it as a regular buffer being
written out. So, I plan to change this code.

On another note, I've updated the patch with more correct concurrency
control mechanisms (it had some data races and other problems
before). Now, I am using atomics for the buffer action counters, though
the code includes several #TODO questions around the correctness of what
I have now too.

I also wrapped the buffer action types in a struct to make them easier
to work with.

The most substantial missing piece of the patch right now is persisting
the data across reboots.

The two places in the code I can see to persist the buffer action stats
data are:
1) using the stats collector code (like in
pgstat_read/write_statsfiles()
2) using a before_shmem_exit() hook which writes the data structure to a
file and then read from it when making the shared memory array initially

It feels a bit weird to me to wedge the buffer actions stats into the
stats collector code--since the stats collector isn't receiving and
aggregating the buffer action stats.

Also, I'm unsure how writing the buffer action stats out in
pgstat_write_statsfiles() will work, since I think that backends can
update their buffer action stats after we would have already persisted
the data from the BufferActionStatsArray -- causing us to lose those
updates.

And, I don't think I can use pgstat_read_statsfiles() since the
BufferActionStatsArray should have the data from the file as soon as the
view containing the buffer action stats can be queried. Thus, it seems
like I would need to read the file while initializing the array in
CreateBufferActionStatsCounters().

I am registering the patch for September commitfest but plan to update
the stats persistence before then (and docs, etc).

-- Melanie

Attachments:

v3-0001-Add-system-view-tracking-shared-buffer-actions.patch (+337 −84)
#12Andres Freund
andres@anarazel.de
In reply to: Melanie Plageman (#11)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hi,

On 2021-08-02 18:25:56 -0400, Melanie Plageman wrote:

Thanks for the feedback!

I agree it makes sense to count strategy writes separately.

I thought about this some more, and I don't know if it makes sense to
only count "avoidable" strategy writes.

This would mean that a backend writing out a buffer from the strategy
ring when no clean shared buffers (as well as no clean strategy buffers)
are available would not count that write as a strategy write (even
though it is writing out a buffer from its strategy ring). But, it
obviously doesn't make sense to count it as a regular buffer being
written out. So, I plan to change this code.

What do you mean with "no clean shared buffers ... are available"?

The most substantial missing piece of the patch right now is persisting
the data across reboots.

The two places in the code I can see to persist the buffer action stats
data are:
1) using the stats collector code (like in
pgstat_read/write_statsfiles()
2) using a before_shmem_exit() hook which writes the data structure to a
file and then read from it when making the shared memory array initially

I think it's pretty clear that we should go for 1. Having two mechanisms for
persisting stats data is a bad idea.

Also, I'm unsure how writing the buffer action stats out in
pgstat_write_statsfiles() will work, since I think that backends can
update their buffer action stats after we would have already persisted
the data from the BufferActionStatsArray -- causing us to lose those
updates.

I was thinking it'd work differently. Whenever a connection ends, it reports
its data up to pgstats.c (otherwise we'd lose those stats). By the time
shutdown happens, they all need to have already reported their stats - so
we don't need to do anything to get the data to pgstats.c during shutdown
time.

And, I don't think I can use pgstat_read_statsfiles() since the
BufferActionStatsArray should have the data from the file as soon as the
view containing the buffer action stats can be queried. Thus, it seems
like I would need to read the file while initializing the array in
CreateBufferActionStatsCounters().

Why would backends need to read that data back?

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..96cac0a74e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1067,9 +1067,6 @@ CREATE VIEW pg_stat_bgwriter AS
pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint,
pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean,
pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean,
-        pg_stat_get_buf_written_backend() AS buffers_backend,
-        pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
-        pg_stat_get_buf_alloc() AS buffers_alloc,
pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;

Material for a separate patch, not this. But if we're going to break
monitoring queries anyway, I think we should consider also renaming
maxwritten_clean (and perhaps a few others), because nobody understands what
that is supposed to mean.

@@ -1089,10 +1077,6 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType type)

LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);

-	/* Count all backend writes regardless of if they fit in the queue */
-	if (!AmBackgroundWriterProcess())
-		CheckpointerShmem->num_backend_writes++;
-
/*
* If the checkpointer isn't running or the request queue is full, the
* backend will have to perform its own fsync request.  But before forcing
@@ -1106,8 +1090,10 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType type)
* Count the subset of writes where backends have to do their own
* fsync
*/
+		/* TODO: should we count fsyncs for all types of procs? */
if (!AmBackgroundWriterProcess())
-			CheckpointerShmem->num_backend_fsync++;
+			pgstat_increment_buffer_action(BA_Fsync);
+

Yes, I think that'd make sense. Now that we can disambiguate the different
types of syncs between procs, I don't see a point of having a process-type
filter here. We just lose data...

/* don't set checksum for all-zero page */
@@ -1229,11 +1234,60 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (XLogNeedsFlush(lsn) &&
StrategyRejectBuffer(strategy, buf))
{
+						/*
+						 * Unset the strat write flag, as we will not be writing
+						 * this particular buffer from our ring out and may end
+						 * up having to find a buffer from main shared buffers,
+						 * which, if it is dirty, we may have to write out, which
+						 * could have been prevented by checkpointing and background
+						 * writing
+						 */
+						StrategyUnChooseBufferFromRing(strategy);
+
/* Drop lock/pin and loop around for another buffer */
LWLockRelease(BufferDescriptorGetContentLock(buf));
UnpinBuffer(buf, true);
continue;
}

Could we combine this with StrategyRejectBuffer()? It seems a bit wasteful to
have two function calls into freelist.c when the second happens exactly when
the first returns true?

+
+					/*
+					 * TODO: there is certainly a better way to write this
+					 * logic
+					 */
+
+					/*
+					 * The dirty buffer that will be written out was selected
+					 * from the ring and we did not bother checking the
+					 * freelist or doing a clock sweep to look for a clean
+					 * buffer to use, thus, this write will be counted as a
+					 * strategy write -- one that may be unnecessary without a
+					 * strategy
+					 */
+					if (StrategyIsBufferFromRing(strategy))
+					{
+						pgstat_increment_buffer_action(BA_Write_Strat);
+					}
+
+						/*
+						 * If the dirty buffer was one we grabbed from the
+						 * freelist or through a clock sweep, it could have been
+						 * written out by bgwriter or checkpointer, thus, we will
+						 * count it as a regular write
+						 */
+					else
+						pgstat_increment_buffer_action(BA_Write);

It seems this would be better solved by having a "bool *from_ring" or
GetBufferSource* parameter to StrategyGetBuffer().

@@ -2895,6 +2948,20 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/*
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
+
+	/*
+	 * TODO: consider that if we did not need to distinguish between a buffer
+	 * flushed that was grabbed from the ring buffer and written out as part
+	 * of a strategy which was not from main Shared Buffers (and thus
+	 * preventable by bgwriter or checkpointer), then we could move all calls
+	 * to pgstat_increment_buffer_action() here except for the one for
+	 * extends, which would remain in ReadBuffer_common() before smgrextend()
+	 * (unless we decide to start counting other extends). That includes the
+	 * call to count buffers written by bgwriter and checkpointer which go
+	 * through FlushBuffer() but not BufferAlloc(). That would make it
+	 * simpler. Perhaps instead we can find somewhere else to indicate that
+	 * the buffer is from the ring of buffers to reuse.
+	 */
smgrwrite(reln,
buf->tag.forkNum,
buf->tag.blockNum,

Can we just add a parameter to FlushBuffer indicating what the source of the
write is?

@@ -247,7 +257,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
* the rate of buffer consumption.  Note that buffers recycled by a
* strategy object are intentionally not counted here.
*/
-	pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);
+	pgstat_increment_buffer_action(BA_Alloc);

/*
* First check, without acquiring the lock, whether there's buffers in the

@@ -411,11 +421,6 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
*/
*complete_passes += nextVictimBuffer / NBuffers;
}
-
-	if (num_buf_alloc)
-	{
-		*num_buf_alloc = pg_atomic_exchange_u32(&StrategyControl->numBufferAllocs, 0);
-	}
SpinLockRelease(&StrategyControl->buffer_strategy_lock);
return result;
}

Hm. Isn't bgwriter using the *num_buf_alloc value to pace its activity? I
suspect this patch shouldn't get rid of numBufferAllocs at the same time as
overhauling the stats stuff. Perhaps we don't need both - but it's not obvious
that that's the case / how we can make that work.

+void
+pgstat_increment_buffer_action(BufferActionType ba_type)
+{
+	volatile PgBackendStatus *beentry   = MyBEEntry;
+
+	if (!beentry || !pgstat_track_activities)
+		return;
+
+	if (ba_type == BA_Alloc)
+		pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.allocs, 1);
+	else if (ba_type == BA_Extend)
+		pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.extends, 1);
+	else if (ba_type == BA_Fsync)
+		pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.fsyncs, 1);
+	else if (ba_type == BA_Write)
+		pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.writes, 1);
+	else if (ba_type == BA_Write_Strat)
+		pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.writes_strat, 1);
+}

I don't think we want to use atomic increments here - they're *slow*. And
there only ever can be a single writer to a backend's stats. So just doing
something like
pg_atomic_write_u64(&var, pg_atomic_read_u64(&var) + 1)
should do the trick.

+/*
+ * Called for a single backend at the time of death to persist its I/O stats
+ */
+void
+pgstat_record_dead_backend_buffer_actions(void)
+{
+	volatile PgBackendBufferActionStats *ba_stats;
+	volatile	PgBackendStatus *beentry = MyBEEntry;
+
+	if (beentry->st_procpid != 0)
+		return;
+
+	// TODO: is this correct? could there be a data race? do I need a lock?
+	ba_stats = &BufferActionStatsArray[beentry->st_backendType];
+	pg_atomic_add_fetch_u64(&ba_stats->allocs, pg_atomic_read_u64(&beentry->buffer_action_stats.allocs));
+	pg_atomic_add_fetch_u64(&ba_stats->extends, pg_atomic_read_u64(&beentry->buffer_action_stats.extends));
+	pg_atomic_add_fetch_u64(&ba_stats->fsyncs, pg_atomic_read_u64(&beentry->buffer_action_stats.fsyncs));
+	pg_atomic_add_fetch_u64(&ba_stats->writes, pg_atomic_read_u64(&beentry->buffer_action_stats.writes));
+	pg_atomic_add_fetch_u64(&ba_stats->writes_strat, pg_atomic_read_u64(&beentry->buffer_action_stats.writes_strat));
+}

I don't see a race, FWIW.

This is where I propose that we instead report the values up to the stats
collector, instead of having a separate array that we need to persist

+/*
+ * Fill the provided values array with the accumulated counts of buffer actions
+ * taken by all backends of type backend_type (input parameter), both alive and
+ * dead. This is currently only used by pg_stat_get_buffer_actions() to create
+ * the rows in the pg_stat_buffer_actions system view.
+ */
+void
+pgstat_recount_all_buffer_actions(BackendType backend_type, Datum *values)
+{
+	int			i;
+	volatile PgBackendStatus *beentry;
+
+	/*
+	 * Add stats from all exited backends
+	 */
+	values[BA_Alloc] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].allocs);
+	values[BA_Extend] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].extends);
+	values[BA_Fsync] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].fsyncs);
+	values[BA_Write] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].writes);
+	values[BA_Write_Strat] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].writes_strat);
+
+	/*
+	 * Loop through all live backends and count their buffer actions
+	 */
+	// TODO: see note in pg_stat_get_buffer_actions() about inefficiency of this method
+
+	beentry = BackendStatusArray;
+	for (i = 1; i <= MaxBackends; i++)
+	{
+		/* Don't count dead backends. They should already be counted */
+		if (beentry->st_procpid == 0)
+			continue;
+		if (beentry->st_backendType != backend_type)
+			continue;
+
+		values[BA_Alloc] += pg_atomic_read_u64(&beentry->buffer_action_stats.allocs);
+		values[BA_Extend] += pg_atomic_read_u64(&beentry->buffer_action_stats.extends);
+		values[BA_Fsync] += pg_atomic_read_u64(&beentry->buffer_action_stats.fsyncs);
+		values[BA_Write] += pg_atomic_read_u64(&beentry->buffer_action_stats.writes);
+		values[BA_Write_Strat] += pg_atomic_read_u64(&beentry->buffer_action_stats.writes_strat);
+
+		beentry++;
+	}
+}

It seems to make a bit more sense to have this sum up the stats for all
backend types at once.

+		/*
+		 * Currently, the only supported backend types for stats are the following.
+		 * If this were to change, pg_proc.dat would need to be changed as well
+		 * to reflect the new expected number of rows.
+		 */
+		Datum values[BUFFER_ACTION_NUM_TYPES];
+		bool nulls[BUFFER_ACTION_NUM_TYPES];

Ah ;)

Greetings,

Andres Freund

#13Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#12)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Tue, Aug 3, 2021 at 2:13 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-08-02 18:25:56 -0400, Melanie Plageman wrote:

Thanks for the feedback!

I agree it makes sense to count strategy writes separately.

I thought about this some more, and I don't know if it makes sense to
only count "avoidable" strategy writes.

This would mean that a backend writing out a buffer from the strategy
ring when no clean shared buffers (as well as no clean strategy buffers)
are available would not count that write as a strategy write (even
though it is writing out a buffer from its strategy ring). But, it
obviously doesn't make sense to count it as a regular buffer being
written out. So, I plan to change this code.

What do you mean with "no clean shared buffers ... are available"?

I think I was talking about the scenario in which a backend using a
strategy does not find a clean buffer in the strategy ring and goes to
look in the freelist for a clean shared buffer and doesn't find one.

I was probably talking in circles up there. I think the current
patch counts the right writes in the right way, though.

The most substantial missing piece of the patch right now is persisting
the data across reboots.

The two places in the code I can see to persist the buffer action stats
data are:
1) using the stats collector code (like in
pgstat_read/write_statsfiles()
2) using a before_shmem_exit() hook which writes the data structure to a
file and then read from it when making the shared memory array initially

I think it's pretty clear that we should go for 1. Having two mechanisms for
persisting stats data is a bad idea.

New version uses the stats collector.

Also, I'm unsure how writing the buffer action stats out in
pgstat_write_statsfiles() will work, since I think that backends can
update their buffer action stats after we would have already persisted
the data from the BufferActionStatsArray -- causing us to lose those
updates.

I was thinking it'd work differently. Whenever a connection ends, it reports
its data up to pgstats.c (otherwise we'd lose those stats). By the time
shutdown happens, they all need to have already reported their stats - so
we don't need to do anything to get the data to pgstats.c during shutdown
time.

When you say "whenever a connection ends", what part of the code are you
referring to specifically?

Also, when you say "shutdown", do you mean a backend shutting down or
all backends shutting down (including postmaster) -- like pg_ctl stop?

And, I don't think I can use pgstat_read_statsfiles() since the
BufferActionStatsArray should have the data from the file as soon as the
view containing the buffer action stats can be queried. Thus, it seems
like I would need to read the file while initializing the array in
CreateBufferActionStatsCounters().

Why would backends need to read that data back?

To get totals across restarts, but, doesn't matter now that I am using
stats collector.

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..96cac0a74e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1067,9 +1067,6 @@ CREATE VIEW pg_stat_bgwriter AS
pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint,
pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean,
pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean,
-        pg_stat_get_buf_written_backend() AS buffers_backend,
-        pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
-        pg_stat_get_buf_alloc() AS buffers_alloc,
pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;

Material for a separate patch, not this. But if we're going to break
monitoring queries anyway, I think we should consider also renaming
maxwritten_clean (and perhaps a few others), because nobody understands what
that is supposed to mean.

Do you mean I shouldn't remove anything from the pg_stat_bgwriter view?

@@ -1089,10 +1077,6 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType type)

LWLockAcquire(CheckpointerCommLock, LW_EXCLUSIVE);

-     /* Count all backend writes regardless of if they fit in the queue */
-     if (!AmBackgroundWriterProcess())
-             CheckpointerShmem->num_backend_writes++;
-
/*
* If the checkpointer isn't running or the request queue is full, the
* backend will have to perform its own fsync request.  But before forcing
@@ -1106,8 +1090,10 @@ ForwardSyncRequest(const FileTag *ftag, SyncRequestType type)
* Count the subset of writes where backends have to do their own
* fsync
*/
+             /* TODO: should we count fsyncs for all types of procs? */
if (!AmBackgroundWriterProcess())
-                     CheckpointerShmem->num_backend_fsync++;
+                     pgstat_increment_buffer_action(BA_Fsync);
+

Yes, I think that'd make sense. Now that we can disambiguate the different
types of syncs between procs, I don't see a point of having a process-type
filter here. We just lose data...

Done

/* don't set checksum for all-zero page */
@@ -1229,11 +1234,60 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
if (XLogNeedsFlush(lsn) &&
StrategyRejectBuffer(strategy, buf))
{
+                                             /*
+                                              * Unset the strat write flag, as we will not be writing
+                                              * this particular buffer from our ring out and may end
+                                              * up having to find a buffer from main shared buffers,
+                                              * which, if it is dirty, we may have to write out, which
+                                              * could have been prevented by checkpointing and background
+                                              * writing
+                                              */
+                                             StrategyUnChooseBufferFromRing(strategy);
+
/* Drop lock/pin and loop around for another buffer */
LWLockRelease(BufferDescriptorGetContentLock(buf));
UnpinBuffer(buf, true);
continue;
}

Could we combine this with StrategyRejectBuffer()? It seems a bit wasteful to
have two function calls into freelist.c when the second happens exactly when
the first returns true?

+
+                                     /*
+                                      * TODO: there is certainly a better way to write this
+                                      * logic
+                                      */
+
+                                     /*
+                                      * The dirty buffer that will be written out was selected
+                                      * from the ring and we did not bother checking the
+                                      * freelist or doing a clock sweep to look for a clean
+                                      * buffer to use, thus, this write will be counted as a
+                                      * strategy write -- one that may be unnecessary without a
+                                      * strategy
+                                      */
+                                     if (StrategyIsBufferFromRing(strategy))
+                                     {
+                                             pgstat_increment_buffer_action(BA_Write_Strat);
+                                     }
+
+                                             /*
+                                              * If the dirty buffer was one we grabbed from the
+                                              * freelist or through a clock sweep, it could have been
+                                              * written out by bgwriter or checkpointer, thus, we will
+                                              * count it as a regular write
+                                              */
+                                     else
+                                             pgstat_increment_buffer_action(BA_Write);

It seems this would be better solved by having a "bool *from_ring" or
GetBufferSource* parameter to StrategyGetBuffer().
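For concreteness, the out-parameter approach could look something like the following self-contained sketch. The names (strategy_get_buffer(), count_dirty_victim_write(), buffer_action_counts) are hypothetical stand-ins for the patch's StrategyGetBuffer() and pgstat_increment_buffer_action(), not committed PostgreSQL code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the patch's BufferActionType counters. */
enum { BA_WRITE, BA_WRITE_STRAT, BA_NUM_TYPES };

static uint64_t buffer_action_counts[BA_NUM_TYPES];

/*
 * Stand-in for StrategyGetBuffer(): reports through *from_ring whether the
 * victim buffer came from the strategy ring, so the caller needs no second
 * call into freelist.c.  Here we simply pretend every other victim is a
 * ring buffer.
 */
static int
strategy_get_buffer(bool *from_ring)
{
    static int calls = 0;

    *from_ring = (calls % 2 == 0);
    return ++calls;             /* fake buffer id */
}

/* Caller attributes the write based on where the victim came from. */
static void
count_dirty_victim_write(bool from_ring)
{
    buffer_action_counts[from_ring ? BA_WRITE_STRAT : BA_WRITE]++;
}
```

The point is that the source of the buffer is decided once, inside the freelist code, and merely reported outward.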

I've addressed both of these in the new version.

@@ -2895,6 +2948,20 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/*
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
+
+     /*
+      * TODO: consider that if we did not need to distinguish between a buffer
+      * flushed that was grabbed from the ring buffer and written out as part
+      * of a strategy which was not from main Shared Buffers (and thus
+      * preventable by bgwriter or checkpointer), then we could move all calls
+      * to pgstat_increment_buffer_action() here except for the one for
+      * extends, which would remain in ReadBuffer_common() before smgrextend()
+      * (unless we decide to start counting other extends). That includes the
+      * call to count buffers written by bgwriter and checkpointer which go
+      * through FlushBuffer() but not BufferAlloc(). That would make it
+      * simpler. Perhaps instead we can find somewhere else to indicate that
+      * the buffer is from the ring of buffers to reuse.
+      */
smgrwrite(reln,
buf->tag.forkNum,
buf->tag.blockNum,

Can we just add a parameter to FlushBuffer indicating what the source of the
write is?

I just noticed this comment now, so I'll address that in the next
version. I rebased today and noticed merge conflicts, so, it looks like
v5 will be on its way soon anyway.

@@ -247,7 +257,7 @@ StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
* the rate of buffer consumption.  Note that buffers recycled by a
* strategy object are intentionally not counted here.
*/
-     pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);
+     pgstat_increment_buffer_action(BA_Alloc);

/*
* First check, without acquiring the lock, whether there's buffers in the

@@ -411,11 +421,6 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
*/
*complete_passes += nextVictimBuffer / NBuffers;
}
-
-     if (num_buf_alloc)
-     {
-             *num_buf_alloc = pg_atomic_exchange_u32(&StrategyControl->numBufferAllocs, 0);
-     }
SpinLockRelease(&StrategyControl->buffer_strategy_lock);
return result;
}

Hm. Isn't bgwriter using the *num_buf_alloc value to pace its activity? I
suspect this patch shouldn't get rid of numBufferAllocs at the same time as
overhauling the stats stuff. Perhaps we don't need both - but it's not obvious
that that's the case / how we can make that work.

I initially meant to add a function to the patch like
pg_stat_get_buffer_actions(), but one that took a BufferActionType and a
BackendType as parameters and returned a single value: the count of that
buffer action for that backend type.

Let's say I defined it like this:
uint64
pg_stat_get_backend_buffer_actions_stats(BackendType backend_type,
BufferActionType ba_type)

Then, I intended to use that in StrategySyncStart() to set num_buf_alloc:
compute val = pg_stat_get_backend_buffer_actions_stats(B_BG_WRITER,
BA_Alloc) - StrategyControl->numBufferAllocs, return val as num_buf_alloc,
and then add val to StrategyControl->numBufferAllocs.

I think that would have the same behavior as the current code, though I'm
not sure whether the performance would end up better or worse. It wouldn't
be atomically incrementing StrategyControl->numBufferAllocs, but it would
do a few more atomic operations in StrategySyncStart() than before. Also,
we would do all the work done by pg_stat_get_buffer_actions() in
StrategySyncStart().

But that is called comparatively infrequently, right?
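To make the arithmetic above concrete, here is a toy model of the delta scheme. total_allocs() stands in for the proposed pg_stat_get_backend_buffer_actions_stats(..., BA_Alloc), and allocs_at_last_sync models the running total that would be kept in StrategyControl; none of this is committed PostgreSQL code:

```c
#include <assert.h>
#include <stdint.h>

/* Cumulative allocation count, as the stats machinery would report it. */
static uint64_t cumulative_allocs;

/* Models the running total kept in StrategyControl. */
static uint64_t allocs_at_last_sync;

static void
note_alloc(void)
{
    cumulative_allocs++;
}

static uint64_t
total_allocs(void)
{
    return cumulative_allocs;
}

/*
 * Models StrategySyncStart()'s *num_buf_alloc output: the number of
 * allocations since the previous call, derived from the cumulative stats
 * counter instead of an atomically swapped dedicated counter.
 */
static uint64_t
strategy_sync_start_delta(void)
{
    uint64_t total = total_allocs();
    uint64_t delta = total - allocs_at_last_sync;

    allocs_at_last_sync = total;
    return delta;
}
```

Each call hands back only the increment since the last call, which is exactly what the bgwriter pacing logic consumes today via numBufferAllocs.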

+void
+pgstat_increment_buffer_action(BufferActionType ba_type)
+{
+     volatile PgBackendStatus *beentry   = MyBEEntry;
+
+     if (!beentry || !pgstat_track_activities)
+             return;
+
+     if (ba_type == BA_Alloc)
+             pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.allocs, 1);
+     else if (ba_type == BA_Extend)
+             pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.extends, 1);
+     else if (ba_type == BA_Fsync)
+             pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.fsyncs, 1);
+     else if (ba_type == BA_Write)
+             pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.writes, 1);
+     else if (ba_type == BA_Write_Strat)
+             pg_atomic_add_fetch_u64(&beentry->buffer_action_stats.writes_strat, 1);
+}

I don't think we want to use atomic increments here - they're *slow*. And
there can only ever be a single writer to a backend's stats. So just doing
something like
pg_atomic_write_u64(&var, pg_atomic_read_u64(&var) + 1)
should do the trick.
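As a minimal model of that suggestion (plain C stand-ins for pg_atomic_read_u64/pg_atomic_write_u64; the names here are illustrative, and the pattern is safe only because each backend is the sole writer of its own counters):

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for pg_atomic_u64; real code would use the PostgreSQL type. */
typedef struct { uint64_t value; } model_atomic_u64;

static inline uint64_t
model_atomic_read_u64(const model_atomic_u64 *v)
{
    return v->value;
}

static inline void
model_atomic_write_u64(model_atomic_u64 *v, uint64_t n)
{
    v->value = n;
}

/*
 * The suggested pattern: an unlocked read-modify-write instead of an atomic
 * fetch-add.  Correct only under the invariant that exactly one process ever
 * writes this counter; concurrent readers may see a slightly stale value,
 * which is acceptable for statistics.
 */
static inline void
single_writer_increment(model_atomic_u64 *var)
{
    model_atomic_write_u64(var, model_atomic_read_u64(var) + 1);
}
```

The atomic type is still needed so that readers in other processes get tear-free 64-bit loads; only the expensive locked add is avoided.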

Done

+/*
+ * Called for a single backend at the time of death to persist its I/O stats
+ */
+void
+pgstat_record_dead_backend_buffer_actions(void)
+{
+     volatile PgBackendBufferActionStats *ba_stats;
+     volatile        PgBackendStatus *beentry = MyBEEntry;
+
+     if (beentry->st_procpid != 0)
+             return;
+
+     // TODO: is this correct? could there be a data race? do I need a lock?
+     ba_stats = &BufferActionStatsArray[beentry->st_backendType];
+     pg_atomic_add_fetch_u64(&ba_stats->allocs, pg_atomic_read_u64(&beentry->buffer_action_stats.allocs));
+     pg_atomic_add_fetch_u64(&ba_stats->extends, pg_atomic_read_u64(&beentry->buffer_action_stats.extends));
+     pg_atomic_add_fetch_u64(&ba_stats->fsyncs, pg_atomic_read_u64(&beentry->buffer_action_stats.fsyncs));
+     pg_atomic_add_fetch_u64(&ba_stats->writes, pg_atomic_read_u64(&beentry->buffer_action_stats.writes));
+     pg_atomic_add_fetch_u64(&ba_stats->writes_strat, pg_atomic_read_u64(&beentry->buffer_action_stats.writes_strat));
+}

I don't see a race, FWIW.

This is where I propose that we report the values up to the stats
collector instead of keeping a separate array that we need to persist

Changed

+/*
+ * Fill the provided values array with the accumulated counts of buffer actions
+ * taken by all backends of type backend_type (input parameter), both alive and
+ * dead. This is currently only used by pg_stat_get_buffer_actions() to create
+ * the rows in the pg_stat_buffer_actions system view.
+ */
+void
+pgstat_recount_all_buffer_actions(BackendType backend_type, Datum *values)
+{
+     int                     i;
+     volatile PgBackendStatus *beentry;
+
+     /*
+      * Add stats from all exited backends
+      */
+     values[BA_Alloc] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].allocs);
+     values[BA_Extend] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].extends);
+     values[BA_Fsync] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].fsyncs);
+     values[BA_Write] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].writes);
+     values[BA_Write_Strat] = pg_atomic_read_u64(&BufferActionStatsArray[backend_type].writes_strat);
+
+     /*
+      * Loop through all live backends and count their buffer actions
+      */
+     // TODO: see note in pg_stat_get_buffer_actions() about inefficiency of this method
+
+     beentry = BackendStatusArray;
+     for (i = 1; i <= MaxBackends; i++)
+     {
+             /* Don't count dead backends. They should already be counted */
+             if (beentry->st_procpid == 0)
+                     continue;
+             if (beentry->st_backendType != backend_type)
+                     continue;
+
+             values[BA_Alloc] += pg_atomic_read_u64(&beentry->buffer_action_stats.allocs);
+             values[BA_Extend] += pg_atomic_read_u64(&beentry->buffer_action_stats.extends);
+             values[BA_Fsync] += pg_atomic_read_u64(&beentry->buffer_action_stats.fsyncs);
+             values[BA_Write] += pg_atomic_read_u64(&beentry->buffer_action_stats.writes);
+             values[BA_Write_Strat] += pg_atomic_read_u64(&beentry->buffer_action_stats.writes_strat);
+
+             beentry++;
+     }
+}

It seems to make a bit more sense to have this sum up the stats for all
backend types at once.

Changed.
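Summing for all backend types in one pass over the live entries might look like the following sketch (types and array sizes are illustrative, not PostgreSQL's):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_BACKEND_TYPES 3     /* illustrative; PostgreSQL has more */
#define NUM_ACTIONS       5     /* alloc, extend, fsync, write, write_strat */

typedef struct
{
    int         backend_type;
    int         procpid;        /* 0 => slot unused or backend exited */
    uint64_t    actions[NUM_ACTIONS];
} ModelBackendEntry;

/*
 * One pass over the live entries, accumulating into a per-backend-type
 * matrix pre-seeded with the totals of already-exited backends.
 */
static void
sum_all_buffer_actions(const ModelBackendEntry *entries, int nentries,
                       uint64_t exited[][NUM_ACTIONS],
                       uint64_t out[][NUM_ACTIONS])
{
    for (int t = 0; t < NUM_BACKEND_TYPES; t++)
        memcpy(out[t], exited[t], sizeof(uint64_t) * NUM_ACTIONS);

    for (int i = 0; i < nentries; i++)
    {
        if (entries[i].procpid == 0)    /* dead: already counted in exited */
            continue;
        for (int a = 0; a < NUM_ACTIONS; a++)
            out[entries[i].backend_type][a] += entries[i].actions[a];
    }
}
```

One walk of the backend status array then fills every row of the view, instead of rescanning the array once per backend type.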

+             /*
+              * Currently, the only supported backend types for stats are the following.
+              * If this were to change, pg_proc.dat would need to be changed as well
+              * to reflect the new expected number of rows.
+              */
+             Datum values[BUFFER_ACTION_NUM_TYPES];
+             bool nulls[BUFFER_ACTION_NUM_TYPES];

Ah ;)

I just went ahead and made a row for each backend type.

- Melanie

Attachments:

v4-0001-Add-system-view-tracking-shared-buffer-actions.patch (+297 -83)
#14 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#13)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Wed, Aug 11, 2021 at 4:11 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

On Tue, Aug 3, 2021 at 2:13 PM Andres Freund <andres@anarazel.de> wrote:

@@ -2895,6 +2948,20 @@ FlushBuffer(BufferDesc *buf, SMgrRelation reln)
/*
* bufToWrite is either the shared buffer or a copy, as appropriate.
*/
+
+     /*
+      * TODO: consider that if we did not need to distinguish between a buffer
+      * flushed that was grabbed from the ring buffer and written out as part
+      * of a strategy which was not from main Shared Buffers (and thus
+      * preventable by bgwriter or checkpointer), then we could move all calls
+      * to pgstat_increment_buffer_action() here except for the one for
+      * extends, which would remain in ReadBuffer_common() before smgrextend()
+      * (unless we decide to start counting other extends). That includes the
+      * call to count buffers written by bgwriter and checkpointer which go
+      * through FlushBuffer() but not BufferAlloc(). That would make it
+      * simpler. Perhaps instead we can find somewhere else to indicate that
+      * the buffer is from the ring of buffers to reuse.
+      */
smgrwrite(reln,
buf->tag.forkNum,
buf->tag.blockNum,

Can we just add a parameter to FlushBuffer indicating what the source of the
write is?

I just noticed this comment now, so I'll address that in the next
version. I rebased today and noticed merge conflicts, so, it looks like
v5 will be on its way soon anyway.

Actually, after moving the code around like you suggested, calling
pgstat_increment_buffer_action() before smgrwrite() in FlushBuffer() and
using a parameter to indicate if it is a strategy write or not would
only save us one other call to pgstat_increment_buffer_action() -- the
one in SyncOneBuffer(). We would end up moving the one in BufferAlloc()
to FlushBuffer() and removing the one in SyncOneBuffer().
Do you think it is still worth it?

Rebased v5 attached.

Attachments:

v5-0001-Add-system-view-tracking-shared-buffer-actions.patch (+293 -83)
#15 Andres Freund
andres@anarazel.de
In reply to: Melanie Plageman (#13)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hi,

On 2021-08-11 16:11:34 -0400, Melanie Plageman wrote:

On Tue, Aug 3, 2021 at 2:13 PM Andres Freund <andres@anarazel.de> wrote:

Also, I'm unsure how writing the buffer action stats out in
pgstat_write_statsfiles() will work, since I think that backends can
update their buffer action stats after we would have already persisted
the data from the BufferActionStatsArray -- causing us to lose those
updates.

I was thinking it'd work differently. Whenever a connection ends, it reports
its data up to pgstats.c (otherwise we'd lose those stats). By the time
shutdown happens, they all need to have already reported their stats - so
we don't need to do anything to get the data to pgstats.c during shutdown
time.

When you say "whenever a connection ends", what part of the code are you
referring to specifically?

pgstat_beshutdown_hook()

Also, when you say "shutdown", do you mean a backend shutting down or
all backends shutting down (including postmaster) -- like pg_ctl stop?

Admittedly our language is very imprecise around this :(. What I meant
is that backends would report their own stats up to the stats collector
when the connection ends (in pgstat_beshutdown_hook()). That means that
when the whole server (pgstat and then postmaster, potentially via
pg_ctl stop) shuts down, all the per-connection stats have already been
reported up to pgstat.

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..96cac0a74e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1067,9 +1067,6 @@ CREATE VIEW pg_stat_bgwriter AS
pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint,
pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean,
pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean,
-        pg_stat_get_buf_written_backend() AS buffers_backend,
-        pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
-        pg_stat_get_buf_alloc() AS buffers_alloc,
pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;

Material for a separate patch, not this. But if we're going to break
monitoring queries anyway, I think we should consider also renaming
maxwritten_clean (and perhaps a few others), because nobody understands what
that is supposed to mean.

Do you mean I shouldn't remove anything from the pg_stat_bgwriter view?

No - I just meant that now that we're breaking pg_stat_bgwriter queries,
we should also rename the columns to be easier to understand. But that
it should be a separate patch / commit...

@@ -411,11 +421,6 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
*/
*complete_passes += nextVictimBuffer / NBuffers;
}
-
-     if (num_buf_alloc)
-     {
-             *num_buf_alloc = pg_atomic_exchange_u32(&StrategyControl->numBufferAllocs, 0);
-     }
SpinLockRelease(&StrategyControl->buffer_strategy_lock);
return result;
}

Hm. Isn't bgwriter using the *num_buf_alloc value to pace its activity? I
suspect this patch shouldn't get rid of numBufferAllocs at the same time as
overhauling the stats stuff. Perhaps we don't need both - but it's not obvious
that that's the case / how we can make that work.

I initially meant to add a function to the patch like
pg_stat_get_buffer_actions(), but one that took a BufferActionType and a
BackendType as parameters and returned a single value: the count of that
buffer action for that backend type.

let's say I defined it like this:
uint64
pg_stat_get_backend_buffer_actions_stats(BackendType backend_type,
BufferActionType ba_type)

Then, I intended to use that in StrategySyncStart() to set num_buf_alloc:
compute val = pg_stat_get_backend_buffer_actions_stats(B_BG_WRITER,
BA_Alloc) - StrategyControl->numBufferAllocs, return val as num_buf_alloc,
and then add val to StrategyControl->numBufferAllocs.

I don't think you could restrict this to B_BG_WRITER? The whole point of
this logic is that bgwriter uses the stats for *all* backends to get the
"usage rate" for buffers, which it then uses to control how many buffers
to clean.

I think that would have the same behavior as the current code, though I'm
not sure whether the performance would end up better or worse. It wouldn't
be atomically incrementing StrategyControl->numBufferAllocs, but it would
do a few more atomic operations in StrategySyncStart() than before. Also,
we would do all the work done by pg_stat_get_buffer_actions() in
StrategySyncStart().

I think it'd be better to separate changing the bgwriter pacing logic
(and thus numBufferAllocs) from changing the stats reporting.

But that is called comparatively infrequently, right?

Depending on the workload, not that rarely. I'm afraid this might be a
bit too expensive. It's possible we can work around that, however.

Greetings,

Andres Freund

#16 Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#15)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Fri, Aug 13, 2021 at 3:08 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-08-11 16:11:34 -0400, Melanie Plageman wrote:

On Tue, Aug 3, 2021 at 2:13 PM Andres Freund <andres@anarazel.de> wrote:

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 55f6e3711d..96cac0a74e 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1067,9 +1067,6 @@ CREATE VIEW pg_stat_bgwriter AS
pg_stat_get_bgwriter_buf_written_checkpoints() AS buffers_checkpoint,
pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean,
pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean,
-        pg_stat_get_buf_written_backend() AS buffers_backend,
-        pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
-        pg_stat_get_buf_alloc() AS buffers_alloc,
pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;

Material for a separate patch, not this. But if we're going to break
monitoring queries anyway, I think we should consider also renaming
maxwritten_clean (and perhaps a few others), because nobody understands what
that is supposed to mean.

Do you mean I shouldn't remove anything from the pg_stat_bgwriter view?

No - I just meant that now that we're breaking pg_stat_bgwriter queries,
we should also rename the columns to be easier to understand. But that
it should be a separate patch / commit...

I separated the removal of some redundant stats from pg_stat_bgwriter
into a different commit but haven't removed or clarified any additional
columns in pg_stat_bgwriter.

@@ -411,11 +421,6 @@ StrategySyncStart(uint32 *complete_passes, uint32 *num_buf_alloc)
*/
*complete_passes += nextVictimBuffer / NBuffers;
}
-
-     if (num_buf_alloc)
-     {
-             *num_buf_alloc = pg_atomic_exchange_u32(&StrategyControl->numBufferAllocs, 0);
-     }
SpinLockRelease(&StrategyControl->buffer_strategy_lock);
return result;
}

Hm. Isn't bgwriter using the *num_buf_alloc value to pace its activity? I
suspect this patch shouldn't get rid of numBufferAllocs at the same time as
overhauling the stats stuff. Perhaps we don't need both - but it's not obvious
that that's the case / how we can make that work.

I initially meant to add a function to the patch like
pg_stat_get_buffer_actions(), but one that took a BufferActionType and a
BackendType as parameters and returned a single value: the count of that
buffer action for that backend type.

let's say I defined it like this:
uint64
pg_stat_get_backend_buffer_actions_stats(BackendType backend_type,
BufferActionType ba_type)

Then, I intended to use that in StrategySyncStart() to set num_buf_alloc:
compute val = pg_stat_get_backend_buffer_actions_stats(B_BG_WRITER,
BA_Alloc) - StrategyControl->numBufferAllocs, return val as num_buf_alloc,
and then add val to StrategyControl->numBufferAllocs.

I don't think you could restrict this to B_BG_WRITER? The whole point of
this logic is that bgwriter uses the stats for *all* backends to get the
"usage rate" for buffers, which it then uses to control how many buffers
to clean.

I think that would have the same behavior as the current code, though I'm
not sure whether the performance would end up better or worse. It wouldn't
be atomically incrementing StrategyControl->numBufferAllocs, but it would
do a few more atomic operations in StrategySyncStart() than before. Also,
we would do all the work done by pg_stat_get_buffer_actions() in
StrategySyncStart().

I think it'd be better to separate changing the bgwriter pacing logic
(and thus numBufferAllocs) from changing the stats reporting.

But that is called comparatively infrequently, right?

Depending on the workload, not that rarely. I'm afraid this might be a
bit too expensive. It's possible we can work around that, however.

I've restored StrategyControl->numBuffersAlloc.

Attached is v6 of the patchset.

I have made several small updates to the patch, including user docs
updates, comment clarifications, various changes related to how
structures are initialized, code simplifications, and small details like
alphabetizing of #includes.

Below are details on the remaining TODOs and open questions for this
patch and why I haven't done them yet:

1) performance testing (initial tests done, but need to do some further
investigation before sharing)

2) stats_reset
Because pg_stat_buffer_actions fields were added to the globalStats
structure, they get reset when the target RESET_BGWRITER is reset.
Depending on whether or not these commits remove columns from the
pg_stat_bgwriter view, I would approach adding stats_reset to
pg_stat_buffer_actions differently. If removing all of pg_stat_bgwriter,
I would just rename the target to apply to pg_stat_buffer_actions. If
not removing all of pg_stat_bgwriter, I would add a new target for
pg_stat_buffer_actions to reset those stats and then either remove them
from globalStats or MemSet() only the relevant parts of the struct in
pgstat_recv_resetsharedcounter().
I haven't done this yet because I want to get input on what should
happen to pg_stat_bgwriter first (all of it goes, all of it stays, some
goes, etc).

3) what to count
Currently, the patch counts allocs, extends, fsyncs and writes of shared
buffers and writes done when using a buffer access strategy. So, it is a
mix of mostly shared buffers and a few non-shared buffers. I am
wondering if it makes sense to also count extends with smgrextend()
other than those using shared buffers--for example when building an
index or when extending the free space map or visibility map. For
fsyncs, the patch does not count checkpointer fsyncs or fsyncs done from
XLogWrite().
On a related note, depending on what the view counts, the name
buffer_actions may or may not be too general.

I also feel like the BackendType B_BACKEND is a bit confusing when we
are tracking buffer actions for different backend types -- this name
makes it seem like other types of backends are not backends.

I'm not sure what the view should track and can see arguments for
excluding certain extends or separating them into another stat. I
haven't made the changes because I am looking for other peoples'
opinions.

4) Adding some sort of protection against regressions when code is added
that adds additional buffer actions but doesn't count them -- more
likely if we are counting all users of smgrextend() but not doing the
counter incrementing there.

I'm not sure how I would even do this, so, that's why I haven't done it.

5) It seems like the code to create a tuplestore used by various stats
functions like pg_stat_get_progress_info(), pg_stat_get_activity, and
pg_stat_get_slru could be refactored into a helper function since it is
quite redundant (maybe returning a ReturnSetInfo).

I haven't done this because I wasn't sure if it was a good idea, and, if
it is, if I should do it in a separate commit.

6) Cleaning up of commit message, running pgindent, and, eventually,
catalog bump (waiting until the patch is done to do this).

7) Additional testing to ensure all codepaths added are hit (one-off
testing, not added to regression test suite). I am waiting to do this
until all of the types of buffer actions that will be done are
finalized.

- Melanie

Attachments:

v6-0002-Remove-superfluous-bgwriter-stats.patch (+0 -104)
v6-0001-Add-system-view-tracking-shared-buffer-actions.patch (+425 -20)
#17 Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#15)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Fri, Aug 13, 2021 at 3:08 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-08-11 16:11:34 -0400, Melanie Plageman wrote:

On Tue, Aug 3, 2021 at 2:13 PM Andres Freund <andres@anarazel.de> wrote:

Also, I'm unsure how writing the buffer action stats out in
pgstat_write_statsfiles() will work, since I think that backends can
update their buffer action stats after we would have already persisted
the data from the BufferActionStatsArray -- causing us to lose those
updates.

I was thinking it'd work differently. Whenever a connection ends, it reports
its data up to pgstats.c (otherwise we'd lose those stats). By the time
shutdown happens, they all need to have already reported their stats - so
we don't need to do anything to get the data to pgstats.c during shutdown
time.

When you say "whenever a connection ends", what part of the code are you
referring to specifically?

pgstat_beshutdown_hook()

Also, when you say "shutdown", do you mean a backend shutting down or
all backends shutting down (including postmaster) -- like pg_ctl stop?

Admittedly our language is very imprecise around this :(. What I meant
is that backends would report their own stats up to the stats collector
when the connection ends (in pgstat_beshutdown_hook()). That means that
when the whole server (pgstat and then postmaster, potentially via
pg_ctl stop) shuts down, all the per-connection stats have already been
reported up to pgstat.

So, I realized that the patch has a problem. I added the code to send
buffer actions stats to the stats collector
(pgstat_send_buffer_actions()) to pgstat_report_stat() and this isn't
getting called when all types of backends exit.

I originally thought to add pgstat_send_buffer_actions() to
pgstat_beshutdown_hook() (as suggested), but, this is called after
pgstat_shutdown_hook(), so, we aren't able to send stats to the stats
collector at that time. (pgstat_shutdown_hook() sets pgstat_is_shutdown
to true and then in pgstat_beshutdown_hook() (called after), if we call
pgstat_send_buffer_actions(), it calls pgstat_send() which calls
pgstat_assert_is_up() which trips when pgstat_is_shutdown is true.)

After calling pgstat_send_buffer_actions() from pgstat_report_stat(), it
seems to miss checkpointer stats entirely. I did find that if I
sprinkled pgstat_send_buffer_actions() around in the various places that
pgstat_send_checkpointer() is called, I could get checkpointer stats
(see attached patch, capture_checkpointer_buffer_actions.patch), but,
that seems a little bit haphazard since pgstat_send_buffer_actions() is
supposed to capture stats for all backend types. Is there somewhere else
I can call it that is exercised by all backend types before
pgstat_shutdown_hook() is called but after they would have finished any
relevant buffer actions?

- Melanie

Attachments:

capture_checkpointer_buffer_actions.patch (+5 -4)
#18 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#17)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

On Wed, Sep 8, 2021 at 9:28 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:

So, I realized that the patch has a problem. I added the code to send
buffer actions stats to the stats collector
(pgstat_send_buffer_actions()) to pgstat_report_stat() and this isn't
getting called when all types of backends exit.

I originally thought to add pgstat_send_buffer_actions() to
pgstat_beshutdown_hook() (as suggested), but, this is called after
pgstat_shutdown_hook(), so, we aren't able to send stats to the stats
collector at that time. (pgstat_shutdown_hook() sets pgstat_is_shutdown
to true and then in pgstat_beshutdown_hook() (called after), if we call
pgstat_send_buffer_actions(), it calls pgstat_send() which calls
pgstat_assert_is_up() which trips when pgstat_is_shutdown is true.)

After calling pgstat_send_buffer_actions() from pgstat_report_stat(), it
seems to miss checkpointer stats entirely. I did find that if I
sprinkled pgstat_send_buffer_actions() around in the various places that
pgstat_send_checkpointer() is called, I could get checkpointer stats
(see attached patch, capture_checkpointer_buffer_actions.patch), but,
that seems a little bit haphazard since pgstat_send_buffer_actions() is
supposed to capture stats for all backend types. Is there somewhere else
I can call it that is exercised by all backend types before
pgstat_shutdown_hook() is called but after they would have finished any
relevant buffer actions?

I realized that putting these additional calls in checkpointer code
without clearing out the PgBackendStatus counters for buffer actions
results in a lot of duplicate stats. I do wonder, however, whether
pgstat_send_buffer_actions() is needed in HandleCheckpointerInterrupts()
before the proc_exit().

It does seem like additional calls to pgstat_send_buffer_actions()
shouldn't be needed since most processes register
pgstat_shutdown_hook(). However, since MyDatabaseId isn't valid for the
auxiliary processes, even though the pgstat_shutdown_hook() is
registered from BaseInit(), pgstat_report_stat() never gets called for
them, so their stats aren't persisted using the current method.

It seems like the best solution to persisting all processes' stats would
be to have all processes register pgstat_shutdown_hook() and to still
call pgstat_report_stat() even when MyDatabaseId is not valid, as long as
the process is not a regular backend (I assume an invalid MyDatabaseId is
only a problem for backends that have had it set to a valid OID at some
point). For the stats that rely on the database OID, perhaps those can be
reported based on whether or not MyDatabaseId is valid from within
pgstat_report_stat().
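A toy model of that reporting scheme, where each process flushes its local counters into the collector's totals from its shutdown hook (names here are illustrative stand-ins for pgstat_beshutdown_hook() and the collector, not PostgreSQL code):

```c
#include <assert.h>
#include <stdint.h>

#define NUM_ACTIONS 5

/* Stands in for the totals kept by the stats collector. */
static uint64_t collector_totals[NUM_ACTIONS];

typedef struct
{
    uint64_t    local[NUM_ACTIONS];     /* per-process counters */
} ModelProcess;

static void
model_count(ModelProcess *proc, int action)
{
    proc->local[action]++;
}

/*
 * Models a pgstat_beshutdown_hook()-style callback: report this process's
 * counters to the collector at exit.  If every process, including those
 * with an invalid MyDatabaseId, registers this hook, the collector already
 * has everything by the time the server itself shuts down.
 */
static void
model_shutdown_hook(ModelProcess *proc)
{
    for (int a = 0; a < NUM_ACTIONS; a++)
    {
        collector_totals[a] += proc->local[a];
        proc->local[a] = 0;     /* guard against double reporting */
    }
}
```

Zeroing the local counters after reporting is what avoids the duplicate-stats problem described above when the hook can fire more than once.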

I also realized that I am not collecting stats from live auxiliary
processes in pg_stat_get_buffer_actions(). I need to change the loop to
for (i = 1; i <= MaxBackends + NUM_AUXPROCTYPES; i++) to actually get
stats from live auxiliary processes when querying the view.

On an unrelated note, I am planning to remove buffers_clean and
buffers_checkpoint from the pg_stat_bgwriter view since those are also
redundant. When I was removing them, I noticed that buffers_checkpoint
and buffers_clean count buffers as having been written even when
FlushBuffer() "does nothing" because someone else wrote out the dirty
buffer before the bgwriter or checkpointer had a chance to do it. This
seems like it would result in an incorrect count. Am I missing
something?

- Melanie

#19 Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#18)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hi,

I've attached the v7 patch set.

Changes from v6:
- removed unnecessary global variable BufferActionsStats
- fixed the loop condition in pg_stat_get_buffer_actions()
- updated some comments
- removed buffers_checkpoint and buffers_clean from pg_stat_bgwriter
view (now pg_stat_bgwriter view is mainly checkpointer statistics,
which isn't great)
- instead of calling pgstat_send_buffer_actions() in
pgstat_report_stat(), I renamed pgstat_send_buffer_actions() to
pgstat_report_buffers() and call it directly from
pgstat_shutdown_hook() for all types of processes (including processes
with invalid MyDatabaseId [like auxiliary processes])

I began changing the code to add the stats reset timestamp to the
pg_stat_buffer_actions view, but, I realized that it will be kind of
distracting to have every row for every backend type have a stats reset
timestamp (since it will be the same timestamp over and over). If,
however, you could reset buffer stats for each backend type
individually, then, I could see having it. Otherwise, we could add a
function like pg_stat_get_stats_reset_time(viewname) where viewname
would be pg_stat_buffer_actions in our case. Though, maybe that is
annoying and not very usable--I'm not sure.

I also think it makes sense to rename the pg_stat_buffer_actions view to
pg_stat_buffers and to name the columns using both the buffer action
type and buffer type -- e.g. shared, strategy, local. This leaves open
the possibility of counting buffer actions done on other non-shared
buffers -- like those done while building indexes or those using local
buffers. The third patch in the set does this (I wanted to see if it
made sense before fixing it up into the first patch in the set).

This naming convention (BufferType_BufferActionType) made me think that
it might make sense to have two enumerations: one being the current
BufferActionType (which could also be called BufferAccessType though
that might get confusing with BufferAccessStrategyType and buffer access
strategies in general) and the other being BufferType (which would be
one of shared, local, index, etc).

I attached a patch with the outline of this idea
(buffer_type_enum_addition.patch). It doesn't work because
pg_stat_get_buffer_actions() uses the BufferActionType as an index into
the values array returned. If I wanted to use a combination of the two
enums as an indexing mechanism (BufferActionType and BufferType), we
would end up with a tuple having every combination of the two
enums--some of which aren't valid. It might not make sense to implement
this. I do think it is useful to think of these stats as a combination
of a buffer action and a type of buffer.

- Melanie

Attachments:

v7-0001-Add-system-view-tracking-shared-buffer-actions.patch (text/x-patch, +426 -21)
v7-0002-Remove-superfluous-bgwriter-stats.patch (text/x-patch, +0 -157)
v7-0003-Rename-pg_stat_buffer_actions-to-pg_stat_buffers.patch (text/x-patch, +33 -34)
buffer_type_enum_addition.patch (text/x-patch, +26 -18)
#20 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Melanie Plageman (#19)
Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

Hello Melanie

On 2021-Sep-13, Melanie Plageman wrote:

> I also think it makes sense to rename the pg_stat_buffer_actions view to
> pg_stat_buffers and to name the columns using both the buffer action
> type and buffer type -- e.g. shared, strategy, local. This leaves open
> the possibility of counting buffer actions done on other non-shared
> buffers -- like those done while building indexes or those using local
> buffers. The third patch in the set does this (I wanted to see if it
> made sense before fixing it up into the first patch in the set).

What do you think of the idea of having the "shared/strategy/local"
attribute be a column? So you'd have up to three rows per buffer action
type. Users wishing to see an aggregate can just aggregate them, just
like they'd do with pg_buffercache. I think that leads to an easy
decision with regards to this point:

> I attached a patch with the outline of this idea
> (buffer_type_enum_addition.patch). It doesn't work because
> pg_stat_get_buffer_actions() uses the BufferActionType as an index into
> the values array returned. If I wanted to use a combination of the two
> enums as an indexing mechanism (BufferActionType and BufferType), we
> would end up with a tuple having every combination of the two
> enums--some of which aren't valid. It might not make sense to implement
> this. I do think it is useful to think of these stats as a combination
> of a buffer action and a type of buffer.

Does that seem sensible?

(It's weird to have enum values that are there just to indicate what's
the maximum value. I think that sort of thing is better done by having
a "#define LAST_THING" that takes the last valid value from the enum.
That would free you from having to handle the last value in switch
blocks, for example. LAST_OCLASS in dependency.h is a precedent on this.)

--
Álvaro Herrera Valdivia, Chile — https://www.EnterpriseDB.com/
"That sort of implies that there are Emacs keystrokes which aren't obscure.
I've been using it daily for 2 years now and have yet to discover any key
sequence which makes any sense." (Paul Thomas)
