New statistics for tuning WAL buffer size

Started by Masahiro Ikeda over 5 years ago · 57 messages · hackers
#1 Masahiro Ikeda
ikedamsh@oss.nttdata.com

Hi,

It's important to provide metrics for tuning the size of the WAL
buffers.
Currently, we lack statistics on how often processes have to wait to
write WAL because the WAL buffer is full.

If this situation occurs often, the WAL buffer is too small for the
workload, and DBAs must tune the WAL buffer size to improve performance.

There are related threads, but their patches were not merged.
/messages/by-id/4FF824F3.5090407@uptime.jp
/messages/by-id/CAJrrPGc6APFUGYNcPe4qcNxpL8gXKYv1KST+vwJcFtCSCEySnA@mail.gmail.com

What do you think?
If we can reach a consensus, I will make a PoC patch.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

#2 tsunakawa.takay@fujitsu.com
tsunakawa.takay@fujitsu.com
In reply to: Masahiro Ikeda (#1)
RE: New statistics for tuning WAL buffer size

From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>

It's important to provide metrics for tuning the size of the WAL buffers.
Currently, we lack statistics on how often processes have to wait to write WAL
because the WAL buffer is full.

If this situation occurs often, the WAL buffer is too small for the workload,
and DBAs must tune the WAL buffer size to improve performance.

Yes, it's helpful to know if we need to enlarge the WAL buffer. That's why our colleague HariBabu proposed the patch. We'd be happy if it could be committed in some form.

There are related threads, but their patches were not merged.
/messages/by-id/4FF824F3.5090407@uptime.jp
/messages/by-id/CAJrrPGc6APFUGYNcPe4qcNxpL8gXKYv1KST%2BvwJcFtCSCEySnA%40mail.gmail.com

What's the difference between those patches? What blocked them from being committed?

Regards
Takayuki Tsunakawa

#3 Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: tsunakawa.takay@fujitsu.com (#2)
RE: New statistics for tuning WAL buffer size

On 2020-08-18 16:35, tsunakawa.takay@fujitsu.com wrote:

From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>

It's important to provide metrics for tuning the size of the WAL
buffers.
Currently, we lack statistics on how often processes have to wait to
write WAL because the WAL buffer is full.

If this situation occurs often, the WAL buffer is too small for the
workload, and DBAs must tune the WAL buffer size to improve performance.

Yes, it's helpful to know if we need to enlarge the WAL buffer.
That's why our colleague HariBabu proposed the patch. We'd be happy
if it could be committed in some form.

There are related threads, but their patches were not merged.
/messages/by-id/4FF824F3.5090407@uptime.jp
/messages/by-id/CAJrrPGc6APFUGYNcPe4qcNxpL8gXKYv1KST%2BvwJcFtCSCEySnA%40mail.gmail.com

What's the difference between those patches? What blocked them from
being committed?

Thanks for replying.

Since the above threads are no longer active and their patches can't be
applied to HEAD, I started this thread. If it would be better to reply
to one of those threads, I will do so.

If my understanding is correct, we have to measure the performance
impact first.
Do you know whether HariBabu is still working on this? If not, I will
try to update the patches to apply to HEAD.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

#4 tsunakawa.takay@fujitsu.com
tsunakawa.takay@fujitsu.com
In reply to: Masahiro Ikeda (#3)
RE: New statistics for tuning WAL buffer size

From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>

If my understanding is correct, we have to measure the performance
impact first.
Do you know whether HariBabu is still working on this? If not, I will
try to update the patches to apply to HEAD.

No, he's not doing it anymore. It'd be great if you could resume it. However, I recommend sharing your understanding of what the issues with those two threads were and how you're trying to solve them. Was the performance overhead the blocker in both threads?

Regards
Takayuki Tsunakawa

#5 Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: tsunakawa.takay@fujitsu.com (#4)
RE: New statistics for tuning WAL buffer size

On 2020-08-19 13:49, tsunakawa.takay@fujitsu.com wrote:

From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>

If my understanding is correct, we have to measure the performance
impact first.
Do you know whether HariBabu is still working on this? If not, I will
try to update the patches to apply to HEAD.

No, he's not doing it anymore. It'd be great if you could resume it.

OK, thanks.

However, I recommend sharing your understanding of what the issues
with those two threads were and how you're trying to solve them.
Was the performance overhead the blocker in both threads?

In my understanding, some review comments were left unresolved in both
threads.
I think the following work remains:

1) Rebase the patches onto HEAD.
2) Reach consensus on which metrics to collect and how to use them for
tuning.
3) Measure the performance impact and, if it hurts performance, address
that.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

#6 Fujii Masao
masao.fujii@gmail.com
In reply to: Masahiro Ikeda (#5)
Re: New statistics for tuning WAL buffer size

On 2020/08/19 14:10, Masahiro Ikeda wrote:

On 2020-08-19 13:49, tsunakawa.takay@fujitsu.com wrote:

From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>

If my understanding is correct, we have to measure the performance
impact first.
Do you know whether HariBabu is still working on this? If not, I will
try to update the patches to apply to HEAD.

No, he's not doing it anymore.  It'd be great if you could resume it.

OK, thanks.

However, I recommend sharing your understanding of what the issues
with those two threads were and how you're trying to solve them.
Was the performance overhead the blocker in both threads?

In my understanding, some review comments were left unresolved in both threads.
I think the following work remains:

1) Rebase the patches onto HEAD.
2) Reach consensus on which metrics to collect and how to use them for tuning.

I agree with exposing the number of WAL writes caused by the WAL buffers
being full. It's helpful when tuning the wal_buffers size. Haribabu
separated that number into two fields in his patch: one is the number of
WAL writes by backends, and the other is by background processes and
workers. But I'm not sure how useful such separation is. I'm OK with
just one field for that number.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#7 Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#6)
Re: New statistics for tuning WAL buffer size

On 2020/08/20 20:01, Fujii Masao wrote:

On 2020/08/19 14:10, Masahiro Ikeda wrote:

On 2020-08-19 13:49, tsunakawa.takay@fujitsu.com wrote:

From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>

If my understanding is correct, we have to measure the performance
impact first.
Do you know whether HariBabu is still working on this? If not, I will
try to update the patches to apply to HEAD.

No, he's not doing it anymore.  It'd be great if you could resume it.

OK, thanks.

However, I recommend sharing your understanding of what the issues
with those two threads were and how you're trying to solve them.
Was the performance overhead the blocker in both threads?

In my understanding, some review comments were left unresolved in both threads.
I think the following work remains:

1) Rebase the patches onto HEAD.
2) Reach consensus on which metrics to collect and how to use them for tuning.

I agree with exposing the number of WAL writes caused by the WAL buffers
being full. It's helpful when tuning the wal_buffers size. Haribabu
separated that number into two fields in his patch: one is the number of
WAL writes by backends, and the other is by background processes and
workers. But I'm not sure how useful such separation is. I'm OK with
just one field for that number.

Just an idea: it may be worth exposing the number of times a new WAL
file is created and zero-filled. This initialization may impact the
performance of write-heavy workloads that generate lots of WAL. If this
number is reported high, we can tune WAL-related parameters so that more
"recycled" WAL files can be kept, reducing the number of these
initializations.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#8 tsunakawa.takay@fujitsu.com
tsunakawa.takay@fujitsu.com
In reply to: Fujii Masao (#6)
RE: New statistics for tuning WAL buffer size

From: Fujii Masao <masao.fujii@oss.nttdata.com>

I agree with exposing the number of WAL writes caused by the WAL buffers
being full. It's helpful when tuning the wal_buffers size. Haribabu
separated that number into two fields in his patch: one is the number of
WAL writes by backends, and the other is by background processes and
workers. But I'm not sure how useful such separation is. I'm OK with
just one field for that number.

I agree with you. I don't think we need to separate the numbers for foreground processes and background ones. The WAL buffer is a single resource. So a message like "Writes due to a full WAL buffer are happening; we may be able to boost performance by increasing wal_buffers" would be enough.

Regards
Takayuki Tsunakawa

#9 tsunakawa.takay@fujitsu.com
tsunakawa.takay@fujitsu.com
In reply to: Fujii Masao (#7)
RE: New statistics for tuning WAL buffer size

From: Fujii Masao <masao.fujii@oss.nttdata.com>

Just an idea: it may be worth exposing the number of times a new WAL
file is created and zero-filled. This initialization may impact the
performance of write-heavy workloads that generate lots of WAL. If this
number is reported high, we can tune WAL-related parameters so that more
"recycled" WAL files can be kept, reducing the number of these
initializations.

Sounds good. Actually, I'd like to know how much that zeroing affects transaction response times, but that may be the target of the wait event statistics that Imai-san is working on.

(I wonder what happened to the fallocate() patch that tries to minimize the zeroing time.)

Regards
Takayuki Tsunakawa

#10 Fujii Masao
masao.fujii@gmail.com
In reply to: tsunakawa.takay@fujitsu.com (#9)
Re: New statistics for tuning WAL buffer size

On 2020/08/21 12:08, tsunakawa.takay@fujitsu.com wrote:

From: Fujii Masao <masao.fujii@oss.nttdata.com>

Just an idea: it may be worth exposing the number of times a new WAL
file is created and zero-filled. This initialization may impact the
performance of write-heavy workloads that generate lots of WAL. If this
number is reported high, we can tune WAL-related parameters so that more
"recycled" WAL files can be kept, reducing the number of these
initializations.

Sounds good. Actually, I'd like to know how much that zeroing affects transaction response times, but that may be the target of the wait event statistics that Imai-san is working on.

Maybe, so I'm OK if the first pg_stat_walwriter patch doesn't expose
this number. We can extend the view to include it later, once we confirm
that the number is really useful.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#11 Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: Fujii Masao (#10)
Re: New statistics for tuning WAL buffer size

Hi, thanks for the useful comments.

I agree with exposing the number of WAL writes caused by the WAL
buffers being full. It's helpful when tuning the wal_buffers size.
Haribabu separated that number into two fields in his patch: one is
the number of WAL writes by backends, and the other is by background
processes and workers. But I'm not sure how useful such separation is.
I'm OK with just one field for that number.

I agree with you. I don't think we need to separate the numbers for
foreground processes and background ones. The WAL buffer is a single
resource. So a message like "Writes due to a full WAL buffer are
happening; we may be able to boost performance by increasing
wal_buffers" would be enough.

I made a patch to expose the number of WAL writes caused by the WAL
buffers being full.
I'm going to submit this patch to a commitfest.

As Fujii-san and Tsunakawa-san suggested, it exposes only the total
number, since we agreed that we don't need to separate the numbers for
foreground processes and background ones.

By the way, do we need to add other WAL-related metrics?
For example, is the total number of WAL writes to the buffers useful
for calculating the dirty WAL write ratio?

Is this enough as a first step?

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachments:

0001_pg_stat_walwrites_view.patch (text/x-diff, +183 −3)
#12 Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: Masahiro Ikeda (#11)
Re: New statistics for tuning WAL buffer size

On 2020-08-24 20:45, Masahiro Ikeda wrote:

Hi, thanks for the useful comments.

I agree with exposing the number of WAL writes caused by the WAL
buffers being full. It's helpful when tuning the wal_buffers size.
Haribabu separated that number into two fields in his patch: one is
the number of WAL writes by backends, and the other is by background
processes and workers. But I'm not sure how useful such separation is.
I'm OK with just one field for that number.

I agree with you. I don't think we need to separate the numbers for
foreground processes and background ones. The WAL buffer is a single
resource. So a message like "Writes due to a full WAL buffer are
happening; we may be able to boost performance by increasing
wal_buffers" would be enough.

I made a patch to expose the number of WAL writes caused by the WAL
buffers being full.
I'm going to submit this patch to a commitfest.

As Fujii-san and Tsunakawa-san suggested, it exposes only the total
number, since we agreed that we don't need to separate the numbers for
foreground processes and background ones.

By the way, do we need to add other WAL-related metrics?
For example, is the total number of WAL writes to the buffers useful
for calculating the dirty WAL write ratio?

Is this enough as a first step?

I forgot to rebase onto the current master.
I've attached the rebased patch.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachments:

0002_pg_stat_walwrites_view.patch (text/x-diff, +183 −3)
#13 Fujii Masao
masao.fujii@gmail.com
In reply to: Masahiro Ikeda (#12)
Re: New statistics for tuning WAL buffer size

On 2020/08/24 21:00, Masahiro Ikeda wrote:

On 2020-08-24 20:45, Masahiro Ikeda wrote:

Hi, thanks for the useful comments.

I agree with exposing the number of WAL writes caused by the WAL buffers being full.
It's helpful when tuning the wal_buffers size. Haribabu separated that number
into two fields in his patch: one is the number of WAL writes by backends,
and the other is by background processes and workers. But I'm not sure
how useful such separation is. I'm OK with just one field for that number.

I agree with you. I don't think we need to separate the numbers for foreground processes and background ones. The WAL buffer is a single resource. So a message like "Writes due to a full WAL buffer are happening; we may be able to boost performance by increasing wal_buffers" would be enough.

I made a patch to expose the number of WAL writes caused by the WAL buffers being full.
I'm going to submit this patch to a commitfest.

As Fujii-san and Tsunakawa-san suggested, it exposes only the total number,
since we agreed that we don't need to separate the numbers for
foreground processes and background ones.

By the way, do we need to add other WAL-related metrics?
For example, is the total number of WAL writes to the buffers useful
for calculating the dirty WAL write ratio?

Is this enough as a first step?

I forgot to rebase onto the current master.
I've attached the rebased patch.

Thanks for the patch!

+/* ----------
+ * Backend types
+ * ----------

You seem to have forgotten to add "*/" at the end of the above comment.
This issue could cause the following compiler warning.

../../src/include/pgstat.h:761:1: warning: '/*' within block comment [-Wcomment]

The contents of pg_stat_walwrites are reset when the server
is restarted. Isn't this problematic? IMO, since pg_stat_walwrites
is a collected-statistics view, its contents should basically be
kept even across a server restart.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#14 Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: Fujii Masao (#13)
Re: New statistics for tuning WAL buffer size
+/* ----------
+ * Backend types
+ * ----------

You seem to have forgotten to add "*/" at the end of the above comment.
This issue could cause the following compiler warning.
../../src/include/pgstat.h:761:1: warning: '/*' within block comment
[-Wcomment]

Thanks for the comment. I fixed it.

The contents of pg_stat_walwrites are reset when the server
is restarted. Isn't this problematic? IMO, since pg_stat_walwrites
is a collected-statistics view, its contents should basically be
kept even across a server restart.

I agree with your opinion.
I modified the patch to use the statistics collector and persist the
WAL statistics.

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match the naming scheme of other views like
pg_stat_bgwriter,
which is for bgwriter statistics but also has statistics related to
backends.

The pg_stat_walwriter view is not security restricted, so ordinary users
can access it.
It has the same security level as pg_stat_archiver. If you have any
comments, please let me know.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachments:

0003_pg_stat_walwriter_view.patch (text/x-diff, +251 −5)
#15 Fujii Masao
masao.fujii@gmail.com
In reply to: Masahiro Ikeda (#14)
Re: New statistics for tuning WAL buffer size

On 2020/09/02 18:56, Masahiro Ikeda wrote:

+/* ----------
+ * Backend types
+ * ----------

You seem to have forgotten to add "*/" at the end of the above comment.
This issue could cause the following compiler warning.
../../src/include/pgstat.h:761:1: warning: '/*' within block comment [-Wcomment]

Thanks for the comment. I fixed it.

Thanks for the fix! But why are those comments necessary?

The contents of pg_stat_walwrites are reset when the server
is restarted. Isn't this problematic? IMO, since pg_stat_walwrites
is a collected-statistics view, its contents should basically be
kept even across a server restart.

I agree with your opinion.
I modified the patch to use the statistics collector and persist the WAL statistics.

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match the naming scheme of other views like pg_stat_bgwriter,
which is for bgwriter statistics but also has statistics related to backends.

I prefer the view name pg_stat_walwriter for consistency with
other view names. But we also have pg_stat_wal_receiver, which
makes me think that maybe pg_stat_wal_writer is better for
consistency. Thoughts? IMO either of them works for me.
I'd like to hear more opinions about this.

The pg_stat_walwriter view is not security restricted, so ordinary users can access it.
It has the same security level as pg_stat_archiver. If you have any comments, please let me know.

+ <structfield>dirty_writes</structfield> <type>bigint</type>

I guess the column name "dirty_writes" derives from
the DTrace probe name. Isn't this name confusing? Should we
rename it to "wal_buffers_full" or something?

+/* ----------
+ * PgStat_MsgWalWriter			Sent by the walwriter to update statistics.

This comment does not seem accurate, because backends also send it.

+/*
+ * WAL writes statistics counter is updated in XLogWrite function
+ */
+extern PgStat_MsgWalWriter WalWriterStats;

This comment does not seem right, because the counter is not updated in XLogWrite().

+-- There will surely and maximum one record
+select count(*) = 1 as ok from pg_stat_walwriter;

What about changing this comment to "There must be only one record"?

+ WalWriterStats.m_xlog_dirty_writes++;
LWLockRelease(WALWriteLock);

Since WalWriterStats.m_xlog_dirty_writes doesn't need to be protected
by WALWriteLock, isn't it better to increment it after releasing the lock?

+CREATE VIEW pg_stat_walwriter AS
+    SELECT
+        pg_stat_get_xlog_dirty_writes() AS dirty_writes,
+        pg_stat_get_walwriter_stat_reset_time() AS stats_reset;
+
  CREATE VIEW pg_stat_progress_vacuum AS

In system_views.sql, the definition of pg_stat_walwriter should be
placed just after that of pg_stat_bgwriter, not pg_stat_progress_analyze.

}
-
/*
* We found an existing collector stats file. Read it and put all the

You seem to have accidentally removed the empty line here.

-				 errhint("Target must be \"archiver\" or \"bgwriter\".")));
+				 errhint("Target must be \"archiver\" or \"bgwriter\" or \"walwriter\".")));

There are two "or"s in the message; shouldn't the former be replaced with ","?

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#16 tsunakawa.takay@fujitsu.com
tsunakawa.takay@fujitsu.com
In reply to: Fujii Masao (#15)
RE: New statistics for tuning WAL buffer size

From: Fujii Masao <masao.fujii@oss.nttdata.com>

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match the naming scheme of other views like
pg_stat_bgwriter, which is for bgwriter statistics but also has
statistics related to backends.

I prefer the view name pg_stat_walwriter for consistency with
other view names. But we also have pg_stat_wal_receiver, which
makes me think that maybe pg_stat_wal_writer is better for
consistency. Thoughts? IMO either of them works for me.
I'd like to hear more opinions about this.

I think pg_stat_bgwriter is now a misnomer, because it contains the backends' activity. Likewise, pg_stat_walwriter leads to misunderstanding because its information is not limited to WAL writer.

How about simply pg_stat_wal? In the future, we may want to include WAL reads in this view, e.g. reading undo logs in zheap.

Regards
Takayuki Tsunakawa

#17 Fujii Masao
masao.fujii@gmail.com
In reply to: tsunakawa.takay@fujitsu.com (#16)
Re: New statistics for tuning WAL buffer size

On 2020/09/04 11:50, tsunakawa.takay@fujitsu.com wrote:

From: Fujii Masao <masao.fujii@oss.nttdata.com>

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match the naming scheme of other views like
pg_stat_bgwriter, which is for bgwriter statistics but also has
statistics related to backends.

I prefer the view name pg_stat_walwriter for consistency with
other view names. But we also have pg_stat_wal_receiver, which
makes me think that maybe pg_stat_wal_writer is better for
consistency. Thoughts? IMO either of them works for me.
I'd like to hear more opinions about this.

I think pg_stat_bgwriter is now a misnomer, because it contains the backends' activity. Likewise, pg_stat_walwriter leads to misunderstanding because its information is not limited to WAL writer.

How about simply pg_stat_wal? In the future, we may want to include WAL reads in this view, e.g. reading undo logs in zheap.

Sounds reasonable.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#18 Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#17)
Re: New statistics for tuning WAL buffer size

On Fri, Sep 4, 2020 at 5:42 AM Fujii Masao <masao.fujii@oss.nttdata.com>
wrote:

On 2020/09/04 11:50, tsunakawa.takay@fujitsu.com wrote:

From: Fujii Masao <masao.fujii@oss.nttdata.com>

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match the naming scheme of other views like
pg_stat_bgwriter, which is for bgwriter statistics but also has
statistics related to backends.

I prefer the view name pg_stat_walwriter for consistency with
other view names. But we also have pg_stat_wal_receiver, which
makes me think that maybe pg_stat_wal_writer is better for
consistency. Thoughts? IMO either of them works for me.
I'd like to hear more opinions about this.

I think pg_stat_bgwriter is now a misnomer, because it contains the
backends' activity. Likewise, pg_stat_walwriter leads to
misunderstanding because its information is not limited to the WAL
writer.

How about simply pg_stat_wal? In the future, we may want to include
WAL reads in this view, e.g. reading undo logs in zheap.

Sounds reasonable.

+1.

pg_stat_bgwriter has had the "wrong name" for quite some time now -- it
became even more apparent when the checkpointer was split out into its
own process, and that's not exactly a recent change. And it has had
allocs in it from day one...

I think naming it for what the data in it is ("wal") rather than which
process deals with it ("walwriter") is correct, unless the statistics can
be known to only *ever* affect one type of process. (And then different
processes can affect different columns in the view). As a general rule --
and that's from what I can tell exactly what's being proposed.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#19 Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: Fujii Masao (#17)
Re: New statistics for tuning WAL buffer size

Thanks for the review and advice!

On 2020-09-03 16:05, Fujii Masao wrote:

On 2020/09/02 18:56, Masahiro Ikeda wrote:

+/* ----------
+ * Backend types
+ * ----------

You seem to have forgotten to add "*/" at the end of the above comment.
This issue could cause the following compiler warning.
../../src/include/pgstat.h:761:1: warning: '/*' within block comment
[-Wcomment]

Thanks for the comment. I fixed it.

Thanks for the fix! But why are those comments necessary?

Sorry about that. This comment is not necessary.
I removed it.

The pg_stat_walwriter is not security restricted now, so ordinary
users can access it.
It has the same security level as pg_stat_archiver. If you have any
comments, please let me know.

+ <structfield>dirty_writes</structfield> <type>bigint</type>

I guess the column name "dirty_writes" derives from
the DTrace probe name. Isn't this name confusing? Should we
rename it to "wal_buffers_full" or something?

I agree, and renamed it to "wal_buffers_full".

+/* ----------
+ * PgStat_MsgWalWriter			Sent by the walwriter to update statistics.

This comment does not seem accurate, because backends also send it.

+/*
+ * WAL writes statistics counter is updated in XLogWrite function
+ */
+extern PgStat_MsgWalWriter WalWriterStats;

This comment does not seem right, because the counter is not updated in
XLogWrite().

Right. I fixed it to "Sent by each backend and background worker to
update WAL statistics."
In the future, other statistics will be included, so I removed the
function's name from the comment.

+-- There will surely and maximum one record
+select count(*) = 1 as ok from pg_stat_walwriter;

What about changing this comment to "There must be only one record"?

Thanks, I fixed it.

+ WalWriterStats.m_xlog_dirty_writes++;
LWLockRelease(WALWriteLock);

Since WalWriterStats.m_xlog_dirty_writes doesn't need to be protected
by WALWriteLock, isn't it better to increment it after releasing
the lock?

Thanks, I fixed it.

+CREATE VIEW pg_stat_walwriter AS
+    SELECT
+        pg_stat_get_xlog_dirty_writes() AS dirty_writes,
+        pg_stat_get_walwriter_stat_reset_time() AS stats_reset;
+
CREATE VIEW pg_stat_progress_vacuum AS

In system_views.sql, the definition of pg_stat_walwriter should be
placed just after that of pg_stat_bgwriter not
pg_stat_progress_analyze.

OK, I fixed it.

}
-
/*
* We found an existing collector stats file. Read it and put all the

You seem to have accidentally removed the empty line here.

Sorry about that. I fixed it.

-				 errhint("Target must be \"archiver\" or \"bgwriter\".")));
+				 errhint("Target must be \"archiver\" or \"bgwriter\" or
\"walwriter\".")));

There are two "or"s in the message; shouldn't the former be replaced
with ","?

Thanks, I fixed it.

On 2020-09-05 18:40, Magnus Hagander wrote:

On Fri, Sep 4, 2020 at 5:42 AM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:

On 2020/09/04 11:50, tsunakawa.takay@fujitsu.com wrote:

From: Fujii Masao <masao.fujii@oss.nttdata.com>

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.
I think it is better to match the naming scheme of other views like
pg_stat_bgwriter, which is for bgwriter statistics but also has
statistics related to backends.

I prefer the view name pg_stat_walwriter for consistency with
other view names. But we also have pg_stat_wal_receiver, which
makes me think that maybe pg_stat_wal_writer is better for
consistency. Thoughts? IMO either of them works for me.
I'd like to hear more opinions about this.

I think pg_stat_bgwriter is now a misnomer, because it contains the
backends' activity. Likewise, pg_stat_walwriter leads to
misunderstanding because its information is not limited to the WAL
writer.

How about simply pg_stat_wal? In the future, we may want to include
WAL reads in this view, e.g. reading undo logs in zheap.

Sounds reasonable.

+1.

pg_stat_bgwriter has had the "wrong name" for quite some time now --
it became even more apparent when the checkpointer was split out into
its own process, and that's not exactly a recent change. And it has had
allocs in it from day one...

I think naming it for what the data in it is ("wal") rather than which
process deals with it ("walwriter") is correct, unless the statistics
can be known to only *ever* affect one type of process (and then
different processes can affect different columns in the view). As a
general rule -- and from what I can tell, that's exactly what's being
proposed.

Thanks for your comments. I agree with your opinions.
I changed the view name to "pg_stat_wal".

I fixed the code to send the WAL statistics not only from backends and
the walwriter but also from the checkpointer, walsenders, and autovacuum
workers.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachments:

0004_pg_stat_walwriter_view.patch (text/x-diff, +256 −5)
#20 Fujii Masao
masao.fujii@gmail.com
In reply to: Masahiro Ikeda (#19)
Re: New statistics for tuning WAL buffer size

On 2020/09/07 9:58, Masahiro Ikeda wrote:

Thanks for the review and advice!

On 2020-09-03 16:05, Fujii Masao wrote:

On 2020/09/02 18:56, Masahiro Ikeda wrote:

+/* ----------
+ * Backend types
+ * ----------

You seem to have forgotten to add "*/" at the end of the above comment.
This issue could cause the following compiler warning.
../../src/include/pgstat.h:761:1: warning: '/*' within block comment [-Wcomment]

Thanks for the comment. I fixed it.

Thanks for the fix! But why are those comments necessary?

Sorry about that. This comment is not necessary.
I removed it.

The pg_stat_walwriter is not security restricted now, so ordinary users can access it.
It has the same security level as pg_stat_archiver. If you have any comments, please let me know.

+       <structfield>dirty_writes</structfield> <type>bigint</type>

I guess the column name "dirty_writes" derives from
the DTrace probe name. Isn't this name confusing? Should we
rename it to "wal_buffers_full" or something?

I agree, and renamed it to "wal_buffers_full".

+/* ----------
+ * PgStat_MsgWalWriter			Sent by the walwriter to update statistics.

This comment does not seem accurate, because backends also send it.

+/*
+ * WAL writes statistics counter is updated in XLogWrite function
+ */
+extern PgStat_MsgWalWriter WalWriterStats;

This comment does not seem right, because the counter is not updated in XLogWrite().

Right. I fixed it to "Sent by each backend and background worker to update WAL statistics."
In the future, other statistics will be included, so I removed the function's name from the comment.

+-- There will surely and maximum one record
+select count(*) = 1 as ok from pg_stat_walwriter;

What about changing this comment to "There must be only one record"?

Thanks, I fixed it.

+                   WalWriterStats.m_xlog_dirty_writes++;
                    LWLockRelease(WALWriteLock);

Since WalWriterStats.m_xlog_dirty_writes doesn't need to be protected
by WALWriteLock, isn't it better to increment it after releasing the lock?

Thanks, I fixed.

+CREATE VIEW pg_stat_walwriter AS
+    SELECT
+        pg_stat_get_xlog_dirty_writes() AS dirty_writes,
+        pg_stat_get_walwriter_stat_reset_time() AS stats_reset;
+
 CREATE VIEW pg_stat_progress_vacuum AS

In system_views.sql, the definition of pg_stat_walwriter should be
placed just after that of pg_stat_bgwriter not pg_stat_progress_analyze.

OK, I fixed it.

     }
-
     /*
      * We found an existing collector stats file. Read it and put all the

You seem to accidentally have removed the empty line here.

Sorry about that. I fixed it.

-				errhint("Target must be \"archiver\" or \"bgwriter\".")));
+				errhint("Target must be \"archiver\" or \"bgwriter\" or \"walwriter\".")));

There are two "or" in the message, but the former should be replaced with ","?

Thanks, I fixed.

On 2020-09-05 18:40, Magnus Hagander wrote:

On Fri, Sep 4, 2020 at 5:42 AM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:

On 2020/09/04 11:50, tsunakawa.takay@fujitsu.com wrote:

From: Fujii Masao <masao.fujii@oss.nttdata.com>

I changed the view name from pg_stat_walwrites to pg_stat_walwriter.

I think it is better to match the naming scheme with other views like pg_stat_bgwriter, which is for bgwriter statistics but also has statistics related to backends.

I prefer the view name pg_stat_walwriter for consistency with other view names. But we also have pg_stat_wal_receiver, which makes me think that maybe pg_stat_wal_writer is better for consistency. Thoughts? IMO either of them works for me.
I'd like to hear more opinions about this.

I think pg_stat_bgwriter is now a misnomer, because it contains the backends' activity. Likewise, pg_stat_walwriter leads to misunderstanding because its information is not limited to the WAL writer.

How about simply pg_stat_wal? In the future, we may want to include WAL reads in this view, e.g. reading undo logs in zheap.

Sounds reasonable.

+1.

pg_stat_bgwriter has had the "wrong name" for quite some time now --
it became even more apparent when the checkpointer was split out to
its own process, and that's not exactly a recent change. And it had
allocs in it from day one...

I think naming it for what the data in it is ("wal") rather than which
process deals with it ("walwriter") is correct, unless the statistics
can be known to only *ever* affect one type of process. (And then
different processes can affect different columns in the view.) As a
general rule -- and from what I can tell, that's exactly what's being
proposed.

Thanks for your comments. I agree with your opinions.
I changed the view name to "pg_stat_wal".
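For illustration, a hypothetical monitoring query a DBA might run against the renamed view. This assumes the view ends up as pg_stat_wal exposing the wal_buffers_full and stats_reset columns discussed in this thread; the column names are assumptions here, not a committed definition:

```sql
-- If wal_buffers_full grows quickly between checks, wal_buffers is
-- likely too small for the workload.
SELECT wal_buffers_full,
       stats_reset,
       wal_buffers_full /
           GREATEST(EXTRACT(EPOCH FROM now() - stats_reset), 1)
           AS waits_per_second
FROM pg_stat_wal;
```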

I fixed the code to send the WAL statistics not only from backends and the walwriter
but also from the checkpointer, walsender and autovacuum workers.

Good point! Thanks for updating the patch!

@@ -604,6 +604,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
onerel->rd_rel->relisshared,
Max(new_live_tuples, 0),
vacrelstats->new_dead_tuples);
+ pgstat_send_wal();

I guess that you changed heap_vacuum_rel() as above so that autovacuum
workers can send WAL stats. But heap_vacuum_rel() can be called by
processes other than autovacuum workers (e.g., backends), can't it? Also,
what happens if an autovacuum worker does only ANALYZE? In that case,
heap_vacuum_rel() may not be called.

Currently an autovacuum worker reports the stats at exit via
pgstat_beshutdown_hook(). Unlike other processes, an autovacuum worker
does not keep running for the lifetime of the service; it exits
after it finishes vacuum or analyze. So ISTM that it's not bad to report the stats
only at exit in the autovacuum worker case. There is no need to add extra
code for the WAL stats report by autovacuum workers. Thoughts?

@@ -1430,6 +1430,9 @@ WalSndWaitForWal(XLogRecPtr loc)
else
RecentFlushPtr = GetXLogReplayRecPtr(NULL);

+ /* Send wal statistics */
+ pgstat_send_wal();

AFAIR logical walsender uses three loops in WalSndLoop(), WalSndWriteData()
and WalSndWaitForWal(). But could you tell me why added pgstat_send_wal()
into WalSndWaitForWal()? I'd like to know why WalSndWaitForWal() is the best
for that purpose.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
