New statistics for WAL buffer dirty writes
Hi all,
I've created a new patch to get/reset statistics of WAL buffer
writes (flushes) caused by the WAL buffer becoming full.
This patch provides two new functions, pg_stat_get_xlog_dirty_write()
and pg_stat_reset_xlog_dirty_write(), which are designed to help
determine an appropriate value for the WAL buffer size.
If this counter keeps increasing in a production environment,
it means the WAL buffer is too small to hold the xlog records
generated by the transactions. So, you can increase the WAL
buffer size to keep xlog records in memory and reduce WAL writes.
I don't think this patch affects WAL write performance,
but I'm still paying attention to that.
Any comments or suggestions?
Regards,
-----------------------------------------------------------
[snaga@devvm03 src]$ psql -p 15432 postgres
psql (9.3devel)
Type "help" for help.
postgres=# SELECT pg_stat_get_xlog_dirty_write();
pg_stat_get_xlog_dirty_write
------------------------------
0
(1 row)
postgres=# \q
[snaga@devvm03 src]$ pgbench -p 15432 -s 10 -c 32 -t 1000 postgres
Scale option ignored, using pgbench_branches table count = 10
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
query mode: simple
number of clients: 32
number of threads: 1
number of transactions per client: 1000
number of transactions actually processed: 32000/32000
tps = 141.937738 (including connections establishing)
tps = 142.123457 (excluding connections establishing)
[snaga@devvm03 src]$ psql -p 15432 postgres
psql (9.3devel)
Type "help" for help.
postgres=# SELECT pg_stat_get_xlog_dirty_write();
pg_stat_get_xlog_dirty_write
------------------------------
0
(1 row)
postgres=# begin;
BEGIN
postgres=# DELETE FROM pgbench_accounts;
DELETE 1000000
postgres=# commit;
COMMIT
postgres=# SELECT pg_stat_get_xlog_dirty_write();
pg_stat_get_xlog_dirty_write
------------------------------
19229
(1 row)
postgres=# SELECT pg_stat_reset_xlog_dirty_write();
pg_stat_reset_xlog_dirty_write
--------------------------------
(1 row)
postgres=# SELECT pg_stat_get_xlog_dirty_write();
pg_stat_get_xlog_dirty_write
------------------------------
0
(1 row)
postgres=# \q
[snaga@devvm03 src]$
-----------------------------------------------------------
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
Attachments:
xlogdirtywrite.diff (text/plain, charset=Shift_JIS) [+42 −0]
On 07-07-2012 09:00, Satoshi Nagayasu wrote:
I've created new patch to get/reset statistics of WAL buffer
writes (flushes) caused by WAL buffer full.
This new statistic doesn't solve your problem (tune wal_buffers). It doesn't
give you the wal_buffers value. It only says "hey, I needed more buffers so I
write those dirty ones". It doesn't say how many. I would like to have
something that says "hey, you have 1000 buffers available and you are using
100 buffers (10%)". This new statistic is only useful for decreasing the
WALWriteLock contention.
--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
2012/07/07 22:07, Euler Taveira wrote:
On 07-07-2012 09:00, Satoshi Nagayasu wrote:
I've created new patch to get/reset statistics of WAL buffer
writes (flushes) caused by WAL buffer full.

This new statistic doesn't solve your problem (tune wal_buffers). It doesn't
give you the wal_buffers value. It only says "hey, I needed more buffers so I
write those dirty ones". It doesn't say how many. I would like to have
something that says "hey, you have 1000 buffers available and you are using
100 buffers (10%)". This new statistic is only useful for decreasing the
WALWriteLock contention.
I agree that it would not tell the exact number for wal_buffers,
but it would help DBAs understand what's actually happening around
the WAL buffers.
Also, decreasing WALWriteLock contention is obviously important
for DBAs in terms of improving database performance.
Actually, that's the reason why I'm working on another set of
statistics. :)
http://archives.postgresql.org/pgsql-hackers/2012-06/msg01489.php
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
On Jul 7, 2012, at 9:07 AM, Euler Taveira <euler@timbira.com> wrote:
[...] This new statistic is only useful for decreasing the
WALWriteLock contention.
The number of WAL buffers that you are using is going to change so quickly as to be utterly meaningless. I don't really see that there's any statistic we could gather that would tell us how many WAL buffers are needed. This patch seems like it's on the right track, at least telling you how often you're running out.
I'm interested to run some benchmarks with this; I think it could be quite informative.
...Robert
On Sat, Jul 7, 2012 at 3:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:
[...] The number of WAL buffers that you are using is going to change so
quickly as to be utterly meaningless. I don't really see that there's any
statistic we could gather that would tell us how many WAL buffers are
needed. This patch seems like it's on the right track, at least telling
you how often you're running out.
We could keep a high watermark of "what's the largest percentage we've
used", perhaps?
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Jul 7, 2012, at 8:54 AM, Magnus Hagander <magnus@hagander.net> wrote:
[...]
We could keep a high watermark of "what's the largest percentage we've
used", perhaps?
Sure, but I doubt that would be as informative as this. It's no big deal if you hit 100% every once in a while; what you really want to know is whether it's happening once per second or once per week.
...Robert
On Sat, Jul 7, 2012 at 7:06 PM, Robert Haas <robertmhaas@gmail.com> wrote:
[...] We could keep a high watermark of "what's the largest percentage
we've used", perhaps?

Sure, but I doubt that would be as informative as this. It's no big deal if
you hit 100% every once in a while; what you really want to know is whether
it's happening once per second or once per week.
I'm not suggesting one or the other; I'm suggesting that both values
might be interesting. Though in reality, you'd want that high
watermark to count only if it was the state for more than <n>, which
is a lot more difficult to get. So yeah, maybe that's overkill to even
try.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Hi,
Jeff Janes has pointed out that my previous patch could hold
the number of dirty writes only in a single local backend,
not across the whole cluster, because the counter was
allocated in local process memory.
That's true, and I have fixed it by moving the counter into
shared memory, as a member of XLogCtlWrite, to keep the total
dirty writes for the cluster.
Regards,
2012/07/07 21:00, Satoshi Nagayasu wrote:
Hi all,
I've created new patch to get/reset statistics of WAL buffer
writes (flushes) caused by WAL buffer full. [...]
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
Attachments:
xlogdirtywrite2.diff (text/plain, charset=Shift_JIS) [+58 −0]
On Sat, Jul 7, 2012 at 9:17 PM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
Hi,
Jeff Janes has pointed out that my previous patch could hold
a number of the dirty writes only in single local backend, and
it could not hold all over the cluster, because the counter
was allocated in the local process memory.

That's true, and I have fixed it with moving the counter into
the shared memory, as a member of XLogCtlWrite, to keep total
dirty writes in the cluster.
A concern I have is whether the XLogCtlWrite *Write pointer needs to
be declared volatile, to prevent the compiler from pushing operations
on them outside of the locks (and so memory barriers) that formally
protect them. However I see that existing code with Insert also does
not use volatile, so maybe my concern is baseless. Perhaps the
compiler guarantees to not move operations on pointers over the
boundaries of function calls? The pattern elsewhere in the code seems
to be to use volatiles for things protected by spin-locks (implemented
by macros) but not for things protected by LWLocks.
The comment "XLogCtrlWrite must be protected with WALWriteLock"
mis-spells XLogCtlWrite.
The final patch will need to add a section to the documentation.
Cheers,
Jeff
On 7 July 2012 18:06, Robert Haas <robertmhaas@gmail.com> wrote:
Sure, but I doubt that would be as informative as this. It's no big deal if you hit 100% every once in a while; what you really want to know is whether it's happening once per second or once per week.
Agreed.
I can't see an easy way of recording the high water mark % and I'm not
sure how we'd use it if we had it.
Let's just track how often we run out of space because that is when
bad things happen, not before.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Jul 28, 2012 at 6:33 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
A concern I have is whether the XLogCtlWrite *Write pointer needs to
be declared volatile, to prevent the compiler from pushing operations
on them outside of the locks (and so memory barriers) that formally
protect them. However I see that existing code with Insert also does
not use volatile, so maybe my concern is baseless. Perhaps the
compiler guarantees to not move operations on pointers over the
boundaries of function calls? The pattern elsewhere in the code seems
to be to use volatiles for things protected by spin-locks (implemented
by macros) but not for things protected by LWLocks.
Yes, our code is only correct if we assume that the compiler performs
no global optimizations - i.e. no movement of code between functions.
IMHO, the way we have it now is kind of a mess. SpinLockAcquire and
SpinLockRelease are required to be CPU barriers, but they are not
required to be compiler barriers. If we changed that so that they
were required to act as barriers of both flavors, then (1) we wouldn't
need volatile in as many places, (2) we would be less prone to bugs
caused by the omission of not-obviously-necessary volatile markings,
and (3) we would remove one possible source of breakage that might be
induced by a globally optimizing compiler. As things stand today,
making a previously-global function static could result in working
code breaking, because the static function might be inlined where the
global function wasn't. Ouch.
Anyway, unless and until we make a definitional change of the sort
described above, any pointers used within a spinlock critical section
must be volatile; and pray that the compiler doesn't inline anything
you weren't expecting.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
IMHO, the way we have it now is kind of a mess. SpinLockAcquire and
SpinLockRelease are required to be CPU barriers, but they are not
required to be compiler barriers. If we changed that so that they
were required to act as barriers of both flavors,
Since they are macros, how do you propose to do that exactly?
I agree that volatile-izing everything in the vicinity is a sucky
solution, but the last time we looked at this there did not seem to
be a better one.
regards, tom lane
On Tue, Jul 31, 2012 at 4:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
IMHO, the way we have it now is kind of a mess. SpinLockAcquire and
SpinLockRelease are required to be CPU barriers, but they are not
required to be compiler barriers. If we changed that so that they
were required to act as barriers of both flavors,

Since they are macros, how do you propose to do that exactly?
Why does it matter that they are macros?
I agree that volatile-izing everything in the vicinity is a sucky
solution, but the last time we looked at this there did not seem to
be a better one.
Well, Linux has a barrier() primitive which is defined as a
compiler-barrier, so I don't see why we shouldn't be able to manage
the same thing. In fact, we've already got it, though it's presently
unused; see storage/barrier.h.
Looking over s_lock.h, it looks like TAS is typically defined using
__asm__ __volatile__, and the __asm__ is marked as clobbering memory.
As the fine comments say "this prevents gcc from thinking it can cache
the values of shared-memory fields across the asm code", which is
another way of saying that it's a compiler barrier. However, there's
no similar guard in S_UNLOCK, which is simply declared as a volatile
store, and therefore compiler ordering is guaranteed only with respect
to other volatile pointer references. If we added something of the
form __asm__ __volatile__("" : : : "memory") in there, it should
serve as a full compiler barrier. That might have to go in a static
inline function as we do with TAS, but I think it should work.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jul 31, 2012 at 4:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I agree that volatile-izing everything in the vicinity is a sucky
solution, but the last time we looked at this there did not seem to
be a better one.
Well, Linux has a barrier() primitive which is defined as a
compiler-barrier, so I don't see why we shouldn't be able to manage
the same thing. In fact, we've already got it, though it's presently
unused; see storage/barrier.h.
Solving the problem for linux only, or gcc only, isn't going to get us
to a place where we can stop volatile-izing call sites. We need to be
sure it works for every single case supported by s_lock.h.
I think you may be right that using __asm__ __volatile__ in gcc
S_UNLOCK cases would be a big step forward, but it needs more research
to see if that's the only fix needed.
regards, tom lane
On Wed, Aug 1, 2012 at 10:12 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Solving the problem for linux only, or gcc only, isn't going to get us
to a place where we can stop volatile-izing call sites. We need to be
sure it works for every single case supported by s_lock.h.
Yep, that's the problem all right.
I think you may be right that using __asm__ __volatile__ in gcc
S_UNLOCK cases would be a big step forward, but it needs more research
to see if that's the only fix needed.
I agree, but I will note that I have done a fair bit of research on
this already, and there are definitions in storage/barrier.h for
pg_compiler_barrier() that cover gcc, icc, HP's aCC, MSVC, and Borland
C. There are probably other wacky compilers out there, though:
looking at the build farm, I see Sun Studio and sco cc as cases that
would likely need some attention. Are there any compilers not
represented in the build-farm that we'd mind breaking?
If we can get working pg_compiler_barrier() definitions for all the
compilers we care about, the rest is probably mostly a question of
going through s_lock.h and inserting compiler barriers anywhere that
they aren't already implied by the existing code.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Jul 28, 2012 at 3:33 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Sat, Jul 7, 2012 at 9:17 PM, Satoshi Nagayasu <snaga@uptime.jp> wrote:
Hi,
Jeff Janes has pointed out that my previous patch could hold
a number of the dirty writes only in single local backend, and
it could not hold all over the cluster, because the counter
was allocated in the local process memory.

That's true, and I have fixed it with moving the counter into
the shared memory, as a member of XLogCtlWrite, to keep total
dirty writes in the cluster.
...
The comment "XLogCtrlWrite must be protected with WALWriteLock"
mis-spells XLogCtlWrite.

The final patch will need to add a section to the documentation.
Thanks to Robert and Tom for addressing my concerns about the pointer
volatility.
I think there is enough consensus that this is useful without adding
more things to it, like histograms or high water marks.
However, I do think we will want to add a way to query for the time of
the last reset, as other monitoring features are going that way.
Is it OK that the count is reset upon a server restart?
pg_stat_bgwriter, for example, does not do that. Unfortunately I
think fixing this in an acceptable way will be harder than the entire
rest of the patch was.
The coding looks OK to me; it applies and builds, passes make
check, and does what it says. I didn't do performance testing, as it
is hard to believe it would have a meaningful effect.
I'll mark it as waiting on author, for the documentation and reset
time. I'd ask a more senior hacker to comment on the durability over
restarts.
Cheers,
Jeff
On Sat, Aug 11, 2012 at 6:11 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
However, I do think we will want to add a way to query for the time of
the last reset, as other monitoring features are going that way.
That should be easy to add.
Is it OK that the count is reset upon a server restart?
I think it's OK. The reason why many of our stats are kept in the
stats file is because we have a limited amount of shared memory and
therefore can't guarantee (for example) that there's enough to keep
stats about EVERY table, since the number of tables is unlimited.
However, in cases where the data to be stored is fixed-size, and
especially when it's fixed-size and small, there's a lot of sense to
keeping the data in shared memory rather than sending stats collector
messages. It's a lot less overhead, for one thing. Maybe at some
point someone will want to devise a way to hibernate such stats to
disk at shutdown (or periodically) and reload them on startup, but it
doesn't seem like a must-have to me.
Other opinions may vary, of course.
I'll marked it as waiting on author, for the documentation and reset
time.
Yeah, we definitely need some documentation.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,
2012/08/12 7:11, Jeff Janes wrote:
[...] I'll mark it as waiting on author, for the documentation and reset
time. I'd ask a more senior hacker to comment on the durability over
restarts.
I have rewritten the patch to deal with the dirty write statistics
through the pgstat collector, as bgwriter does.
Yeah, it's a somewhat bigger rewrite.
With this patch, the walwriter process and each backend process
sum up dirty writes and send them to the stats collector.
So, the value can be saved in the stats file, and can be
kept across restarts.
The statistics can be retrieved using the
pg_stat_get_xlog_dirty_writes() function, and can be reset
by calling pg_stat_reset_shared('walwriter').
Now, I have one concern.
The reset time could be captured in globalStats.stat_reset_timestamp,
but this value is shared with the bgwriter one.
So, once pg_stat_reset_shared('walwriter') is called, the
stats_reset column in pg_stat_bgwriter represents
the reset time for walwriter, not for bgwriter.
How should we handle this? Should we split this value?
And should we have a new system view for walwriter?
Of course, I will work on documentation next.
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp
Attachments:
xlogdirtywrite_v3.diff (text/plain, charset=Shift_JIS) [+101 −1]
Satoshi Nagayasu escribió:
[...] Now, I have one concern.
The reset time could be captured in globalStats.stat_reset_timestamp,
but this value is shared with the bgwriter one. So, once
pg_stat_reset_shared('walwriter') is called, the stats_reset column
in pg_stat_bgwriter represents the reset time for walwriter, not for
bgwriter. How should we handle this? Should we split this value?
And should we have a new system view for walwriter?
I think the answer to the last two questions is yes. It doesn't seem
to make sense, to me, to have a single reset timing for what are
effectively two separate things.
Please submit an updated patch to next CF. I'm marking this one
returned with feedback. Thanks.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
2012/10/24 1:12, Alvaro Herrera wrote:
[...] I think the answer to the last two questions is yes. It doesn't
seem to make sense, to me, to have a single reset timing for what are
effectively two separate things.

Please submit an updated patch to next CF. I'm marking this one
returned with feedback. Thanks.
I attached the latest one, which splits the reset_time
for bgwriter and walwriter, and provides a new system view,
called pg_stat_walwriter, to show the dirty write counter
and the reset time.
Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>
Uptime Technologies, LLC. http://www.uptime.jp