pgsql: Add wait event for fsync of WAL segments

Started by Michael Paquier almost 8 years ago, 18 messages, hackers
#1Michael Paquier
michael@paquier.xyz

Add wait event for fsync of WAL segments

This was evidently a forgotten spot in the first implementation of
wait events for I/O added by 249cf07: what was missing is the fsync
call for WAL segments, which is a wrapper reacting to the value of the
GUC wal_sync_method.

Reported-by: Konstantin Knizhnik
Author: Konstantin Knizhnik
Reviewed-by: Craig Ringer, Michael Paquier
Discussion: /messages/by-id/4a243897-0ad8-f471-aa40-242591f2476e@postgrespro.ru
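
For readers unfamiliar with the pattern, here is a minimal standalone sketch of what "adding a wait event" around a sync wrapper looks like. It is not the committed patch: every name prefixed with demo_ or DEMO_ is hypothetical, the two sync methods are a cut-down stand-in for the values of wal_sync_method, and the wait-event reporting is reduced to a pair of global variables.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical, cut-down stand-in for the wal_sync_method settings. */
typedef enum
{
    DEMO_SYNC_FSYNC,        /* flush explicitly with fsync() */
    DEMO_SYNC_OPEN_SYNC     /* file opened with O_SYNC: nothing left to flush */
} DemoSyncMethod;

/* Hypothetical wait-event bookkeeping; 0 means "not waiting". */
#define DEMO_NOT_WAITING         0
#define DEMO_WAIT_EVENT_WAL_SYNC 1

static int demo_current_wait_event = DEMO_NOT_WAITING;
static int demo_wait_events_reported = 0;

static void
demo_wait_start(int event)
{
    demo_current_wait_event = event;
    demo_wait_events_reported++;
}

static void
demo_wait_end(void)
{
    demo_current_wait_event = DEMO_NOT_WAITING;
}

/*
 * Sync wrapper that dispatches on the configured method.  The wait event
 * brackets the whole dispatch, so a monitoring process looking at
 * demo_current_wait_event during a slow flush sees what we are blocked on.
 */
static int
demo_issue_wal_sync(int fd, DemoSyncMethod method)
{
    int rc = 0;

    demo_wait_start(DEMO_WAIT_EVENT_WAL_SYNC);
    switch (method)
    {
        case DEMO_SYNC_FSYNC:
            rc = fsync(fd);
            break;
        case DEMO_SYNC_OPEN_SYNC:
            /* the write itself was synchronous; nothing to do here */
            break;
    }
    demo_wait_end();

    return rc;
}
```

Calling demo_issue_wal_sync(fd, DEMO_SYNC_FSYNC) on an open file descriptor sets the event for the duration of the fsync() call and clears it afterwards, whether or not the flush succeeded.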

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/c55de5e5123ce58ee19a47c08425949599285041

Modified Files
--------------
doc/src/sgml/monitoring.sgml | 4 ++++
src/backend/access/transam/xlog.c | 2 ++
src/backend/postmaster/pgstat.c | 3 +++
src/include/pgstat.h | 1 +
4 files changed, 10 insertions(+)

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#1)
Re: pgsql: Add wait event for fsync of WAL segments

On 2018-Jul-02, Michael Paquier wrote:

Add wait event for fsync of WAL segments

This was evidently a forgotten spot in the first implementation of
wait events for I/O added by 249cf07: what was missing is the fsync
call for WAL segments, which is a wrapper reacting to the value of the
GUC wal_sync_method.

I wonder if we should backpatch this one all the way to pg10. I don't
see any reason not to.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#2)
Re: pgsql: Add wait event for fsync of WAL segments

On Mon, Jul 02, 2018 at 12:23:35PM -0400, Alvaro Herrera wrote:

I wonder if we should backpatch this one all the way to pg10. I don't
see any reason not to.

ABI breakage (if that's the correct wording?). Simply cherry-picking
the patch from master to back-branches would cause extensions and
plugins already compiled with those versions to be confused by the
ordering of the enum WaitEventIO. Well, one simple solution is to
simply put the new event purposefully at the bottom of the list. If
that sounds right, I could do that on back-branches, but I'd rather leave
HEAD in its current state, with the set of events correctly ordered.
--
Michael
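
To make the ABI concern concrete, here is a small C sketch. The three enums are hypothetical, cut-down stand-ins for WaitEventIO (the real enum is much longer), and the "extension" is simulated by a numeric value taken from the old layout. It shows that inserting the new event in sorted position renumbers the entries after it, while appending it at the end leaves existing values alone.

```c
#include <string.h>

/* Layout that an already-compiled extension was built against. */
enum OldWaitEventIO
{
    OLD_WAL_READ,
    OLD_WAL_WRITE
};

/* New event inserted in sorted position: WAL_WRITE moves from 1 to 2. */
enum SortedWaitEventIO
{
    SORTED_WAL_READ,
    SORTED_WAL_SYNC,
    SORTED_WAL_WRITE
};

/* New event appended at the end: existing members keep their values. */
enum AppendedWaitEventIO
{
    APPENDED_WAL_READ,
    APPENDED_WAL_WRITE,
    APPENDED_WAL_SYNC
};

/* How a server using the "sorted" layout would name a numeric value. */
static const char *
sorted_name(int value)
{
    switch (value)
    {
        case SORTED_WAL_READ:  return "WALRead";
        case SORTED_WAL_SYNC:  return "WALSync";
        case SORTED_WAL_WRITE: return "WALWrite";
    }
    return "unknown";
}

/* How a server using the "appended" layout would name a numeric value. */
static const char *
appended_name(int value)
{
    switch (value)
    {
        case APPENDED_WAL_READ:  return "WALRead";
        case APPENDED_WAL_WRITE: return "WALWrite";
        case APPENDED_WAL_SYNC:  return "WALSync";
    }
    return "unknown";
}

/*
 * Returns 1 if a value the old extension believes to be WAL_WRITE is still
 * decoded as "WALWrite" by the given layout, 0 otherwise.
 */
static int
decodes_correctly(const char *(*decode) (int))
{
    return strcmp(decode(OLD_WAL_WRITE), "WALWrite") == 0;
}
```

With these definitions, decodes_correctly(sorted_name) is 0 because the stale value now lands on the new event, while decodes_correctly(appended_name) is 1.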

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Michael Paquier (#3)
Re: pgsql: Add wait event for fsync of WAL segments

On 2018-Jul-03, Michael Paquier wrote:

On Mon, Jul 02, 2018 at 12:23:35PM -0400, Alvaro Herrera wrote:

I wonder if we should backpatch this one all the way to pg10. I don't
see any reason not to.

ABI breakage (if that's the correct wording?). Simply cherry-picking
the patch from master to back-branches would cause extensions and
plugins already compiled with those versions to be confused by the
ordering of the enum WaitEventIO. Well, one simple solution is to
simply put the new event purposefully at the bottom of the list. If
that sounds right, I could do that on back-branches

Are the numerical values actually exposed to the world? I thought the
only way to get this info was through the system views, which surely expose
the names, not the numbers.

Actually, comment on pgstat_report_wait_start talks about WaitClass as
if it were a thing, which it seems not to be. There is a comment "Wait
Classes" but that term is not used anywhere else.

If reading wait events is actually possible, then it seems easy to
backpatch this in pg10 by putting the new value at the end of the
relevant enum, yeah.

but I'd rather leave HEAD in its current state, with the set of events
correctly ordered.

Sure.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#4)
Re: pgsql: Add wait event for fsync of WAL segments

Hi,

On 2018-07-03 15:13:20 -0400, Alvaro Herrera wrote:

Are the numerical values actually exposed to the world? I thought the
only way to get this info was through the system views, which surely expose
the names, not the numbers.

There's at least some work at high frequency sampling of these, and I
assume that's not going through SQL, but rather C functions.

If reading wait events is actually possible, then it seems easy to
backpatch this in pg10 by putting the new value at the end of the
relevant enum, yeah.

Honestly, I don't really see an argument for backpatching this. If
people have started measuring, it might even invalidate their stats.

Greetings,

Andres Freund

#6Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#5)
Re: pgsql: Add wait event for fsync of WAL segments

On 2018-Jul-03, Andres Freund wrote:

On 2018-07-03 15:13:20 -0400, Alvaro Herrera wrote:

Are the numerical values actually exposed to the world? I thought the
only way to get this info was through the system views, which surely expose
the names, not the numbers.

There's at least some work at high frequency sampling of these, and I
assume that's not going through SQL, but rather C functions.

You're right.

If reading wait events is actually possible, then it seems easy to
backpatch this in pg10 by putting the new value at the end of the
relevant enum, yeah.

Honestly, I don't really see an argument for backpatching this. If
people have started measuring, it might even invalidate their stats.

Hmm, if the stats are really invalidated by this change, then surely the
original numbers were incorrect anyway.

Anyway, it's not a huge deal to me. If Michael doesn't want to
backpatch it, it's his call, and I don't have the cycles to do it myself
right now either. If some other committer cares about it, well ...

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#6)
Re: pgsql: Add wait event for fsync of WAL segments

On Tue, Jul 03, 2018 at 04:12:23PM -0400, Alvaro Herrera wrote:

On 2018-Jul-03, Andres Freund wrote:

On 2018-07-03 15:13:20 -0400, Alvaro Herrera wrote:

Are the numerical values actually exposed to the world? I thought the
only way to get this info was through the system views, which surely expose
the names, not the numbers.

There's at least some work at high frequency sampling of these, and I
assume that's not going through SQL, but rather C functions.

You're right.

My take is a background worker which connects to shared memory and scans
ProcArray entries directly.

Anyway, it's not a huge deal to me. If Michael doesn't want to
backpatch it, it's his call, and I don't have the cycles to do it myself
right now either. If some other committer cares about it, well ...

In my opinion, new wait events are suited to HEAD only, so I would
keep the code as it is now. If anybody wishes to back-patch the change,
of course feel free to do so.
--
Michael

#8Julien Rouhaud
rjuju123@gmail.com
In reply to: Michael Paquier (#1)
Re: pgsql: Add wait event for fsync of WAL segments

Hi Michael,

On Mon, Jul 2, 2018 at 3:23 PM, Michael Paquier <michael@paquier.xyz> wrote:

Add wait event for fsync of WAL segments

Modified Files
--------------
doc/src/sgml/monitoring.sgml | 4 ++++

I just noticed that the number of rows for the IO wait event type
documentation hasn't been updated, see
https://www.postgresql.org/docs/devel/static/monitoring-stats.html#WAIT-EVENT-TABLE.

Trivial patch attached.

Attachments:

fix-wait-events-table.diff (text/x-patch; charset=US-ASCII; +1 -1)

#9Michael Paquier
michael@paquier.xyz
In reply to: Julien Rouhaud (#8)
Re: pgsql: Add wait event for fsync of WAL segments

On Sun, Jul 08, 2018 at 10:23:37PM +0200, Julien Rouhaud wrote:

I just noticed that the number of rows for the IO wait event type
documentation hasn't been updated, see
https://www.postgresql.org/docs/devel/static/monitoring-stats.html#WAIT-EVENT-TABLE.

Trivial patch attached.

Thanks, Julien! Fixed. Indeed, the table format was weird.
--
Michael

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#9)
Re: pgsql: Add wait event for fsync of WAL segments

Michael Paquier <michael@paquier.xyz> writes:

On Sun, Jul 08, 2018 at 10:23:37PM +0200, Julien Rouhaud wrote:

I just noticed that the number of rows for the IO wait event type
documentation hasn't been updated, see
https://www.postgresql.org/docs/devel/static/monitoring-stats.html#WAIT-EVENT-TABLE.
Trivial patch attached.

Thanks, Julien! Fixed. Indeed, the table format was weird.

Hm, do we need that "morerows" thing at all? It seems impossible
that we'll remember to keep it up to date in future. How do you
even check that it's right now?

regards, tom lane

#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#10)
Re: pgsql: Add wait event for fsync of WAL segments

On 2018-Jul-09, Tom Lane wrote:

Michael Paquier <michael@paquier.xyz> writes:

On Sun, Jul 08, 2018 at 10:23:37PM +0200, Julien Rouhaud wrote:

I just noticed that the number of rows for the IO wait event type
documentation hasn't been updated, see
https://www.postgresql.org/docs/devel/static/monitoring-stats.html#WAIT-EVENT-TABLE.
Trivial patch attached.

Thanks, Julien! Fixed. Indeed, the table format was weird.

Hm, do we need that "morerows" thing at all? It seems impossible
that we'll remember to keep it up to date in future. How do you
even check that it's right now?

Maybe we should add an XML comment
<!-- remember to update "morerows" above -->
at the end of the table :-)

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12Peter Geoghegan
In reply to: Alvaro Herrera (#11)
Re: pgsql: Add wait event for fsync of WAL segments

On Mon, Jul 9, 2018 at 11:24 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Maybe we should add an XML comment
<!-- remember to update "morerows" above -->
at the end of the table :-)

+1. I made the same mistake at one point.

--
Peter Geoghegan

#13Michael Paquier
michael@paquier.xyz
In reply to: Peter Geoghegan (#12)
Re: pgsql: Add wait event for fsync of WAL segments

On Mon, Jul 09, 2018 at 12:12:42PM -0700, Peter Geoghegan wrote:

On Mon, Jul 9, 2018 at 11:24 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Maybe we should add an XML comment
<!-- remember to update "morerows" above -->
at the end of the table :-)

+1. I made the same mistake at one point.

Another idea that I have here is to rework the page for monitoring
stats so that we create one sub-section for each system view, and also one
for the table of wait events. For the wait events, we could then
completely remove the first category column, which has morerows, and
divide the section into one table per event category.
--
Michael

#14Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#13)
Re: pgsql: Add wait event for fsync of WAL segments

On Mon, Jul 9, 2018 at 4:41 PM, Michael Paquier <michael@paquier.xyz> wrote:

Another idea that I have here is to rework the page for monitoring
stats so that we create one sub-section for each system view, and also one
for the table of wait events. For the wait events, we could then
completely remove the first category column, which has morerows, and
divide the section into one table per event category.

+1 from me. I think I proposed that before.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#15Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#14)
Refactor documentation for wait events (Was: pgsql: Add wait event for fsync of WAL segments)

On Fri, Jul 13, 2018 at 04:57:59PM -0500, Robert Haas wrote:

On Mon, Jul 9, 2018 at 4:41 PM, Michael Paquier <michael@paquier.xyz> wrote:

Another idea that I have here is to rework the page for monitoring
stats so that we create one sub-section for each system view, and also one
for the table of wait events. For the wait events, we could then
completely remove the first category column, which has morerows, and
divide the section into one table per event category.

+1 from me. I think I proposed that before.

Attached is a proof of concept of that. I have divided the "Viewing
Statistics" section into a sub-section for each catalog, and each wait event
type gains its own sub-section as well. There is a bit more to do with the
indentation and some xreflabels, but I think that this is enough to
begin a discussion.

Thoughts?
--
Michael

Attachments:

wait-event-doc-refactor.patch (text/x-diff; charset=us-ascii; +182 -34)

#16Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#15)
Re: Refactor documentation for wait events (Was: pgsql: Add wait event for fsync of WAL segments)

On Sun, Jul 15, 2018 at 10:10 PM, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Jul 13, 2018 at 04:57:59PM -0500, Robert Haas wrote:

On Mon, Jul 9, 2018 at 4:41 PM, Michael Paquier <michael@paquier.xyz> wrote:

Another idea that I have here is to rework the page for monitoring
stats so that we create one sub-section for each system view, and also one
for the table of wait events. For the wait events, we could then
completely remove the first category column, which has morerows, and
divide the section into one table per event category.

+1 from me. I think I proposed that before.

Attached is a proof of concept of that. I have divided the "Viewing
Statistics" section into a sub-section for each catalog, and each wait event
type gains its own sub-section as well. There is a bit more to do with the
indentation and some xreflabels, but I think that this is enough to
begin a discussion.

Thoughts?

This doesn't seem to get rid of the morerows stuff.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#16)
Re: Refactor documentation for wait events (Was: pgsql: Add wait event for fsync of WAL segments)

On Mon, Jul 16, 2018 at 11:22:07AM -0400, Robert Haas wrote:

This doesn't seem to get rid of the morerows stuff.

The troubling ones are in monitoring.sgml:
<entry morerows="64"><literal>LWLock</literal></entry>
<entry morerows="9"><literal>Lock</literal></entry>
<entry morerows="13"><literal>Activity</literal></entry>
<entry morerows="7"><literal>Client</literal></entry>
<entry morerows="33"><literal>IPC</literal></entry>
<entry morerows="2"><literal>Timeout</literal></entry>
<entry morerows="66"><literal>IO</literal></entry>

And the patch previously sent removes them, but perhaps I am missing
your point?
--
Michael

#18Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#17)
Re: Refactor documentation for wait events (Was: pgsql: Add wait event for fsync of WAL segments)

On Tue, Jul 17, 2018 at 12:19 AM, Michael Paquier <michael@paquier.xyz> wrote:

And the patch previously sent removes them, but perhaps I am missing
your point?

I was just confused. Sorry for the noise.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company