Why are wait events not reported even though it reads/writes a timeline history file?

Started by Masahiro Ikedaover 5 years ago13 messages
#1Masahiro Ikeda
ikedamsh@oss.nttdata.com
1 attachment(s)

Hi, 

There are two unexpected codes for me about wait events for timeline
history file.
Please let me know your thoughts whether if we need to change.

1. readTimeLineHistory() function in timeline.c

The readTimeLineHistory() reads a timeline history file, 
but it doesn't report “WAIT_EVENT_TIMELINE_HISTORY_READ".

In my understanding, sscanf() is blocking read. 
So, it's important to report a wait event.

2. writeTimeLineHistory() function in timeline.c

The writeTimeLineHistory() function may write a timeline history file
twice,
but it reports “WAIT_EVENT_TIMELINE_HISTORY_WRITE" only once.

It makes sense to report a wait event twice, because both of them use
write().

I attached a patch to mention the code line number.

I checked the commit log which "WAIT_EVENT_TIMELINE_HISTORY_READ" and
"WAIT_EVENT_TIMELINE_HISTORY_WRITE" are committed and the discussion
about it.
But I can't find the reason.

Please give me your comments.
If we need to change, I can make a patch to fix them.

By the way, which is correct "timeline's history file" or "timeline
history file"?
The timeline.c has both. In my understanding, the latter is correct. If
so, I will modify together.

Regards,

--
Masahiro Ikeda
NTT DATA CORPORATION

Attachments:

0001-some-wait-events-for-timeline-history-file-are-not-r.patchtext/x-diff; name=0001-some-wait-events-for-timeline-history-file-are-not-r.patchDownload
From 2e6f342024a075903910b94b37292efd3aa1f3a8 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Date: Thu, 23 Apr 2020 16:24:07 +0900
Subject: [PATCH] some wait events for timeline history file are not reported.

---
 src/backend/access/transam/timeline.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index de57d699af..f140100505 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -140,6 +140,8 @@ readTimeLineHistory(TimeLineID targetTLI)
 		if (*ptr == '\0' || *ptr == '#')
 			continue;
 
+		# Try to read a timeline history file.
+		# But, it doesn't report "WAIT_EVENT_TIMELINE_HISTORY_READ".
 		nfields = sscanf(fline, "%u\t%X/%X", &tli, &switchpoint_hi, &switchpoint_lo);
 
 		if (nfields < 1)
@@ -393,6 +395,8 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 
 	nbytes = strlen(buffer);
 	errno = 0;
+	# Try to write a timeline history file.
+	# But, it doesn't report "WAIT_EVENT_TIMELINE_HISTORY_WRITE".
 	if ((int) write(fd, buffer, nbytes) != nbytes)
 	{
 		int			save_errno = errno;
-- 
2.24.2 (Apple Git-127)

#2Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Masahiro Ikeda (#1)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020/04/24 11:29, Masahiro Ikeda wrote:

Hi,

There are two unexpected codes for me about wait events for timeline history file.
Please let me know your thoughts whether if we need to change.

1. readTimeLineHistory() function in timeline.c

The readTimeLineHistory() reads a timeline history file,
but it doesn't report “WAIT_EVENT_TIMELINE_HISTORY_READ".

Yeah, this sounds strange.

In my understanding, sscanf() is blocking read.
So, it's important to report a wait event.

Shouldn't the wait event be reported during fgets() rather than sscanf()?

2. writeTimeLineHistory() function in timeline.c

The writeTimeLineHistory() function may write a timeline history file twice,
but it reports “WAIT_EVENT_TIMELINE_HISTORY_WRITE" only once.

It makes sense to report a wait event twice, because both of them use write().

Yes.

I attached a patch to mention the code line number.

I checked the commit log which "WAIT_EVENT_TIMELINE_HISTORY_READ" and
"WAIT_EVENT_TIMELINE_HISTORY_WRITE" are committed and the discussion about it.
But I can't find the reason.

Please give me your comments.
If we need to change, I can make a patch to fix them.

Thanks! I agree to fix those issues.

By the way, which is correct "timeline's history file" or "timeline history file"?
The timeline.c has both. In my understanding, the latter is correct. If so, I will modify together.

Maybe both are correct?? I have no strong opinion about this.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#3Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: Fujii Masao (#2)
1 attachment(s)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020-04-27 12:25, Fujii Masao wrote:

On 2020/04/24 11:29, Masahiro Ikeda wrote:

Hi,

There are two unexpected codes for me about wait events for timeline
history file.
Please let me know your thoughts whether if we need to change.

1. readTimeLineHistory() function in timeline.c

The readTimeLineHistory() reads a timeline history file,
but it doesn't report “WAIT_EVENT_TIMELINE_HISTORY_READ".

Yeah, this sounds strange.

In my understanding, sscanf() is blocking read.
So, it's important to report a wait event.

Shouldn't the wait event be reported during fgets() rather than
sscanf()?

2. writeTimeLineHistory() function in timeline.c

The writeTimeLineHistory() function may write a timeline history file
twice,
but it reports “WAIT_EVENT_TIMELINE_HISTORY_WRITE" only once.

It makes sense to report a wait event twice, because both of them use
write().

Yes.

I attached a patch to mention the code line number.

I checked the commit log which "WAIT_EVENT_TIMELINE_HISTORY_READ" and
"WAIT_EVENT_TIMELINE_HISTORY_WRITE" are committed and the discussion
about it.
But I can't find the reason.

Please give me your comments.
If we need to change, I can make a patch to fix them.

Thanks! I agree to fix those issues.

Thanks for the comments. I attach a patch to fix those issues.
Please review it.

By the way, which is correct "timeline's history file" or "timeline
history file"?
The timeline.c has both. In my understanding, the latter is correct.
If so, I will modify together.

Maybe both are correct?? I have no strong opinion about this.

OK, I didn't fix it at this time.

Regards,
--
Masahiro Ikeda

Attachments:

0001-Fix-to-report-wait-events-about-timeline-history-fil.patchtext/x-diff; name=0001-Fix-to-report-wait-events-about-timeline-history-fil.patchDownload
From 9e5165673714b929a76bf39792f04ea5e348f0a3 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <ikedamsh@nttdata.co.jp>
Date: Tue, 28 Apr 2020 01:39:25 +0000
Subject: [PATCH] Fix to report wait events about timeline history file.

Even though a timeline history file is read or written,
some wait events are not reported. This patch fixes those issues.
---
 src/backend/access/transam/timeline.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index de57d69..7a652a8 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -123,6 +123,7 @@ readTimeLineHistory(TimeLineID targetTLI)
 	 * Parse the file...
 	 */
 	prevend = InvalidXLogRecPtr;
+	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
 	while (fgets(fline, sizeof(fline), fd) != NULL)
 	{
 		/* skip leading whitespace and check for # comment */
@@ -172,6 +173,7 @@ readTimeLineHistory(TimeLineID targetTLI)
 
 		/* we ignore the remainder of each line */
 	}
+	pgstat_report_wait_end();
 
 	FreeFile(fd);
 
@@ -393,6 +395,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 
 	nbytes = strlen(buffer);
 	errno = 0;
+	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_WRITE);
 	if ((int) write(fd, buffer, nbytes) != nbytes)
 	{
 		int			save_errno = errno;
@@ -408,6 +411,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 				(errcode_for_file_access(),
 				 errmsg("could not write to file \"%s\": %m", tmppath)));
 	}
+	pgstat_report_wait_end();
 
 	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_SYNC);
 	if (pg_fsync(fd) != 0)
-- 
2.9.5

#4Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Masahiro Ikeda (#3)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020/04/28 11:10, Masahiro Ikeda wrote:

On 2020-04-27 12:25, Fujii Masao wrote:

On 2020/04/24 11:29, Masahiro Ikeda wrote:

Hi,

There are two unexpected codes for me about wait events for timeline history file.
Please let me know your thoughts whether if we need to change.

1. readTimeLineHistory() function in timeline.c

The readTimeLineHistory() reads a timeline history file,
but it doesn't report “WAIT_EVENT_TIMELINE_HISTORY_READ".

Yeah, this sounds strange.

In my understanding, sscanf() is blocking read.
So, it's important to report a wait event.

Shouldn't the wait event be reported during fgets() rather than sscanf()?

2. writeTimeLineHistory() function in timeline.c

The writeTimeLineHistory() function may write a timeline history file twice,
but it reports “WAIT_EVENT_TIMELINE_HISTORY_WRITE" only once.

It makes sense to report a wait event twice, because both of them use write().

Yes.

I attached a patch to mention the code line number.

I checked the commit log which "WAIT_EVENT_TIMELINE_HISTORY_READ" and
"WAIT_EVENT_TIMELINE_HISTORY_WRITE" are committed and the discussion about it.
But I can't find the reason.

Please give me your comments.
If we need to change, I can make a patch to fix them.

Thanks! I agree to fix those issues.

Thanks for the comments. I attach a patch to fix those issues.
Please review it.

Thanks for the patch!

         prevend = InvalidXLogRecPtr;
+       pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
         while (fgets(fline, sizeof(fline), fd) != NULL)
         {
                 /* skip leading whitespace and check for # comment */
@@ -172,6 +173,7 @@ readTimeLineHistory(TimeLineID targetTLI)

/* we ignore the remainder of each line */
}
+ pgstat_report_wait_end();

Isn't it safer to report the wait event during fgets() rather than putting
those calls around the whole loop, like other code does? For example,
writeTimeLineHistory() reports the wait event during read() rather than
whole loop.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#5Michael Paquier
michael@paquier.xyz
In reply to: Fujii Masao (#4)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On Tue, Apr 28, 2020 at 02:49:00PM +0900, Fujii Masao wrote:

Isn't it safer to report the wait event during fgets() rather than putting
those calls around the whole loop, like other code does? For example,
writeTimeLineHistory() reports the wait event during read() rather than
whole loop.

Yeah, I/O wait events should be taken only during the duration of the
system calls. Particularly here, you may finish with an elog() that
causes the wait event to be set longer than it should, leading to a
rather incorrect state if a snapshot of pg_stat_activity is taken.
--
Michael

#6Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: Michael Paquier (#5)
1 attachment(s)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020-04-28 15:09, Michael Paquier wrote:

On Tue, Apr 28, 2020 at 02:49:00PM +0900, Fujii Masao wrote:

Isn't it safer to report the wait event during fgets() rather than
putting
those calls around the whole loop, like other code does? For example,
writeTimeLineHistory() reports the wait event during read() rather
than
whole loop.

Yeah, I/O wait events should be taken only during the duration of the
system calls. Particularly here, you may finish with an elog() that
causes the wait event to be set longer than it should, leading to a
rather incorrect state if a snapshot of pg_stat_activity is taken.
--

Thanks for your comments.

I fixed it to report the wait event during fgets() only.
Please review the v2 patch I attached.

Regard,
--
Masahiro Ikeda

Attachments:

v2-0001-Fix-to-report-wait-events-about-timeline-history-.patchtext/x-diff; name=v2-0001-Fix-to-report-wait-events-about-timeline-history-.patchDownload
From 20b6eb5bb7af574e4395f653917db5a54d13d7d5 Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <ikedamsh@nttdata.co.jp>
Date: Tue, 28 Apr 2020 01:39:25 +0000
Subject: [PATCH v2] Fix to report wait events about timeline history file.

Even though a timeline history file is read or written,
some wait events are not reported. This patch fixes those issues.
---
 src/backend/access/transam/timeline.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index de57d69..3e1b90f 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -123,8 +123,15 @@ readTimeLineHistory(TimeLineID targetTLI)
 	 * Parse the file...
 	 */
 	prevend = InvalidXLogRecPtr;
-	while (fgets(fline, sizeof(fline), fd) != NULL)
+	for (;;)
 	{
+		char	   *result;
+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		result = fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+		if (result == NULL)
+			break;
+
 		/* skip leading whitespace and check for # comment */
 		char	   *ptr;
 		TimeLineID	tli;
@@ -393,6 +400,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 
 	nbytes = strlen(buffer);
 	errno = 0;
+	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_WRITE);
 	if ((int) write(fd, buffer, nbytes) != nbytes)
 	{
 		int			save_errno = errno;
@@ -408,6 +416,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 				(errcode_for_file_access(),
 				 errmsg("could not write to file \"%s\": %m", tmppath)));
 	}
+	pgstat_report_wait_end();
 
 	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_SYNC);
 	if (pg_fsync(fd) != 0)
-- 
2.9.5

#7Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Masahiro Ikeda (#6)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020/04/28 17:42, Masahiro Ikeda wrote:

On 2020-04-28 15:09, Michael Paquier wrote:

On Tue, Apr 28, 2020 at 02:49:00PM +0900, Fujii Masao wrote:

Isn't it safer to report the wait event during fgets() rather than putting
those calls around the whole loop, like other code does? For example,
writeTimeLineHistory() reports the wait event during read() rather than
whole loop.

Yeah, I/O wait events should be taken only during the duration of the
system calls.� Particularly here, you may finish with an elog() that
causes the wait event to be set longer than it should, leading to a
rather incorrect state if a snapshot of pg_stat_activity is taken.
--

Thanks for your comments.

I fixed it to report the wait event during fgets() only.
Please review the v2 patch I attached.

Thanks for updating the patch! Here are the review comments from me.

+		char	   *result;
+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		result = fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+		if (result == NULL)
+			break;
+
  		/* skip leading whitespace and check for # comment */
  		char	   *ptr;

Since the variable name "result" has been already used in this function,
it should be renamed.

The code should not be inject into the variable declaration block.

When reading this patch, I found that IO-error in fgets() has not
been checked there. Though this is not the issue that you reported,
but it seems better to fix it together. So what about adding
the following code?

if (ferror(fd))
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not read file \"%s\": %m", path)));

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

#8Masahiro Ikeda
ikedamsh@oss.nttdata.com
In reply to: Fujii Masao (#7)
1 attachment(s)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020-05-01 00:25, Fujii Masao wrote:

On 2020/04/28 17:42, Masahiro Ikeda wrote:

On 2020-04-28 15:09, Michael Paquier wrote:

On Tue, Apr 28, 2020 at 02:49:00PM +0900, Fujii Masao wrote:

Isn't it safer to report the wait event during fgets() rather than
putting
those calls around the whole loop, like other code does? For
example,
writeTimeLineHistory() reports the wait event during read() rather
than
whole loop.

Yeah, I/O wait events should be taken only during the duration of the
system calls.  Particularly here, you may finish with an elog() that
causes the wait event to be set longer than it should, leading to a
rather incorrect state if a snapshot of pg_stat_activity is taken.
--

Thanks for your comments.

I fixed it to report the wait event during fgets() only.
Please review the v2 patch I attached.

Thanks for updating the patch! Here are the review comments from me.

+		char	   *result;
+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		result = fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+		if (result == NULL)
+			break;
+
/* skip leading whitespace and check for # comment */
char	   *ptr;

Since the variable name "result" has been already used in this
function,
it should be renamed.

Sorry for that.

I thought to rename it, but I changed to use feof()
for clarify the difference from ferror().

The code should not be inject into the variable declaration block.

Thanks for the comment.
I moved the code block after the variable declaration block.

When reading this patch, I found that IO-error in fgets() has not
been checked there. Though this is not the issue that you reported,
but it seems better to fix it together. So what about adding
the following code?

if (ferror(fd))
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not read file \"%s\": %m", path)));

Thanks, I agree your comment.
I added the above code to the v3 patch I attached.

Regards,
--
Masahiro Ikeda
NTT DATA CORPORATION

Attachments:

v3-0001-Fix-to-report-wait-events-about-timeline-history-.patchtext/x-diff; name=v3-0001-Fix-to-report-wait-events-about-timeline-history-.patchDownload
From 976a796c1ff7ed191d9f3df6c1111a333e700e3a Mon Sep 17 00:00:00 2001
From: Masahiro Ikeda <ikedamsh@oss.nttdata.com>
Date: Fri, 1 May 2020 09:38:51 +0900
Subject: [PATCH v3] Fix to report wait events about timeline history file.

Even though a timeline history file is read or written,
some wait events are not reported. This patch fixes those issues.
---
 src/backend/access/transam/timeline.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index de57d699af..4dfe9fddd1 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -123,7 +123,7 @@ readTimeLineHistory(TimeLineID targetTLI)
 	 * Parse the file...
 	 */
 	prevend = InvalidXLogRecPtr;
-	while (fgets(fline, sizeof(fline), fd) != NULL)
+	for (;;)
 	{
 		/* skip leading whitespace and check for # comment */
 		char	   *ptr;
@@ -132,6 +132,17 @@ readTimeLineHistory(TimeLineID targetTLI)
 		uint32		switchpoint_lo;
 		int			nfields;
 
+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+
+		if (feof(fd))
+			break;
+		else if (ferror(fd))
+			ereport(ERROR,
+				(errcode_for_file_access(),
+				errmsg("could not read file \"%s\": %m", path)));
+
 		for (ptr = fline; *ptr; ptr++)
 		{
 			if (!isspace((unsigned char) *ptr))
@@ -393,6 +404,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 
 	nbytes = strlen(buffer);
 	errno = 0;
+	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_WRITE);
 	if ((int) write(fd, buffer, nbytes) != nbytes)
 	{
 		int			save_errno = errno;
@@ -408,6 +420,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 				(errcode_for_file_access(),
 				 errmsg("could not write to file \"%s\": %m", tmppath)));
 	}
+	pgstat_report_wait_end();
 
 	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_SYNC);
 	if (pg_fsync(fd) != 0)
-- 
2.25.1

#9Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Masahiro Ikeda (#8)
1 attachment(s)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020/05/01 10:07, Masahiro Ikeda wrote:

On 2020-05-01 00:25, Fujii Masao wrote:

On 2020/04/28 17:42, Masahiro Ikeda wrote:

On 2020-04-28 15:09, Michael Paquier wrote:

On Tue, Apr 28, 2020 at 02:49:00PM +0900, Fujii Masao wrote:

Isn't it safer to report the wait event during fgets() rather than putting
those calls around the whole loop, like other code does? For example,
writeTimeLineHistory() reports the wait event during read() rather than
whole loop.

Yeah, I/O wait events should be taken only during the duration of the
system calls.  Particularly here, you may finish with an elog() that
causes the wait event to be set longer than it should, leading to a
rather incorrect state if a snapshot of pg_stat_activity is taken.
--

Thanks for your comments.

I fixed it to report the wait event during fgets() only.
Please review the v2 patch I attached.

Thanks for updating the patch! Here are the review comments from me.

+        char       *result;
+        pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+        result = fgets(fline, sizeof(fline), fd);
+        pgstat_report_wait_end();
+        if (result == NULL)
+            break;
+
         /* skip leading whitespace and check for # comment */
         char       *ptr;

Since the variable name "result" has been already used in this function,
it should be renamed.

Sorry for that.

I thought to rename it, but I changed to use feof()
for clarify the difference from ferror().

The code should not be inject into the variable declaration block.

Thanks for the comment.
I moved the code block after the variable declaration block.

When reading this patch, I found that IO-error in fgets() has not
been checked there. Though this is not the issue that you reported,
but it seems better to fix it together. So what about adding
the following code?

    if (ferror(fd))
        ereport(ERROR,
            (errcode_for_file_access(),
             errmsg("could not read file \"%s\": %m", path)));

Thanks, I agree your comment.
I added the above code to the v3 patch I attached.

Thanks for updating the patch! It looks good to me.

I applied cosmetic changes to the patch (attached). Barring any objection,
I will push this patch (also back-patch to v10 where wait-event for timeline
file was added).

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachments:

v4-0001-Fix-to-report-wait-events-about-timeline-history-.patchtext/plain; charset=UTF-8; name=v4-0001-Fix-to-report-wait-events-about-timeline-history-.patch; x-mac-creator=0; x-mac-type=0Download
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index de57d699af..aeaaf44a4a 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -78,7 +78,6 @@ readTimeLineHistory(TimeLineID targetTLI)
 	List	   *result;
 	char		path[MAXPGPATH];
 	char		histfname[MAXFNAMELEN];
-	char		fline[MAXPGPATH];
 	FILE	   *fd;
 	TimeLineHistoryEntry *entry;
 	TimeLineID	lasttli = 0;
@@ -123,15 +122,27 @@ readTimeLineHistory(TimeLineID targetTLI)
 	 * Parse the file...
 	 */
 	prevend = InvalidXLogRecPtr;
-	while (fgets(fline, sizeof(fline), fd) != NULL)
+	for (;;)
 	{
-		/* skip leading whitespace and check for # comment */
+		char		fline[MAXPGPATH];
+		char	   *res;
 		char	   *ptr;
 		TimeLineID	tli;
 		uint32		switchpoint_hi;
 		uint32		switchpoint_lo;
 		int			nfields;
 
+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		res = fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+		if (ferror(fd))
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not read file \"%s\": %m", path)));
+		if (res == NULL)
+			break;
+
+		/* skip leading whitespace and check for # comment */
 		for (ptr = fline; *ptr; ptr++)
 		{
 			if (!isspace((unsigned char) *ptr))
@@ -393,6 +404,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 
 	nbytes = strlen(buffer);
 	errno = 0;
+	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_WRITE);
 	if ((int) write(fd, buffer, nbytes) != nbytes)
 	{
 		int			save_errno = errno;
@@ -408,6 +420,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 				(errcode_for_file_access(),
 				 errmsg("could not write to file \"%s\": %m", tmppath)));
 	}
+	pgstat_report_wait_end();
 
 	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_SYNC);
 	if (pg_fsync(fd) != 0)
#10Michael Paquier
michael@paquier.xyz
In reply to: Fujii Masao (#9)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On Fri, May 01, 2020 at 12:04:56PM +0900, Fujii Masao wrote:

I applied cosmetic changes to the patch (attached). Barring any objection,
I will push this patch (also back-patch to v10 where wait-event for timeline
file was added).

Sorry for arriving late to the party. I have one tiny comment.

+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		res = fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+		if (ferror(fd))
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not read file \"%s\": %m", path)));
+		if (res == NULL)
+			break;

It seems to me that there is no point to check ferror() if fgets()
does not return NULL, no?
--
Michael

#11Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Michael Paquier (#10)
1 attachment(s)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020/05/02 11:24, Michael Paquier wrote:

On Fri, May 01, 2020 at 12:04:56PM +0900, Fujii Masao wrote:

I applied cosmetic changes to the patch (attached). Barring any objection,
I will push this patch (also back-patch to v10 where wait-event for timeline
file was added).

Sorry for arriving late to the party. I have one tiny comment.

+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		res = fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+		if (ferror(fd))
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					 errmsg("could not read file \"%s\": %m", path)));
+		if (res == NULL)
+			break;

It seems to me that there is no point to check ferror() if fgets()
does not return NULL, no?

Yeah, so I updated the patch so that ferror() is called only
when fgets() returns NULL. Attached is the updated version of
the patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachments:

v5-0001-Fix-to-report-wait-events-about-timeline-history-.patchtext/plain; charset=UTF-8; name=v5-0001-Fix-to-report-wait-events-about-timeline-history-.patch; x-mac-creator=0; x-mac-type=0Download
diff --git a/src/backend/access/transam/timeline.c b/src/backend/access/transam/timeline.c
index de57d699af..e6a29d9a9b 100644
--- a/src/backend/access/transam/timeline.c
+++ b/src/backend/access/transam/timeline.c
@@ -78,7 +78,6 @@ readTimeLineHistory(TimeLineID targetTLI)
 	List	   *result;
 	char		path[MAXPGPATH];
 	char		histfname[MAXFNAMELEN];
-	char		fline[MAXPGPATH];
 	FILE	   *fd;
 	TimeLineHistoryEntry *entry;
 	TimeLineID	lasttli = 0;
@@ -123,15 +122,30 @@ readTimeLineHistory(TimeLineID targetTLI)
 	 * Parse the file...
 	 */
 	prevend = InvalidXLogRecPtr;
-	while (fgets(fline, sizeof(fline), fd) != NULL)
+	for (;;)
 	{
-		/* skip leading whitespace and check for # comment */
+		char		fline[MAXPGPATH];
+		char	   *res;
 		char	   *ptr;
 		TimeLineID	tli;
 		uint32		switchpoint_hi;
 		uint32		switchpoint_lo;
 		int			nfields;
 
+		pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_READ);
+		res = fgets(fline, sizeof(fline), fd);
+		pgstat_report_wait_end();
+		if (res == NULL)
+		{
+			if (ferror(fd))
+				ereport(ERROR,
+						(errcode_for_file_access(),
+						 errmsg("could not read file \"%s\": %m", path)));
+
+			break;
+		}
+
+		/* skip leading whitespace and check for # comment */
 		for (ptr = fline; *ptr; ptr++)
 		{
 			if (!isspace((unsigned char) *ptr))
@@ -393,6 +407,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 
 	nbytes = strlen(buffer);
 	errno = 0;
+	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_WRITE);
 	if ((int) write(fd, buffer, nbytes) != nbytes)
 	{
 		int			save_errno = errno;
@@ -408,6 +423,7 @@ writeTimeLineHistory(TimeLineID newTLI, TimeLineID parentTLI,
 				(errcode_for_file_access(),
 				 errmsg("could not write to file \"%s\": %m", tmppath)));
 	}
+	pgstat_report_wait_end();
 
 	pgstat_report_wait_start(WAIT_EVENT_TIMELINE_HISTORY_SYNC);
 	if (pg_fsync(fd) != 0)
#12Michael Paquier
michael@paquier.xyz
In reply to: Fujii Masao (#11)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On Thu, May 07, 2020 at 04:51:16PM +0900, Fujii Masao wrote:

Yeah, so I updated the patch so that ferror() is called only
when fgets() returns NULL. Attached is the updated version of
the patch.

Thanks for the new patch. LGTM.
--
Michael

#13Fujii Masao
masao.fujii@oss.nttdata.com
In reply to: Michael Paquier (#12)
Re: Why are wait events not reported even though it reads/writes a timeline history file?

On 2020/05/07 21:24, Michael Paquier wrote:

On Thu, May 07, 2020 at 04:51:16PM +0900, Fujii Masao wrote:

Yeah, so I updated the patch so that ferror() is called only
when fgets() returns NULL. Attached is the updated version of
the patch.

Thanks for the new patch. LGTM.

Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION