Doesn't pgstat_report_wal() handle the argument "force" incorrectly
Hi,
pgstat_report_wal() calls pgstat_flush_wal() and pgstat_flush_io(). When
calling them, pgstat_report_wal() specifies its argument "force" as the
argument of them, as follows. But according to the code of
pgstat_flush_wal() and pgstat_flush_io(), their argument is "nowait" and
its meaning seems the opposite of "force". This means that, even when
checkpointer etc calls pgstat_report_wal() with force=true to forcibly
flush the statistics, pgstat_flush_wal() and pgstat_flush_io() skip
flushing the statistics if they fail to acquire the lock immediately
because they are called with nowait=true. This seems unexpected behavior
and a bug.
void
pgstat_report_wal(bool force)
{
pgstat_flush_wal(force);
pgstat_flush_io(force);
}
BTW, pgstat_report_stat() treats "nowait" and "force" as the opposite
one, as follows.
/* don't wait for lock acquisition when !force */
nowait = !force;
Ryoga Yoshida
On Fri, Sep 22, 2023 at 01:58:37PM +0900, Ryoga Yoshida wrote:
pgstat_report_wal() calls pgstat_flush_wal() and pgstat_flush_io(). When
calling them, pgstat_report_wal() specifies its argument "force" as the
argument of them, as follows. But according to the code of
pgstat_flush_wal() and pgstat_flush_io(), their argument is "nowait" and its
meaning seems the opposite of "force". This means that, even when
checkpointer etc calls pgstat_report_wal() with force=true to forcibly flush
the statistics, pgstat_flush_wal() and pgstat_flush_io() skip flushing the
statistics if they fail to acquire the lock immediately because they are
called with nowait=true. This seems unexpected behavior and a bug.
It seems to me that you are right here. It would make sense to me to
say that force=true is equivalent to nowait=false, as in "I'm OK to
wait on the lockas I want to make sure that the stats are flushed at
this point". Currently force=true means nowait=true, as in "I'm OK to
not have the stats flushed if I cannot take the lock".
Seeing the three callers of pgstat_report_wal(), the checkpointer
wants to force its way twice, and the WAL writer does not care if they
are not flushed immediately at it loops forever in this path.
A comment at the top of pgstat_report_wal() would be nice to document
that a bit better, at least.
--
Michael
On 2023-09-25 09:56, Michael Paquier wrote:
It seems to me that you are right here. It would make sense to me to
say that force=true is equivalent to nowait=false, as in "I'm OK to
wait on the lockas I want to make sure that the stats are flushed at
this point". Currently force=true means nowait=true, as in "I'm OK to
not have the stats flushed if I cannot take the lock".Seeing the three callers of pgstat_report_wal(), the checkpointer
wants to force its way twice, and the WAL writer does not care if they
are not flushed immediately at it loops forever in this path.A comment at the top of pgstat_report_wal() would be nice to document
that a bit better, at least.
Thank you for the review. Certainly, adding a comments is a good idea. I
added a comment.
Ryoga Yoshida
Attachments:
v1-0002-bug-fix-in-foce-within-pgstat_report_wal.patchtext/x-diff; name=v1-0002-bug-fix-in-foce-within-pgstat_report_wal.patchDownload+7-2
On Mon, Sep 25, 2023 at 11:27:27AM +0900, Ryoga Yoshida wrote:
Thank you for the review. Certainly, adding a comments is a good idea. I
added a comment.
Hmm. How about the attached version with some tweaks?
--
Michael
Attachments:
v2-0002-bug-fix-in-foce-within-pgstat_report_wal.patchtext/x-diff; charset=us-asciiDownload+8-3
On 2023-09-25 12:47, Michael Paquier wrote:
in attached file
+ /* like in pgstat.c, don't wait for lock acquisition when !force */
Isn't it the case with force=true and !force that it doesn't wait for
the lock acquisition. In fact, force may be false.
Ryoga Yoshida
On Mon, Sep 25, 2023 at 02:16:22PM +0900, Ryoga Yoshida wrote:
On 2023-09-25 12:47, Michael Paquier wrote:
in attached file+ /* like in pgstat.c, don't wait for lock acquisition when !force */
Isn't it the case with force=true and !force that it doesn't wait for the
lock acquisition. In fact, force may be false.
We would not wait on the lock if force=false, which would do
nowait=true. And !force reads the same to me as force=false.
Anyway, I am OK to remove this part. That seems to confuse you, so
you may not be the only one who would read this comment.
Another idea would be to do like in pgstat.c by adding the following
line, then use "nowait" to call each sub-function:
nowait = !force;
pgstat_flush_wal(nowait);
pgstat_flush_io(nowait);
--
Michael
On 2023-09-25 14:38, Michael Paquier wrote:
We would not wait on the lock if force=false, which would do
nowait=true. And !force reads the same to me as force=false.Anyway, I am OK to remove this part. That seems to confuse you, so
you may not be the only one who would read this comment.
When I first read it, I didn't read that !force as force=false, so
removing it might be better.
Another idea would be to do like in pgstat.c by adding the following
line, then use "nowait" to call each sub-function:
nowait = !force;
pgstat_flush_wal(nowait);
pgstat_flush_io(nowait);
That's very clear and I think it's good.
Ryoga Yoshida
On Mon, Sep 25, 2023 at 02:49:50PM +0900, Ryoga Yoshida wrote:
On 2023-09-25 14:38, Michael Paquier wrote:
Another idea would be to do like in pgstat.c by adding the following
line, then use "nowait" to call each sub-function:
nowait = !force;
pgstat_flush_wal(nowait);
pgstat_flush_io(nowait);That's very clear and I think it's good.
Done this way down to 15, then, with more comment polishing.
--
Michael