pgbench logging broken by time logic changes
Back on March 10 Thomas Munro committed and wrestled multiple reworks of
the pgbench code from Fabien and the crew. The feature to synchronize
startup I'm looking forward to testing now that I have a packaged beta.
Variations on that problem have bit me so many times I added code last year
to my pgbench processing pipeline to just throw out the first and last 10%
of every data set.
Before I could get to startup timing I noticed the pgbench logging output
was broken via commit 547f04e7 "Improve time logic":
/messages/by-id/E1lJqpF-00064e-C6@gemulon.postgresql.org
A lot of things are timed in pgbench now so I appreciate the idea. Y'all
started that whole big thread about sync on my birthday though and I didn't
follow the details of what that was reviewed against. For the logging use
case I suspect it's just broken everywhere. The two platforms I tested
were PGDG Ubuntu beta1 apt install and Mac git build. Example:
$ createdb pgbench
$ pgbench -i -s 1 pgbench
$ pgbench -S -T 1 -l pgbench
$ head pgbench_log.*
0 1 1730 0 1537380 70911
0 2 541 0 1537380 71474
The epoch time is the 5th column in the output, and this week it should
look like this:
0 1 1411 0 1623767029 732926
0 2 711 0 1623767029 733660
If you're not an epoch guru who recognizes what's wrong already, you might
grab https://github.com/gregs1104/pgbench-tools/ and party like it's 1970
to see it:
$ ~/repos/pgbench-tools/log-to-csv 1 local < pgbench_log* | head
1970-01-18 14:03:00.070911,0,1.73,1,local
1970-01-18 14:03:00.071474,0,0.541,1,local
I have a lot of community oriented work backed up behind this right now, so
I'm gonna be really honest. This time rework commit in its current form
makes me uncomfortable at this point in the release schedule. The commit
has already fought through two rounds of platform specific bug fixes. But
since the buildfarm doesn't test the logging feature, that whole process is
suspect.
My take on the PostgreSQL way to proceed: this bug exposes that pgbench
logging is a feature we finally need to design testing for. We need a new
buildfarm test and then a march through a full release phase to see how it
goes. Only then should we start messing with the time logic. Even if we
fixed the source today on both my test platforms, I'd still be nervous that
beta 2 could ship and more performance testing could fall over from this
modification. And that's cutting things a little close.
The fastest way to get me back to comfortable would be to unwind 547f04e7
and its associated fixes and take it back to review. I understand the
intent and value; I appreciate the work so far. The big industry
architecture shift from Intel to ARM has me worried about time overhead
again, the old code is wonky, and in the PG15 release cycle I already have
resources planned around this area.
# PG15 Plans
I didn't intend to roll back in after time away and go right to a revert
review. But I also really don't want to start my public PG14 story
documenting the reality that I had to use PG13's pgbench to generate my
examples either. I can't fight much with this logging problem while also
doing my planned public performance testing of PG14. I already had to push
back a solid bit of Beta 1 PR from this week, some "community PG is great!"
promotional blogging.
Let me offer what I can commit to from Crunchy corporate. I'm about to
submit multiple pgbench feature changes to the open CF starting July, with
David Christiansen. We and the rest of Crunchy will happily help re-review
this time change idea, its logging issues, testing, rejoin the study of
platform time call overhead, and bash the whole mess into shape for PG15.
I personally am looking forward to it.
The commit made a functional change to the way connection time is
displayed; that I can take or leave as committed. I'm not sure it can be
decoupled from the rest of the changes. It did cause a small breaking
pgbench output parsing problem for me, just trivial regex adjustment. That
break would fit in fine with my upcoming round of submissions.
--
Greg Smith greg.smith@crunchydata.com
Director of Open Source Strategy, Crunchy Data
On Wed, 16 Jun 2021 10:49:36 -0400
Gregory Smith <gregsmithpgsql@gmail.com> wrote:
A lot of things are timed in pgbench now so I appreciate the idea. Y'all
started that whole big thread about sync on my birthday though and I didn't
follow the details of what that was reviewed against. For the logging use
case I suspect it's just broken everywhere. The two platforms I tested
were PGDG Ubuntu beta1 apt install and Mac git build. Example:$ createdb pgbench
$ pgbench -i -s 1 pgbench
$ pgbench -S -T 1 -l pgbench
$ head pgbench_log.*
0 1 1730 0 1537380 70911
0 2 541 0 1537380 71474The epoch time is the 5th column in the output, and this week it should
look like this:0 1 1411 0 1623767029 732926
0 2 711 0 1623767029 733660If you're not an epoch guru who recognizes what's wrong already, you might
grab https://github.com/gregs1104/pgbench-tools/ and party like it's 1970
to see it:$ ~/repos/pgbench-tools/log-to-csv 1 local < pgbench_log* | head
1970-01-18 14:03:00.070911,0,1.73,1,local
1970-01-18 14:03:00.071474,0,0.541,1,local
After the commit, pgbench tries to get the current timestamp by calling
pg_time_now(). This uses INSTR_TIME_SET_CURRENT in it, but this macro
can call clock_gettime(CLOCK_MONOTONIC[_RAW], ) instead of gettimeofday
or clock_gettime(CLOCK_REALTIME, ). When CLOCK_MONOTONIC[_RAW] is used,
clock_gettime doesn't return epoch time. Therefore, we can use
INSTR_TIME_SET_CURRENT aiming to calculate a duration, but we should
not have used this to get the current timestamp.
I think we can fix this issue by using gettimeofday for logging as before
this was changed. I attached the patch.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
pgbench-log-timestamp.patchtext/x-diff; name=pgbench-log-timestamp.patchDownload+19-4
Hello Greg,
I have a lot of community oriented work backed up behind this right now, so
I'm gonna be really honest. This time rework commit in its current form
makes me uncomfortable at this point in the release schedule. The commit
has already fought through two rounds of platform specific bug fixes. But
since the buildfarm doesn't test the logging feature, that whole process is
suspect.
Logging is/should going to be fixed.
My take on the PostgreSQL way to proceed: this bug exposes that pgbench
logging is a feature we finally need to design testing for.
Sure.
The key feedback for me is the usual one: what is not tested does
not work. Wow:-)
I'm unhappy because I already added tap tests for time-sensitive features
(-T and others, maybe logging aggregates, cannot remember), which have
been removed because they could fail under some circonstances (eg very
very very very slow hosts), or required some special handling (a few lines
of code) in pgbench, and the net result of this is there is not a single
test in place for some features:-(
There is no problem with proposing tests, the problem is that they are
accepted, or if they are accepted then that they are not removed at the
first small issue but rather fixed, or their limitations accepted, because
testing time-sensitive features is not as simple as testing functional
features.
Note that currently there is not a single test of psql with autocommit off
or with "on error rollback". Last time a submitted tap tests to raise psql
test coverage from 50% to 90%, it was rejected. I'll admit that I'm tired
arguing that more testing is required, and I'm very happy if someone else
is ready to try again. Good luck! :-)
We need a new buildfarm test and then a march through a full release
phase to see how it goes.
Only then should we start messing with the time logic. Even if we fixed
the source today on both my test platforms, I'd still be nervous that
beta 2 could ship and more performance testing could fall over from this
modification. And that's cutting things a little close.
Well, the point beta is to discover bugs not caught by reviews and dev
tests, fix them, and possibly add tests which would have caught them.
If you revert all features on the first issue in a corner case and put it
back to the queue, then I do not see why the review and dev tests will be
much better on the next round, so it does not really help moving things
forward.
IMHO, the pragmatic approach is to look at fixing first, and maybe revert
if the problems are deep. I'm not sure this is obviously the case here.
--
Fabien.
pg_time_now(). This uses INSTR_TIME_SET_CURRENT in it, but this macro
can call clock_gettime(CLOCK_MONOTONIC[_RAW], ) instead of gettimeofday
or clock_gettime(CLOCK_REALTIME, ). When CLOCK_MONOTONIC[_RAW] is used,
clock_gettime doesn't return epoch time. Therefore, we can use
INSTR_TIME_SET_CURRENT aiming to calculate a duration, but we should
not have used this to get the current timestamp.I think we can fix this issue by using gettimeofday for logging as before
this was changed. I attached the patch.
I cannot say that I'm thrilled by having multiple tv stuff back in several
place. I can be okay with one, though. What about the attached? Does it
make sense?
--
Fabien.
Attachments:
pgbench-epoch-1.patchtext/x-diff; name=pgbench-epoch-1.patchDownload+3-3
On 6/16/21 2:59 PM, Fabien COELHO wrote:
Hello Greg,
I have a lot of community oriented work backed up behind this right
now, so
I'm gonna be really honest. This time rework commit in its current form
makes me uncomfortable at this point in the release schedule. The
commit
has already fought through two rounds of platform specific bug
fixes. But
since the buildfarm doesn't test the logging feature, that whole
process is
suspect.Logging is/should going to be fixed.
My take on the PostgreSQL way to proceed: this bug exposes that pgbench
logging is a feature we finally need to design testing for.Sure.
The key feedback for me is the usual one: what is not tested does not
work. Wow:-)
Agreed.
I'm unhappy because I already added tap tests for time-sensitive
features (-T and others, maybe logging aggregates, cannot remember),
which have been removed because they could fail under some
circonstances (eg very very very very slow hosts), or required some
special handling (a few lines of code) in pgbench, and the net result
of this is there is not a single test in place for some features:-(
I'm not familiar with exactly what happened in this case, but tests need
to be resilient over a wide range of performance characteristics. One
way around this issue might be to have a way of detecting that it's on a
slow platform and if so either skipping tests (Test::More provides
plenty of support for this) or expecting different results.
There is no problem with proposing tests, the problem is that they are
accepted, or if they are accepted then that they are not removed at
the first small issue but rather fixed, or their limitations accepted,
because testing time-sensitive features is not as simple as testing
functional features.Note that currently there is not a single test of psql with autocommit
off or with "on error rollback". Last time a submitted tap tests to
raise psql test coverage from 50% to 90%, it was rejected. I'll admit
that I'm tired arguing that more testing is required, and I'm very
happy if someone else is ready to try again. Good luck! :-)
:-(
We need a new buildfarm test and then a march through a full release
phase to see how it goes.Only then should we start messing with the time logic. Even if we
fixed the source today on both my test platforms, I'd still be
nervous that beta 2 could ship and more performance testing could
fall over from this modification. And that's cutting things a little
close.Well, the point beta is to discover bugs not caught by reviews and dev
tests, fix them, and possibly add tests which would have caught them.If you revert all features on the first issue in a corner case and put
it back to the queue, then I do not see why the review and dev tests
will be much better on the next round, so it does not really help
moving things forward.IMHO, the pragmatic approach is to look at fixing first, and maybe
revert if the problems are deep. I'm not sure this is obviously the
case here.
It does look like the submitted fix basically reverts the changes w.r.t.
this timestamp logging.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi Greg,
On Thu, Jun 17, 2021 at 2:49 AM Gregory Smith <gregsmithpgsql@gmail.com> wrote:
Back on March 10 Thomas Munro committed and wrestled multiple reworks of the pgbench code from Fabien and the crew. The feature to synchronize startup I'm looking forward to testing now that I have a packaged beta. Variations on that problem have bit me so many times I added code last year to my pgbench processing pipeline to just throw out the first and last 10% of every data set.
Yeah, commit aeb57af8 is a nice improvement and was the main thing I
wanted to get into the tree for 14 in this area, because it was
measuring the wrong thing.
Before I could get to startup timing I noticed the pgbench logging output was broken via commit 547f04e7 "Improve time logic": /messages/by-id/E1lJqpF-00064e-C6@gemulon.postgresql.org
It does suck that we broke the logging and that it took 3 months for
anyone to notice and report it to the list. Seems like it should be
straightforward to fix, though, with fixes already proposed (though I
haven't studied them yet, will do).
I have a lot of community oriented work backed up behind this right now, so I'm gonna be really honest. This time rework commit in its current form makes me uncomfortable at this point in the release schedule. The commit has already fought through two rounds of platform specific bug fixes. But since the buildfarm doesn't test the logging feature, that whole process is suspect.
It's true that this work produced a few rounds of small portability
follow-ups: c427de42 (work around strange hacks elsewhere in the tree
for AIX), 68b34b23 (missing calling convention specifier on Windows),
and de91c3b9 (adjust pthread missing-function code for threadless
builds). These were problems that didn't show up on developer or CI
systems (including threadless and Windows), and IMHO are typical sorts
of problems you expect to have to work through when stuff hits the
build farm, especially when using new system interfaces. So I don't
think any of that, on its own, supports reverting anything here.
My take on the PostgreSQL way to proceed: this bug exposes that pgbench logging is a feature we finally need to design testing for. We need a new buildfarm test and then a march through a full release phase to see how it goes. Only then should we start messing with the time logic. Even if we fixed the source today on both my test platforms, I'd still be nervous that beta 2 could ship and more performance testing could fall over from this modification. And that's cutting things a little close.
The fastest way to get me back to comfortable would be to unwind 547f04e7 and its associated fixes and take it back to review. I understand the intent and value; I appreciate the work so far. The big industry architecture shift from Intel to ARM has me worried about time overhead again, the old code is wonky, and in the PG15 release cycle I already have resources planned around this area.
Let me study the proposed fixes on this and the other thread about
pgbench logging for a bit.
Glad to hear that you're working on this area. I guess you might be
researching stuff along the same sorts of lines as in the thread
"Reduce timing overhead of EXPLAIN ANALYZE using rdtsc?" (though
that's about the executor). As I already expressed in that thread, if
the backend's instrumentation code is improved as proposed there,
we'll probably want to rip some of these pgbench changes out anyway
and go back to common instrumentation code.
For that reason, I'm not super attached to that new pg_time_usec_t
stuff at all, and wouldn't be sad if we reverted that piece. I am
moderately attached to the sync changes, though. pgbench 13 is
objectively producing incorrect results in that respect.
On Thu, Jun 17, 2021 at 12:36:10PM +1200, Thomas Munro wrote:
For that reason, I'm not super attached to that new pg_time_usec_t
stuff at all, and wouldn't be sad if we reverted that piece. I am
moderately attached to the sync changes, though. pgbench 13 is
objectively producing incorrect results in that respect.
There is another item in this area where pgbench uses incorrect maths
when aggregating the stats of transactions mid-run and at the end of a
thread, issue caused by 547f04e as this code path forgot to handle the
s <-> us conversion:
/messages/by-id/CAF7igB1r6wRfSCEAB-iZBKxkowWY6+dFF2jObSdd9+iVK+vHJg@mail.gmail.com
Wouldn't it be better to put all those fixes into the same bag? If
you drop the business with pg_time_usec_t, it looks like we don't
really need to do anything there.
--
Michael
On Wed, 16 Jun 2021 21:11:41 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:
pg_time_now(). This uses INSTR_TIME_SET_CURRENT in it, but this macro
can call clock_gettime(CLOCK_MONOTONIC[_RAW], ) instead of gettimeofday
or clock_gettime(CLOCK_REALTIME, ). When CLOCK_MONOTONIC[_RAW] is used,
clock_gettime doesn't return epoch time. Therefore, we can use
INSTR_TIME_SET_CURRENT aiming to calculate a duration, but we should
not have used this to get the current timestamp.I think we can fix this issue by using gettimeofday for logging as before
this was changed. I attached the patch.I cannot say that I'm thrilled by having multiple tv stuff back in several
place. I can be okay with one, though. What about the attached? Does it
make sense?
At first, I also thought of fixing pg_time_now() to use gettimeofday() instead
of INSTR_TIME_SET_CURRENT, but I noticed that using INSTR_TIME_SET_CURRENT is
proper to measure time interval. I mean, this macro uses
lock_gettime(CLOCK_MONOTONIC, ) if avilable, which give reliable interval
timing even in the face of changes to the system clock due to NTP.
The commit 547f04e7 changed all of INSTR_TIME_SET_CURRENT, gettimeofday(), and
time() to pg_now_time() which calls INSTR_TIME_SET_CURRENT in it. So, my patch
intented to revert these changes only about gettimeofday() and time(), and remain
changes about INSTR_TIME_SET_CURRENT.
I attached the updated patch because I forgot to revert pg_now_time() to time(NULL).
Another idea to fix is adding 'use_epoch' flag to pg_now_time() like below,
pg_time_now(bool use_epoch)
{
if (use_epoch)
{
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec * 1000000 + tv.tv_usec;
}
else
{
instr_time now;
INSTR_TIME_SET_CURRENT(now);
return (pg_time_usec_t) INSTR_TIME_GET_MICROSEC(now);
}
}
or making an additional function that returns epoch time.
By the way, there is another advantage of using clock_gettime(). That is, this
returns precise results in nanoseconds. However, we can not benefit from it because
pg_time_now() converts the value to uint64 by using INSTR_TIME_GET_MICROSEC. So,
if we would like more precious statistics in pgbench, it might be better to use
INSTR_TIME_GET_MICROSEC instead of pg_time_now in other places.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
pgbench-log-timestamp-2.patchtext/x-diff; name=pgbench-log-timestamp-2.patchDownload+23-8
At Thu, 17 Jun 2021 12:23:42 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in
On Wed, 16 Jun 2021 21:11:41 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:I cannot say that I'm thrilled by having multiple tv stuff back in several
place. I can be okay with one, though. What about the attached? Does it
make sense?
+1 The patch rounds down sd->start_time from ms to s but it seems to
me a degradation.
At first, I also thought of fixing pg_time_now() to use gettimeofday() instead
of INSTR_TIME_SET_CURRENT, but I noticed that using INSTR_TIME_SET_CURRENT is
proper to measure time interval. I mean, this macro uses
lock_gettime(CLOCK_MONOTONIC, ) if avilable, which give reliable interval
timing even in the face of changes to the system clock due to NTP.
If I understand the op correctly, the problem here is the time values
in pgbench log file are based on a bogus epoch. If it's the only issue
here and and if we just want to show the time based on the unix epoch
time, just recording the difference would work as I scketched in the
attached. (Precisely theepoch would move if we set the system clock
but I don't think that matters:p)
By the way, there is another advantage of using clock_gettime(). That is, this
returns precise results in nanoseconds. However, we can not benefit from it because
pg_time_now() converts the value to uint64 by using INSTR_TIME_GET_MICROSEC. So,
if we would like more precious statistics in pgbench, it might be better to use
INSTR_TIME_GET_MICROSEC instead of pg_time_now in other places.
I'm not sure we have transaction lasts for very short time that
nanoseconds matters.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
show_uepoch_time_in_logs.patchtext/x-patch; charset=us-asciiDownload+10-3
On Thu, 17 Jun 2021 14:17:56 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Thu, 17 Jun 2021 12:23:42 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in
On Wed, 16 Jun 2021 21:11:41 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:I cannot say that I'm thrilled by having multiple tv stuff back in several
place. I can be okay with one, though. What about the attached? Does it
make sense?+1 The patch rounds down sd->start_time from ms to s but it seems to
me a degradation.
I don't think that's matter because sd->start_time is used only for
log aggregation and aggregate-interval is specified in seconds, though.
At first, I also thought of fixing pg_time_now() to use gettimeofday() instead
of INSTR_TIME_SET_CURRENT, but I noticed that using INSTR_TIME_SET_CURRENT is
proper to measure time interval. I mean, this macro uses
lock_gettime(CLOCK_MONOTONIC, ) if avilable, which give reliable interval
timing even in the face of changes to the system clock due to NTP.If I understand the op correctly, the problem here is the time values
in pgbench log file are based on a bogus epoch. If it's the only issue
here and and if we just want to show the time based on the unix epoch
time, just recording the difference would work as I scketched in the
attached. (Precisely theepoch would move if we set the system clock
but I don't think that matters:p)
That makes sense. If the system clock is shifted due to NTP (for example)
it would not affect the measurement although timestamps in logs could be shifted
because gettimeofday is called only once.
If we fix it in this way, we should fix also printProgressReport().
if (progress_timestamp)
{
- snprintf(tbuf, sizeof(tbuf), "%.3f s", PG_TIME_GET_DOUBLE(now));
+ snprintf(tbuf, sizeof(tbuf), "%.3f s", PG_TIME_GET_DOUBLE(now * epoch_shift));
}
By the way, there is another advantage of using clock_gettime(). That is, this
returns precise results in nanoseconds. However, we can not benefit from it because
pg_time_now() converts the value to uint64 by using INSTR_TIME_GET_MICROSEC. So,
if we would like more precious statistics in pgbench, it might be better to use
INSTR_TIME_GET_MICROSEC instead of pg_time_now in other places.I'm not sure we have transaction lasts for very short time that
nanoseconds matters.
I thought it might affect the accuracy when statistics are accumulated
through a huge numbers of transactions, but I am fine with it if no one
cares it.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
Hello Yugo-san,
I think we can fix this issue by using gettimeofday for logging as before
this was changed. I attached the patch.I cannot say that I'm thrilled by having multiple tv stuff back in several
place. I can be okay with one, though. What about the attached? Does it
make sense?At first, I also thought of fixing pg_time_now() to use gettimeofday() instead
of INSTR_TIME_SET_CURRENT, but I noticed that using INSTR_TIME_SET_CURRENT is
proper to measure time interval. I mean, this macro uses
lock_gettime(CLOCK_MONOTONIC, ) if avilable, which give reliable interval
timing even in the face of changes to the system clock due to NTP.
Ok, I was thinking that it was possible that there was this kind of
implication. Now, does it matter that much if a few transactions are
skewed by NTP from time to time? Having to deal with different time
functions in the same file seems painful.
The commit 547f04e7 changed all of INSTR_TIME_SET_CURRENT, gettimeofday(), and
time() to pg_now_time() which calls INSTR_TIME_SET_CURRENT in it. So, my patch
intented to revert these changes only about gettimeofday() and time(), and remain
changes about INSTR_TIME_SET_CURRENT.
Hmmm.
pg_time_now(bool use_epoch)
{
if (use_epoch)
{
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec * 1000000 + tv.tv_usec;
}
else
{
instr_time now;
INSTR_TIME_SET_CURRENT(now);
return (pg_time_usec_t) INSTR_TIME_GET_MICROSEC(now);
}
}or making an additional function that returns epoch time.
Yes, but when to call which version? How to avoid confusion? After giving
it some thoughts, ISTM that the best short-term decision is just to have
epoch everywhere, i.e. having now() to rely on gettimeofday, because:
- at least one user is unhappy with not having epoch in the log file,
and indeed it makes sense to be unhappy about that if they want to
correlated logs. So I agree to undo that, or provide an option to undo
it.
- having different times with different origins at different point in
the code makes it really hard to understand and maintain, and if we
trade maintainability for precision it should really be worth it, and
I'm not sure that working around NTP adjustment is worth it right now.
In the not so short term, I'd say that the best approach would be use
relative time internally and just to offset these with a global epoch
start time when displaying a timestamp.
By the way, there is another advantage of using clock_gettime(). That is, this
returns precise results in nanoseconds. However, we can not benefit from it because
pg_time_now() converts the value to uint64 by using INSTR_TIME_GET_MICROSEC. So,
if we would like more precious statistics in pgbench, it might be better to use
INSTR_TIME_GET_MICROSEC instead of pg_time_now in other places.
The INSTR_TIME macros are pretty ugly and inefficient, especially when
time arithmetic is involved because they re-implements 64 bits operations
based on 32 bit ints. I really wanted to get rid of that as much as
possible. From a database benchmarking perspective ISTM that ᅵs is the
right smallest unit, given that a transaction implies significant delays
such as networks communications, parsing, and so on. So I do not think we
should ever need nanos.
In conclusion, ISTM that it is enough to simply change now to call
gettimeofday to fix the issue raised by Greg. This is patch v1 on the
thread.
--
Fabien.
Hello,
I cannot say that I'm thrilled by having multiple tv stuff back in several
place. I can be okay with one, though. What about the attached? Does it
make sense?+1 The patch rounds down sd->start_time from ms to s but it seems to
me a degradation.
Yes, please we should not use time.
At first, I also thought of fixing pg_time_now() to use gettimeofday() instead
of INSTR_TIME_SET_CURRENT, but I noticed that using INSTR_TIME_SET_CURRENT is
proper to measure time interval. I mean, this macro uses
lock_gettime(CLOCK_MONOTONIC, ) if avilable, which give reliable interval
timing even in the face of changes to the system clock due to NTP.If I understand the op correctly, the problem here is the time values
in pgbench log file are based on a bogus epoch.
It is not "bogus", but is not necessary epoch depending on the underlying
function called behind by INSTR_TIME macros, and people are entitled to
expect epoch for log correlations.
If it's the only issue
here and and if we just want to show the time based on the unix epoch
time, just recording the difference would work as I scketched in the
attached. (Precisely theepoch would move if we set the system clock
but I don't think that matters:p)
I do like the approach.
I'm hesitant to promote it for fixing the beta, but the code impact is
small enough, so I'd say yes. Maybe there is a similar issue with progress
which should probably use the same approach. I think that aligning the
implementations can wait for pg15.
The patch as white space issues. Attached an updated version which fixes
that, adds comments and simplify the code a little bit.
I'm not sure we have transaction lasts for very short time that
nanoseconds matters.
Indeed.
--
Fabien.
Attachments:
pgbench-epoch-4.patchtext/x-diff; name=pgbench-epoch-4.patchDownload+15-1
Hello Thomas,
Before I could get to startup timing I noticed the pgbench logging
output was broken via commit 547f04e7 "Improve time logic":
/messages/by-id/E1lJqpF-00064e-C6@gemulon.postgresql.orgIt does suck that we broke the logging and that it took 3 months for
anyone to notice and report it to the list.
Indeed. Well, it also demonstrates that beta are useful.
Seems like it should be straightforward to fix, though, with fixes
already proposed (though I haven't studied them yet, will do).
I think that fixing logging is simple enough, thus a revert is not
necessary.
I have a lot of community oriented work backed up behind this right
now, so I'm gonna be really honest. This time rework commit in its
current form makes me uncomfortable at this point in the release
schedule. The commit has already fought through two rounds of platform
specific bug fixes. But since the buildfarm doesn't test the logging
feature, that whole process is suspect.It's true that this work produced a few rounds of small portability
follow-ups: c427de42 (work around strange hacks elsewhere in the tree
for AIX), 68b34b23 (missing calling convention specifier on Windows),
and de91c3b9 (adjust pthread missing-function code for threadless
builds). These were problems that didn't show up on developer or CI
systems (including threadless and Windows), and IMHO are typical sorts
of problems you expect to have to work through when stuff hits the
build farm, especially when using new system interfaces. So I don't
think any of that, on its own, supports reverting anything here.
Yep, the buildfarm is here to catch portability issues, and it does its
job:-) There is no doubt that logging is has been broken because of lack
of tests in this area, shame on us. I think it is easy to fix.
[...] For that reason, I'm not super attached to that new pg_time_usec_t
stuff at all, and wouldn't be sad if we reverted that piece.
Well, I was sooo happy to get rid of INSTR_TIME ugly and inefficient
macros in pgbench… so anything looks better to me.
Note that Michaël is having a look at fixing pgbench logging issues.
--
Fabien.
Wouldn't it be better to put all those fixes into the same bag?
Attached.
--
Fabien.
Attachments:
pgbench-log-fix-6.patchtext/x-diff; name=pgbench-log-fix-6.patchDownload
Wouldn't it be better to put all those fixes into the same bag?
Attached.
Even better if the patch is not empty.
--
Fabien.
Attachments:
pgbench-log-fix-6.patchtext/x-diff; name=pgbench-log-fix-6.patchDownload+64-28
On Thu, 17 Jun 2021 10:18:03 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Wouldn't it be better to put all those fixes into the same bag?
Attached.
Even better if the patch is not empty.
I found you forgot to fix printProgressReport().
Also, according to the document, interval_start in Aggregated Logging
seems to be printed in seconds instead of ms.
<para>
Here is some example output:
<screen>
1345828501 5601 1542744 483552416 61 2573
1345828503 7884 1979812 565806736 60 1479
1345828505 7208 1979422 567277552 59 1391
1345828507 7685 1980268 569784714 60 1398
1345828509 7073 1979779 573489941 236 1411
</screen></para>
If we obey the document and keep the back-compatibility, we should fix
logAgg().
The attached patch includes these fixes.
Regards,
Yugo Nagata
--
Yugo NAGATA <nagata@sraoss.co.jp>
Attachments:
pgbench-log-fix-7.patchtext/x-diff; name=pgbench-log-fix-7.patchDownload+65-29
I found you forgot to fix printProgressReport().
Indeed.
Also, according to the document, interval_start in Aggregated Logging
seems to be printed in seconds instead of ms.
Indeed. I'm unsure about what we should really want there, but for a beta
bug fix I agree that it must simply comply to the old documented behavior.
The attached patch includes these fixes.
Thanks. Works for me.
--
Fabien.
On Thu, Jun 17, 2021 at 7:24 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Seems like it should be straightforward to fix, though, with fixes
already proposed (though I haven't studied them yet, will do).I think that fixing logging is simple enough, thus a revert is not
necessary.
I prepared a draft revert patch for discussion, just in case it comes
in handy. This reverts "pgbench: Improve time logic.", but "pgbench:
Synchronize client threads." remains (slightly rearranged).
I'm on the fence TBH, I can see that it's really small things and it
seems we have the patches, but it's late, late enough that
benchmarking gurus are showing up to benchmark with it for real, and
it's not great to be getting in the way of that with what is mostly
refactoring work, so I don't think it would be a bad thing if we
agreed to try again in 15. (A number of arguments for and against
moving pgbench out of the postgresql source tree and release cycle
occur to me, but I guess that's a topic for another thread.)
[...] For that reason, I'm not super attached to that new pg_time_usec_t
stuff at all, and wouldn't be sad if we reverted that piece.Well, I was sooo happy to get rid of INSTR_TIME ugly and inefficient
macros in pgbench… so anything looks better to me.
I don't love it either, in this code or the executor. (I know you
also don't like the THREAD_CREATE etc macros. I have something to
propose to improve that for 15....)
Note that Michaël is having a look at fixing pgbench logging issues.
Yeah I've been catching up with these threads.
Attachments:
0001-Revert-pgbench-Improve-time-logic.patchtext/x-patch; charset=US-ASCII; name=0001-Revert-pgbench-Improve-time-logic.patchDownload+248-236
On 6/17/21 8:49 AM, Thomas Munro wrote:
On Thu, Jun 17, 2021 at 7:24 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Seems like it should be straightforward to fix, though, with fixes
already proposed (though I haven't studied them yet, will do).I think that fixing logging is simple enough, thus a revert is not
necessary.I prepared a draft revert patch for discussion, just in case it comes
in handy. This reverts "pgbench: Improve time logic.", but "pgbench:
Synchronize client threads." remains (slightly rearranged).I'm on the fence TBH, I can see that it's really small things and it
seems we have the patches, but it's late, late enough that
benchmarking gurus are showing up to benchmark with it for real, and
it's not great to be getting in the way of that with what is mostly
refactoring work, so I don't think it would be a bad thing if we
agreed to try again in 15.
Is there an identified issue beyond the concrete example Greg gave of
the timestamps?
We are still fixing a few things with potentially far more impact than
anything in pgbench, so fixing this wouldn't bother me that much, as
long as we get it done for Beta2.
(A number of arguments for and against
moving pgbench out of the postgresql source tree and release cycle
occur to me, but I guess that's a topic for another thread.)
Indeed.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Wed, Jun 16, 2021 at 2:59 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
I'm unhappy because I already added tap tests for time-sensitive features
(-T and others, maybe logging aggregates, cannot remember), which have
been removed because they could fail under some circonstances (eg very
very very very slow hosts), or required some special handling (a few lines
of code) in pgbench, and the net result of this is there is not a single
test in place for some features:-(
I understand your struggle and I hope I was clear about two things:
-I am excited by all the progress made in pgbench, and this problem is an
integration loose end rather than a developer failure at any level.
-Doing better in this messy area takes a team that goes from development to
release management, and I had no right to complain unless I brought
resources to improve in the specific areas of the process that I want to be
better.
I think the only thing you and I disagree on is that you see a "first issue
in a corner case" where I see a process failure that is absolutely vital
for me to improve. Since the reality is that I might be the best
positioned person to actually move said process forward in a meaningful
long-term way, I have every intention of applying pressure to the area
you're frustrated at. Crunchy has a whole parallel review team to the
community one now focused on what our corporate and government customers
need for software process control and procedure compliance. The primary
business problem I'm working on now is how to include performance review in
that mix.
I already know I need to re-engage with you over how I need min/max numbers
in the aggregate logging output to accomplish some valuable goals. When I
get around to that this summer, I'd really enjoy talking with you a bit,
video call or something, about really any community topic you're frustrated
with. I have a lot riding now on the productivity of the PostgreSQL hacker
community and I want everyone to succeed at the best goals.
There is no problem with proposing tests, the problem is that they are
accepted, or if they are accepted then that they are not removed at the
first small issue but rather fixed, or their limitations accepted, because
testing time-sensitive features is not as simple as testing functional
features.
For 2020 Crunchy gave me a sort of sabbatical year to research community
oriented benchmarking topics. Having a self contained project in my home
turned out to be the perfect way to spend *that* wreck of a year.
I made significant progress toward the idea of having a performance farm
for PostgreSQL. On my laptop today is a 14GB database with 1s resolution
latency traces for 663 days of pgbench time running 4 workloads across a
small bare metal farm of various operating systems and hardware classes. I
can answer questions like "how long does a typical SSD take to execute an
INSERT commit?" across my farm with SQL. It's at the "works for me!" stage
of development, and I thought this was the right time in the development
cycle to start sharing improvement ideas from my work; thus the other
submissions in progress I alluded to.
The logging feature is in an intermediate spot where validating it requires
light custom tooling that compares its output against known variables like
the system time. It doesn't quite have a performance component to it.
Since this time logic detail is a well known portability minefield, I
thought demanding that particular test was a pretty easy sell.
That you in particular are frustrated here makes perfect sense to me. I am
fresh and ready to carry this forward some distance, and I hope the outcome
makes you happy