Fixing pgbench's logging of transaction timestamps

Started by Tom Laneover 9 years ago4 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

pgbench's -l option is coded using inappropriate familiarity with the
contents of struct instr_time. I recall complaining about that when
the code went in, but to little avail. However, it needs to be fixed
if we're to switch over to using clock_gettime() as discussed in
the gettimeofday thread,
/messages/by-id/17524.1483063359@sss.pgh.pa.us
And really it's unacceptably bad code even without that consideration,
because it doesn't work on Windows.

There are at least three ways we could fix it:

1. Switch over to printing the timestamp in the form of elapsed seconds
since the pgbench run start, as in the attached draft patch (which is
code-complete but lacks necessary documentation changes). You could make
an argument that this is a better definition than what's there: in most
situations, people are going to want the elapsed time, and right now they
have to do painful manual arithmetic to get it. About the only reason
I can see for liking the current definition is that it makes it possible
to correlate the pgbench log with external events, and I'm not sure
whether that's especially useful.

2. Have pgbench save both the INSTR_TIME_SET_CURRENT() and gettimeofday()
results for the run start instant. In doLog(), compute the elapsed time
from run start much as in the attached patch, but then add it to the saved
gettimeofday() result and print that in the existing format. This would
preserve the existing output format at the cost of a very small amount
of extra arithmetic per log line. However, it's got a nasty problem if
we use clock_gettime(CLOCK_MONOTONIC) in instr_time, which I think would
be the typical case. To the extent that CLOCK_MONOTONIC diverges from
CLOCK_REALTIME, which it would in case of external adjustments to the
system clock, the printed timestamps would no longer match the actual
system clock, which destroys the argument that you could correlate the
pgbench log with other events. (I imagine the same problem would apply
in the Windows implementation.)

3. Forget about using the instr_time result and just have doLog() execute
gettimeofday() to obtain the timestamp to print. This is kind of
conceptually ugly, but realistically the added overhead is probably
insignificant. A larger objection might be that on Windows, the result
of gettimeofday() isn't very high precision ... but it'd still be a huge
improvement over the non-answer you get now.

I'm inclined to think that #2 isn't a very good choice; it appears to
preserve the current behavior but really doesn't. So we should either
change the behavior as in #1 or expend an extra system call as in #3.
Preferences?

regards, tom lane

Attachments:

pgbench-log-delta-time.patchtext/x-diff; charset=us-ascii; name=pgbench-log-delta-time.patchDownload+24-23
#2Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Tom Lane (#1)
Re: Fixing pgbench's logging of transaction timestamps

Hello and happy new year,

-l [...]

My 0.02ᅵ:

There are at least three ways we could fix it:

1. Switch over to printing the timestamp in the form of elapsed seconds
since the pgbench run start, [...] About the only reason I can see for
liking the current definition is that it makes it possible to correlate
the pgbench log with external events, and I'm not sure whether that's
especially useful.

I have found that the ability to correlate different logs, esp. pgbench
and postgres, is "sometimes" useful.

2. Have pgbench save both the INSTR_TIME_SET_CURRENT() and
gettimeofday() results for the run start instant. However, it's got a
nasty problem [...]

I'm not sure how wide the problem would be. I would not expect the
correlation to be perfect, as there are various protocol/client overheads
here and there anyway.

3. Forget about using the instr_time result and just have doLog() execute
gettimeofday() to obtain the timestamp to print. This is kind of
conceptually ugly, but realistically the added overhead is probably
insignificant. A larger objection might be that on Windows, the result
of gettimeofday() isn't very high precision ... but it'd still be a huge
improvement over the non-answer you get now.

Yep.

I'm inclined to think that #2 isn't a very good choice; it appears to
preserve the current behavior but really doesn't. So we should either
change the behavior as in #1 or expend an extra system call as in #3.
Preferences?

Marginal preference for #3 for KIS? Otherwise any three options seems
better than the current status.

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fabien COELHO (#2)
Re: Fixing pgbench's logging of transaction timestamps

Fabien COELHO <coelho@cri.ensmp.fr> writes:

3. Forget about using the instr_time result and just have doLog() execute
gettimeofday() to obtain the timestamp to print. This is kind of
conceptually ugly, but realistically the added overhead is probably
insignificant. A larger objection might be that on Windows, the result
of gettimeofday() isn't very high precision ... but it'd still be a huge
improvement over the non-answer you get now.

Yep.

I'm inclined to think that #2 isn't a very good choice; it appears to
preserve the current behavior but really doesn't. So we should either
change the behavior as in #1 or expend an extra system call as in #3.
Preferences?

Marginal preference for #3 for KIS? Otherwise any three options seems
better than the current status.

OK, done that way.

BTW, why is it that the --aggregate-interval option is unsupported on
Windows? Is that an artifact of the same disease of assuming too much
about how instr_time is represented? I don't see any very good reason
for it other than the weird decision to store the result of
INSTR_TIME_GET_DOUBLE in a "long", which seems rather broken in any case.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#3)
Re: Fixing pgbench's logging of transaction timestamps

I wrote:

BTW, why is it that the --aggregate-interval option is unsupported on
Windows? Is that an artifact of the same disease of assuming too much
about how instr_time is represented? I don't see any very good reason
for it other than the weird decision to store the result of
INSTR_TIME_GET_DOUBLE in a "long", which seems rather broken in any case.

After looking closer, I see the real issue is that it prints the integer
part of INSTR_TIME_GET_DOUBLE and documents that as being a Unix
timestamp. So that's not going to do either. I solved it the same way
as in the other code path, ie just eat the cost of doing our own time
inquiry.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers