review: pgbench - aggregation of info written into log

Started by Pavel Stehuleover 13 years ago34 messageshackers
Jump to latest
#1Pavel Stehule
pavel.stehule@gmail.com

Hello

I did review of pgbench patch
http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php

* this patch is cleanly patched

* I had a problem with doc

make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml'
openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .
-c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl
-t sgml -i output-html -V html-index postgres.sgml
openjade:pgbench.sgml:767:8:E: document type does not allow element
"TITLE" here; missing one of "ABSTRACT", "AUTHORBLURB", "MSGSET",
"CALLOUTLIST", "ITEMIZEDLIST", "ORDEREDLIST", "SEGMENTEDLIST",
"VARIABLELIST", "CAUTION", "IMPORTANT", "NOTE", "TIP", "WARNING",
"FORMALPARA", "BLOCKQUOTE", "EQUATION", "EXAMPLE", "FIGURE", "TABLE",
"PROCEDURE", "SIDEBAR", "QANDASET", "REFSECT3" start-tag
make[1]: *** [HTML.index] Error 1
make[1]: *** Deleting file `HTML.index'
make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml

* pgbench is compiled without warnings

* there is a few issues in source code

+			
+			/* should we aggregate the results or not? */
+			if (use_log_agg)
+			{
+				
+				/* are we still in the same interval? if yes, accumulate the
+				 * values (print them otherwise) */
+				if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now))
+				{
+					

* a real time range for aggregation is dynamic (pgbench is not
realtime application), so I am not sure if following constraint has
sense

 +	if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) {
+		fprintf(stderr, "duration (%d) must be a multiple of aggregation
interval (%d)\n", duration, use_log_agg);
+		exit(1);
+	}

* it doesn't flush last aggregated data (it is mentioned by Tomas:
"Also, I've realized the last interval may not be logged at all - I'll
take a look into this in the next version of the patch."). And it can
be significant for higher values of "use_log_agg"

* a name of variable "use_log_agg" is not good - maybe "log_agg_interval" ?

Regards

Pavel Stehule

#2Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Pavel Stehule (#1)
Re: review: pgbench - aggregation of info written into log

Hi,

attached is a v4 of the patch. There are not many changes, mostly some
simple tidying up, except for handling the Windows.

After a bit more investigation, it seems to me that we can't really get
the same behavior as in other systems - basically the timestamp is
unavailable so we can't log the interval start. So it seems to me the
best we can do is to disable this option on Windows (and this is done in
the patch). So when you try to use "--aggregate-interval" on Win it will
complain it's an unknown option.

Now that I think of it, there might be a better solution that would not
mimic the Linux/Unix behaviour exactly - we do have elapsed time since
the start of the benchmark, so maybe we should use this instead of the
timestamp? I mean on systems with reasonable gettimeofday we'd get

1345828501 5601 1542744 483552416 61 2573
1345828503 7884 1979812 565806736 60 1479
1345828505 7208 1979422 567277552 59 1391
1345828507 7685 1980268 569784714 60 1398
1345828509 7073 1979779 573489941 236 1411

but on Windows we'd get

0 5601 1542744 483552416 61 2573
2 7884 1979812 565806736 60 1479
4 7208 1979422 567277552 59 1391
6 7685 1980268 569784714 60 1398
8 7073 1979779 573489941 236 1411

i.e. the first column is "seconds since start of the test". Hmmmm, and
maybe we could call 'gettimeofday' at the beginning, to get the
timestamp of the test start (AFAIK the issue on Windows is the
resolution, but it should be enough). Then we could add it up with the
elapsed time and we'd get the same output as on Linux.

And maybe this could be done in regular pgbench execution too? But I
really need help from someone who knows Windows and can test this.

Comments regarding Pavel's reviews are below:

On 2.10.2012 20:08, Pavel Stehule wrote:

Hello

I did review of pgbench patch
http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php

* this patch is cleanly patched

* I had a problem with doc

make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml'
openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .
-c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl
-t sgml -i output-html -V html-index postgres.sgml
openjade:pgbench.sgml:767:8:E: document type does not allow element
"TITLE" here; missing one of "ABSTRACT", "AUTHORBLURB", "MSGSET",
"CALLOUTLIST", "ITEMIZEDLIST", "ORDEREDLIST", "SEGMENTEDLIST",
"VARIABLELIST", "CAUTION", "IMPORTANT", "NOTE", "TIP", "WARNING",
"FORMALPARA", "BLOCKQUOTE", "EQUATION", "EXAMPLE", "FIGURE", "TABLE",
"PROCEDURE", "SIDEBAR", "QANDASET", "REFSECT3" start-tag
make[1]: *** [HTML.index] Error 1
make[1]: *** Deleting file `HTML.index'
make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml

Hmmm, I've never got the docs to build at all, all I do get is a heap of
some SGML/DocBook related issues. Can you investigate a bit more where's
the issue? I don't remember messing with the docs in a way that might
cause this ... mostly copy'n'paste edits.

* pgbench is compiled without warnings

* there is a few issues in source code

+			
+			/* should we aggregate the results or not? */
+			if (use_log_agg)
+			{
+				
+				/* are we still in the same interval? if yes, accumulate the
+				 * values (print them otherwise) */
+				if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now))
+				{
+					

Errr, so what are the issues here?

* a real time range for aggregation is dynamic (pgbench is not
realtime application), so I am not sure if following constraint has
sense

+	if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) {
+		fprintf(stderr, "duration (%d) must be a multiple of aggregation
interval (%d)\n", duration, use_log_agg);
+		exit(1);
+	}

I'm not sure what might be the issue here either. If the test duration
is set (using "-T" option), then this kicks in and says that the
duration should be a multiple of aggregation interval. Seems like a
sensible assumption to me. Or do you think this is check should be removed?

* it doesn't flush last aggregated data (it is mentioned by Tomas:
"Also, I've realized the last interval may not be logged at all - I'll
take a look into this in the next version of the patch."). And it can
be significant for higher values of "use_log_agg"

Yes, and I'm still not sure how to fix this in practice. But I do have
this on my TODO.

* a name of variable "use_log_agg" is not good - maybe "log_agg_interval" ?

I've changed it to agg_interval.

regards
Tomas

Attachments:

pgbench-aggregated-v4.patchtext/plain; charset=UTF-8; name=pgbench-aggregated-v4.patchDownload+186-15
#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tomas Vondra (#2)
Re: review: pgbench - aggregation of info written into log

Hello

2012/11/12 Tomas Vondra <tv@fuzzy.cz>:

Hi,

attached is a v4 of the patch. There are not many changes, mostly some
simple tidying up, except for handling the Windows.

After a bit more investigation, it seems to me that we can't really get
the same behavior as in other systems - basically the timestamp is
unavailable so we can't log the interval start. So it seems to me the
best we can do is to disable this option on Windows (and this is done in
the patch). So when you try to use "--aggregate-interval" on Win it will
complain it's an unknown option.

Now that I think of it, there might be a better solution that would not
mimic the Linux/Unix behaviour exactly - we do have elapsed time since
the start of the benchmark, so maybe we should use this instead of the
timestamp? I mean on systems with reasonable gettimeofday we'd get

1345828501 5601 1542744 483552416 61 2573
1345828503 7884 1979812 565806736 60 1479
1345828505 7208 1979422 567277552 59 1391
1345828507 7685 1980268 569784714 60 1398
1345828509 7073 1979779 573489941 236 1411

but on Windows we'd get

0 5601 1542744 483552416 61 2573
2 7884 1979812 565806736 60 1479
4 7208 1979422 567277552 59 1391
6 7685 1980268 569784714 60 1398
8 7073 1979779 573489941 236 1411

i.e. the first column is "seconds since start of the test". Hmmmm, and
maybe we could call 'gettimeofday' at the beginning, to get the
timestamp of the test start (AFAIK the issue on Windows is the
resolution, but it should be enough). Then we could add it up with the
elapsed time and we'd get the same output as on Linux.

And maybe this could be done in regular pgbench execution too? But I
really need help from someone who knows Windows and can test this.

Comments regarding Pavel's reviews are below:

On 2.10.2012 20:08, Pavel Stehule wrote:

Hello

I did review of pgbench patch
http://archives.postgresql.org/pgsql-hackers/2012-09/msg00737.php

* this patch is cleanly patched

* I had a problem with doc

make[1]: Entering directory `/home/pavel/src/postgresql/doc/src/sgml'
openjade -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D .
-c /usr/share/sgml/docbook/dsssl-stylesheets/catalog -d stylesheet.dsl
-t sgml -i output-html -V html-index postgres.sgml
openjade:pgbench.sgml:767:8:E: document type does not allow element
"TITLE" here; missing one of "ABSTRACT", "AUTHORBLURB", "MSGSET",
"CALLOUTLIST", "ITEMIZEDLIST", "ORDEREDLIST", "SEGMENTEDLIST",
"VARIABLELIST", "CAUTION", "IMPORTANT", "NOTE", "TIP", "WARNING",
"FORMALPARA", "BLOCKQUOTE", "EQUATION", "EXAMPLE", "FIGURE", "TABLE",
"PROCEDURE", "SIDEBAR", "QANDASET", "REFSECT3" start-tag
make[1]: *** [HTML.index] Error 1
make[1]: *** Deleting file `HTML.index'
make[1]: Leaving directory `/home/pavel/src/postgresql/doc/src/sgml

Hmmm, I've never got the docs to build at all, all I do get is a heap of
some SGML/DocBook related issues. Can you investigate a bit more where's
the issue? I don't remember messing with the docs in a way that might
cause this ... mostly copy'n'paste edits.

* pgbench is compiled without warnings

* there is a few issues in source code

+
+                     /* should we aggregate the results or not? */
+                     if (use_log_agg)
+                     {
+
+                             /* are we still in the same interval? if yes, accumulate the
+                              * values (print them otherwise) */
+                             if (agg->start_time + use_log_agg >= INSTR_TIME_GET_DOUBLE(now))
+                             {
+

Errr, so what are the issues here?

using a use_log_agg variable - bad name for variable - it is fixed

* a real time range for aggregation is dynamic (pgbench is not
realtime application), so I am not sure if following constraint has
sense

+    if ((duration > 0) && (use_log_agg > 0) && (duration % use_log_agg != 0)) {
+             fprintf(stderr, "duration (%d) must be a multiple of aggregation
interval (%d)\n", duration, use_log_agg);
+             exit(1);
+     }

I'm not sure what might be the issue here either. If the test duration
is set (using "-T" option), then this kicks in and says that the
duration should be a multiple of aggregation interval. Seems like a
sensible assumption to me. Or do you think this is check should be removed?

* it doesn't flush last aggregated data (it is mentioned by Tomas:
"Also, I've realized the last interval may not be logged at all - I'll
take a look into this in the next version of the patch."). And it can
be significant for higher values of "use_log_agg"

Yes, and I'm still not sure how to fix this in practice. But I do have
this on my TODO.

* a name of variable "use_log_agg" is not good - maybe "log_agg_interval" ?

I've changed it to agg_interval.

ok

issues:

* empty lines with invisible chars (tabs) + and sometimes empty lines
after and before {}

* adjustment of start_time

+                                               * the desired interval */
+                                               while (agg->start_time
+ agg_interval < INSTR_TIME_GET_DOUBLE(now))
+
agg->start_time = agg->start_time + agg_interval;

can "skip" one interval - so when transaction time will be larger or
similar to agg_interval - then results can be strange. We have to know
real length of interval

Regards

Pavel

regards
Tomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Andres Freund
andres@anarazel.de
In reply to: Pavel Stehule (#3)
Re: review: pgbench - aggregation of info written into log

Hi Tomas,

On 2012-11-27 14:55:59 +0100, Pavel Stehule wrote:

attached is a v4 of the patch. There are not many changes, mostly some
simple tidying up, except for handling the Windows.

After a quick look I am not sure what all the talk about windows is
about? instr_time.h seems to provide all you need, even for windows? The
only issue of gettimeofday() for windows seems to be that it is that its
not all that fast an not too high precision, but that shouldn't be a
problem in this case?

Could you expand a bit on the problems?

* I had a problem with doc

The current patch has conflict markers in the sgml source, there seems
to have been some unresolved merge. Maybe that's all that causes the
errors?

Whats your problem with setting up the doc toolchain?

issues:

* empty lines with invisible chars (tabs) + and sometimes empty lines
after and before {}

* adjustment of start_time

+                                               * the desired interval */
+                                               while (agg->start_time
+ agg_interval < INSTR_TIME_GET_DOUBLE(now))
+
agg->start_time = agg->start_time + agg_interval;

can "skip" one interval - so when transaction time will be larger or
similar to agg_interval - then results can be strange. We have to know
real length of interval

Could you post a patch that adresses these issues?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Andres Freund (#4)
Re: review: pgbench - aggregation of info written into log

On 8.12.2012 16:33, Andres Freund wrote:

Hi Tomas,

On 2012-11-27 14:55:59 +0100, Pavel Stehule wrote:

attached is a v4 of the patch. There are not many changes, mostly some
simple tidying up, except for handling the Windows.

After a quick look I am not sure what all the talk about windows is
about? instr_time.h seems to provide all you need, even for windows? The
only issue of gettimeofday() for windows seems to be that it is that its
not all that fast an not too high precision, but that shouldn't be a
problem in this case?

Could you expand a bit on the problems?

Well, in the current pgbench.c, there's this piece of code

#ifndef WIN32
/* This is more than we really ought to know about instr_time */
fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
st->id, st->cnt, usec, st->use_file,
(long) now.tv_sec, (long) now.tv_usec);
#else
/* On Windows, instr_time doesn't provide a timestamp anyway */
fprintf(logfile, "%d %d %.0f %d 0 0\n",
st->id, st->cnt, usec, st->use_file);
#endif

and the Windows-related discussion in this patch is about the same issue.

As I understand it the problem seems to be that INSTR_TIME_SET_CURRENT
does not provide the timestamp at all.

I was thinking about improving this by combining gettimeofday and
INSTR_TIME, but I have no Windows box to test it on :-(

* I had a problem with doc

The current patch has conflict markers in the sgml source, there seems
to have been some unresolved merge. Maybe that's all that causes the
errors?

Ah, I see. Probably my fault, I'll fix that.

Whats your problem with setting up the doc toolchain?

issues:

* empty lines with invisible chars (tabs) + and sometimes empty lines
after and before {}

* adjustment of start_time

+                                               * the desired interval */
+                                               while (agg->start_time
+ agg_interval < INSTR_TIME_GET_DOUBLE(now))
+
agg->start_time = agg->start_time + agg_interval;

can "skip" one interval - so when transaction time will be larger or
similar to agg_interval - then results can be strange. We have to know
real length of interval

Could you post a patch that adresses these issues?

Yes, I'll work on it today/tomorrow and I'll send an improved patch.

Tomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Andres Freund (#4)
Re: review: pgbench - aggregation of info written into log

Hi,

attached is a v5 of this patch. Details below:

On 8.12.2012 16:33, Andres Freund wrote:

Hi Tomas,

On 2012-11-27 14:55:59 +0100, Pavel Stehule wrote:

attached is a v4 of the patch. There are not many changes, mostly some
simple tidying up, except for handling the Windows.

After a quick look I am not sure what all the talk about windows is
about? instr_time.h seems to provide all you need, even for windows? The
only issue of gettimeofday() for windows seems to be that it is that its
not all that fast an not too high precision, but that shouldn't be a
problem in this case?

Could you expand a bit on the problems?

As explained in the previous message, this is an existing problem with
unavailable timestamp. I'm not very fond of adding Linux-only features,
but fixing that was not the goal of this patch.

* I had a problem with doc

The current patch has conflict markers in the sgml source, there seems
to have been some unresolved merge. Maybe that's all that causes the
errors?

Whats your problem with setting up the doc toolchain?

Yeah, my fault because of incomplete merge. But the real culprit was a
missing refsect2. Fixed.

issues:

* empty lines with invisible chars (tabs) + and sometimes empty lines
after and before {}

Fixed (I've removed the lines etc.)

* adjustment of start_time

+                                               * the desired interval */
+                                               while (agg->start_time
+ agg_interval < INSTR_TIME_GET_DOUBLE(now))
+
agg->start_time = agg->start_time + agg_interval;

can "skip" one interval - so when transaction time will be larger or
similar to agg_interval - then results can be strange. We have to know
real length of interval

Could you post a patch that adresses these issues?

So, in the end I've rewritten the section advancing the start_time. Now
it works so that when skipping an interval (because of a very long
transaction), it will print lines even for those "empty" intervals.

So for example with a transaction file containing a single query

SELECT pg_sleep(1.5);

and an interval length of 1 second, you'll get something like this:

1355009677 0 0 0 0 0
1355009678 1 1501028 2253085056784 1501028 1501028
1355009679 0 0 0 0 0
1355009680 1 1500790 2252370624100 1500790 1500790
1355009681 1 1500723 2252169522729 1500723 1500723
1355009682 0 0 0 0 0
1355009683 1 1500719 2252157516961 1500719 1500719
1355009684 1 1500703 2252109494209 1500703 1500703
1355009685 0 0 0 0 0

which is IMHO the best way to deal with this.

I've fixed several minor issues, added a few comments.

regards
Tomas

Attachments:

pgbench-aggregated-v5.patchtext/plain; charset=UTF-8; name=pgbench-aggregated-v5.patchDownload+193-14
#7Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tomas Vondra (#6)
Re: review: pgbench - aggregation of info written into log

Hi,

I'm looking into this as a committer. It seems that this is the
newest patch and the reviewer(Pavel) stated that it is ready for
commit. However, I noticed that this patch adds a Linux/UNIX only
feature(not available on Windows). So I would like to ask cores if
this is ok or not.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Hi,

attached is a v5 of this patch. Details below:

On 8.12.2012 16:33, Andres Freund wrote:

Hi Tomas,

On 2012-11-27 14:55:59 +0100, Pavel Stehule wrote:

attached is a v4 of the patch. There are not many changes, mostly some
simple tidying up, except for handling the Windows.

After a quick look I am not sure what all the talk about windows is
about? instr_time.h seems to provide all you need, even for windows? The
only issue of gettimeofday() for windows seems to be that it is that its
not all that fast an not too high precision, but that shouldn't be a
problem in this case?

Could you expand a bit on the problems?

As explained in the previous message, this is an existing problem with
unavailable timestamp. I'm not very fond of adding Linux-only features,
but fixing that was not the goal of this patch.

* I had a problem with doc

The current patch has conflict markers in the sgml source, there seems
to have been some unresolved merge. Maybe that's all that causes the
errors?

Whats your problem with setting up the doc toolchain?

Yeah, my fault because of incomplete merge. But the real culprit was a
missing refsect2. Fixed.

issues:

* empty lines with invisible chars (tabs) + and sometimes empty lines
after and before {}

Fixed (I've removed the lines etc.)

* adjustment of start_time

+                                               * the desired interval */
+                                               while (agg->start_time
+ agg_interval < INSTR_TIME_GET_DOUBLE(now))
+
agg->start_time = agg->start_time + agg_interval;

can "skip" one interval - so when transaction time will be larger or
similar to agg_interval - then results can be strange. We have to know
real length of interval

Could you post a patch that adresses these issues?

So, in the end I've rewritten the section advancing the start_time. Now
it works so that when skipping an interval (because of a very long
transaction), it will print lines even for those "empty" intervals.

So for example with a transaction file containing a single query

SELECT pg_sleep(1.5);

and an interval length of 1 second, you'll get something like this:

1355009677 0 0 0 0 0
1355009678 1 1501028 2253085056784 1501028 1501028
1355009679 0 0 0 0 0
1355009680 1 1500790 2252370624100 1500790 1500790
1355009681 1 1500723 2252169522729 1500723 1500723
1355009682 0 0 0 0 0
1355009683 1 1500719 2252157516961 1500719 1500719
1355009684 1 1500703 2252109494209 1500703 1500703
1355009685 0 0 0 0 0

which is IMHO the best way to deal with this.

I've fixed several minor issues, added a few comments.

regards
Tomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Andrew Dunstan
andrew@dunslane.net
In reply to: Tatsuo Ishii (#7)
Re: review: pgbench - aggregation of info written into log

On 01/16/2013 05:59 PM, Tatsuo Ishii wrote:

Hi,

I'm looking into this as a committer. It seems that this is the
newest patch and the reviewer(Pavel) stated that it is ready for
commit. However, I noticed that this patch adds a Linux/UNIX only
feature(not available on Windows). So I would like to ask cores if
this is ok or not.

I haven't been following the thread, but if the complaint is that
Windows doesn't have accurate high-resolution timers, which is what it
kinda looks like from the rest of your message, then it's not true.
Every version since Windows2000 has had
QueryPerformanceCounter()/QueryPerformanceFrequency(). And we use it:
see src/include/portability/instr_time.h

If that's not the problem, then can someone please point me at the
message that sets the problem out clearly, or else just recap it?

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Andrew Dunstan (#8)
Re: review: pgbench - aggregation of info written into log

I'm looking into this as a committer. It seems that this is the
newest patch and the reviewer(Pavel) stated that it is ready for
commit. However, I noticed that this patch adds a Linux/UNIX only
feature(not available on Windows). So I would like to ask cores if
this is ok or not.

I haven't been following the thread, but if the complaint is that
Windows doesn't have accurate high-resolution timers, which is what it
kinda looks like from the rest of your message, then it's not
true. Every version since Windows2000 has had
QueryPerformanceCounter()/QueryPerformanceFrequency(). And we use it:
see src/include/portability/instr_time.h

In my understanding the problem is not related to resolution.

If that's not the problem, then can someone please point me at the
message that sets the problem out clearly, or else just recap it?

It seems instr_time.h on Windows simply does not provide current
timestamp. From pgbench.c:

/*
* if transaction finished, record the time it took in the log
*/
if (logfile && commands[st->state + 1] == NULL)
{
instr_time now;
instr_time diff;
double usec;

INSTR_TIME_SET_CURRENT(now);
diff = now;
INSTR_TIME_SUBTRACT(diff, st->txn_begin);
usec = (double) INSTR_TIME_GET_MICROSEC(diff);

#ifndef WIN32
/* This is more than we really ought to know about instr_time */
fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
st->id, st->cnt, usec, st->use_file,
(long) now.tv_sec, (long) now.tv_usec);
#else
/* On Windows, instr_time doesn't provide a timestamp anyway */
fprintf(logfile, "%d %d %.0f %d 0 0\n",
st->id, st->cnt, usec, st->use_file);
#endif
}
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Tatsuo Ishii (#9)
Re: review: pgbench - aggregation of info written into log

On 01/16/2013 06:48 PM, Tatsuo Ishii wrote:

I'm looking into this as a committer. It seems that this is the
newest patch and the reviewer(Pavel) stated that it is ready for
commit. However, I noticed that this patch adds a Linux/UNIX only
feature(not available on Windows). So I would like to ask cores if
this is ok or not.

I haven't been following the thread, but if the complaint is that
Windows doesn't have accurate high-resolution timers, which is what it
kinda looks like from the rest of your message, then it's not
true. Every version since Windows2000 has had
QueryPerformanceCounter()/QueryPerformanceFrequency(). And we use it:
see src/include/portability/instr_time.h

In my understanding the problem is not related to resolution.

If that's not the problem, then can someone please point me at the
message that sets the problem out clearly, or else just recap it?

It seems instr_time.h on Windows simply does not provide current
timestamp. From pgbench.c:

/*
* if transaction finished, record the time it took in the log
*/
if (logfile && commands[st->state + 1] == NULL)
{
instr_time now;
instr_time diff;
double usec;

INSTR_TIME_SET_CURRENT(now);
diff = now;
INSTR_TIME_SUBTRACT(diff, st->txn_begin);
usec = (double) INSTR_TIME_GET_MICROSEC(diff);

#ifndef WIN32
/* This is more than we really ought to know about instr_time */
fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
st->id, st->cnt, usec, st->use_file,
(long) now.tv_sec, (long) now.tv_usec);
#else
/* On Windows, instr_time doesn't provide a timestamp anyway */
fprintf(logfile, "%d %d %.0f %d 0 0\n",
st->id, st->cnt, usec, st->use_file);
#endif
}

This might be way more than we want to do, but there is an article that
describes some techniques for doing what seems to be missing (AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Andrew Dunstan (#10)
Re: review: pgbench - aggregation of info written into log

It seems instr_time.h on Windows simply does not provide current
timestamp. From pgbench.c:

/*
* if transaction finished, record the time it took in the log
*/
if (logfile && commands[st->state + 1] == NULL)
{
instr_time now;
instr_time diff;
double usec;

INSTR_TIME_SET_CURRENT(now);
diff = now;
INSTR_TIME_SUBTRACT(diff, st->txn_begin);
usec = (double) INSTR_TIME_GET_MICROSEC(diff);

#ifndef WIN32
/* This is more than we really ought to know about instr_time */
fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
st->id, st->cnt, usec, st->use_file,
(long) now.tv_sec, (long) now.tv_usec);
#else
/* On Windows, instr_time doesn't provide a timestamp anyway */
fprintf(logfile, "%d %d %.0f %d 0 0\n",
st->id, st->cnt, usec, st->use_file);
#endif
}

This might be way more than we want to do, but there is an article
that describes some techniques for doing what seems to be missing
(AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

Even this would be doable, I'm afraid it may not fit in 9.3 if we
think about the current status of CF. So our choice would be:

1) Postpone the patch to 9.4

2) Commit the patch in 9.3 without Windows support

I personally am ok with #2. We traditionally avoid particular paltform
specific features on PostgreSQL. However I think the policiy could be
losen for contrib staffs. Also pgbench is just a client program. We
could always use pgbench on UNIX/Linux if we truely need the feature.

What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Andrew Dunstan
andrew@dunslane.net
In reply to: Tatsuo Ishii (#11)
Re: review: pgbench - aggregation of info written into log

On 01/16/2013 08:05 PM, Tatsuo Ishii wrote:

It seems instr_time.h on Windows simply does not provide current
timestamp. From pgbench.c:

/*
* if transaction finished, record the time it took in the log
*/
if (logfile && commands[st->state + 1] == NULL)
{
instr_time now;
instr_time diff;
double usec;

INSTR_TIME_SET_CURRENT(now);
diff = now;
INSTR_TIME_SUBTRACT(diff, st->txn_begin);
usec = (double) INSTR_TIME_GET_MICROSEC(diff);

#ifndef WIN32
/* This is more than we really ought to know about instr_time */
fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
st->id, st->cnt, usec, st->use_file,
(long) now.tv_sec, (long) now.tv_usec);
#else
/* On Windows, instr_time doesn't provide a timestamp anyway */
fprintf(logfile, "%d %d %.0f %d 0 0\n",
st->id, st->cnt, usec, st->use_file);
#endif
}

This might be way more than we want to do, but there is an article
that describes some techniques for doing what seems to be missing
(AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

Even this would be doable, I'm afraid it may not fit in 9.3 if we
think about the current status of CF. So our choice would be:

1) Postpone the patch to 9.4

2) Commit the patch in 9.3 without Windows support

I personally am ok with #2. We traditionally avoid particular paltform
specific features on PostgreSQL. However I think the policiy could be
losen for contrib staffs. Also pgbench is just a client program. We
could always use pgbench on UNIX/Linux if we truely need the feature.

What do you think?

Fair enough, I was just trying to point out alternatives. We have
committed platform-specific features before now. I hope it doesn't just
get left like this, though.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Andrew Dunstan (#12)
Re: review: pgbench - aggregation of info written into log

This might be way more than we want to do, but there is an article
that describes some techniques for doing what seems to be missing
(AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

Even this would be doable, I'm afraid it may not fit in 9.3 if we
think about the current status of CF. So our choice would be:

1) Postpone the patch to 9.4

2) Commit the patch in 9.3 without Windows support

I personally am ok with #2. We traditionally avoid particular paltform
specific features on PostgreSQL. However I think the policiy could be
losen for contrib staffs. Also pgbench is just a client program. We
could always use pgbench on UNIX/Linux if we truely need the feature.

What do you think?

Fair enough, I was just trying to point out alternatives. We have
committed platform-specific features before now. I hope it doesn't
just get left like this, though.

Yeah, I hope someone pick this up and propose as a TODO item. In the
mean time, I'm going to commit the patch without Windows support
unless there's objection.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Magnus Hagander
magnus@hagander.net
In reply to: Tatsuo Ishii (#13)
Re: review: pgbench - aggregation of info written into log

On Thu, Jan 17, 2013 at 2:35 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:

This might be way more than we want to do, but there is an article
that describes some techniques for doing what seems to be missing
(AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

Even this would be doable, I'm afraid it may not fit in 9.3 if we
think about the current status of CF. So our choice would be:

1) Postpone the patch to 9.4

2) Commit the patch in 9.3 without Windows support

I personally am ok with #2. We traditionally avoid particular paltform
specific features on PostgreSQL. However I think the policiy could be
losen for contrib staffs. Also pgbench is just a client program. We
could always use pgbench on UNIX/Linux if we truely need the feature.

What do you think?

Fair enough, I was just trying to point out alternatives. We have
committed platform-specific features before now. I hope it doesn't
just get left like this, though.

We have committed platform-specific features before, but generally
only when it's not *possible* to do them for all platforms. For
example the posix_fadvise stuff isn't available on Windows at all, so
there isn't much we can do there.

Yeah, I hope someone pick this up and propose as a TODO item. In the
mean time, I'm going to commit the patch without Windows support
unless there's objection.

Perhaps we should actually hold off until someone committs to actually
getting it fixed in the next version? If we do have that, then we can
commit it as a partial feature, but if we just "hope someone picks it
up", that's leaving it very loose..

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Dave Page
dpage@pgadmin.org
In reply to: Magnus Hagander (#14)
Re: review: pgbench - aggregation of info written into log

On Thu, Jan 17, 2013 at 9:36 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Thu, Jan 17, 2013 at 2:35 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:

This might be way more than we want to do, but there is an article
that describes some techniques for doing what seems to be missing
(AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

Even this would be doable, I'm afraid it may not fit in 9.3 if we
think about the current status of CF. So our choice would be:

1) Postpone the patch to 9.4

2) Commit the patch in 9.3 without Windows support

I personally am ok with #2. We traditionally avoid particular paltform
specific features on PostgreSQL. However I think the policiy could be
losen for contrib staffs. Also pgbench is just a client program. We
could always use pgbench on UNIX/Linux if we truely need the feature.

What do you think?

Fair enough, I was just trying to point out alternatives. We have
committed platform-specific features before now. I hope it doesn't
just get left like this, though.

We have committed platform-specific features before, but generally
only when it's not *possible* to do them for all platforms. For
example the posix_fadvise stuff isn't available on Windows at all, so
there isn't much we can do there.

Right - having platform specific features for other reasons like lack
of time is a slippery slope in my opinion. We should not get into such
a habit or Windows will quickly become a second class platform as far
as PostgreSQL features are concerned.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Magnus Hagander
magnus@hagander.net
In reply to: Dave Page (#15)
Re: review: pgbench - aggregation of info written into log

On Thu, Jan 17, 2013 at 11:09 AM, Dave Page <dpage@pgadmin.org> wrote:

On Thu, Jan 17, 2013 at 9:36 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Thu, Jan 17, 2013 at 2:35 AM, Tatsuo Ishii <ishii@postgresql.org> wrote:

This might be way more than we want to do, but there is an article
that describes some techniques for doing what seems to be missing
(AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

Even this would be doable, I'm afraid it may not fit in 9.3 if we
think about the current status of CF. So our choice would be:

1) Postpone the patch to 9.4

2) Commit the patch in 9.3 without Windows support

I personally am ok with #2. We traditionally avoid particular paltform
specific features on PostgreSQL. However I think the policiy could be
losen for contrib staffs. Also pgbench is just a client program. We
could always use pgbench on UNIX/Linux if we truely need the feature.

What do you think?

Fair enough, I was just trying to point out alternatives. We have
committed platform-specific features before now. I hope it doesn't
just get left like this, though.

We have committed platform-specific features before, but generally
only when it's not *possible* to do them for all platforms. For
example the posix_fadvise stuff isn't available on Windows at all, so
there isn't much we can do there.

Right - having platform specific features for other reasons like lack
of time is a slippery slope in my opinion. We should not get into such
a habit or Windows will quickly become a second class platform as far
as PostgreSQL features are concerned.

Especially since there is no lack of time - the functionality is
there, it just looks (significantly) different.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Magnus Hagander (#14)
Re: review: pgbench - aggregation of info written into log

Dne 17.01.2013 10:36, Magnus Hagander napsal:

On Thu, Jan 17, 2013 at 2:35 AM, Tatsuo Ishii <ishii@postgresql.org>
wrote:

This might be way more than we want to do, but there is an
article
that describes some techniques for doing what seems to be missing
(AIUI):

<http://msdn.microsoft.com/en-us/magazine/cc163996.aspx&gt;

Even this would be doable, I'm afraid it may not fit in 9.3 if we
think about the current status of CF. So our choice would be:

1) Postpone the patch to 9.4

2) Commit the patch in 9.3 without Windows support

I personally am ok with #2. We traditionally avoid particular
paltform
specific features on PostgreSQL. However I think the policiy
could be
losen for contrib staffs. Also pgbench is just a client program.
We
could always use pgbench on UNIX/Linux if we truely need the
feature.

What do you think?

Fair enough, I was just trying to point out alternatives. We have
committed platform-specific features before now. I hope it doesn't
just get left like this, though.

We have committed platform-specific features before, but generally
only when it's not *possible* to do them for all platforms. For
example the posix_fadvise stuff isn't available on Windows at all, so
there isn't much we can do there.

Maybe, although this platform-dependence already exists in pgbench to
some
extent (the Windows branch is unable to log the timestamps of
transactions).

It would certainly be better if pgbench was able to handle Windows and
Linux equally, but that was not the aim of this patch.

Yeah, I hope someone pick this up and propose as a TODO item. In the
mean time, I'm going to commit the patch without Windows support
unless there's objection.

Perhaps we should actually hold off until someone committs to
actually
getting it fixed in the next version? If we do have that, then we can
commit it as a partial feature, but if we just "hope someone picks it
up", that's leaving it very loose..

Well, given that I'm an author of that patch, that 'someone' would have
to be me. The problem is I have access to absolutely no Windows
machines,
not mentioning the development tools (and that I have no clue about
it).

I vaguely remember there were people on this list doing Windows
development
on a virtual machine or something. Any interest in working on this /
giving
me some help?

Tomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Magnus Hagander (#16)
Re: review: pgbench - aggregation of info written into log

Dne 17.01.2013 11:16, Magnus Hagander napsal:

On Thu, Jan 17, 2013 at 11:09 AM, Dave Page <dpage@pgadmin.org>
wrote:

On Thu, Jan 17, 2013 at 9:36 AM, Magnus Hagander
<magnus@hagander.net> wrote:

We have committed platform-specific features before, but generally
only when it's not *possible* to do them for all platforms. For
example the posix_fadvise stuff isn't available on Windows at all,
so
there isn't much we can do there.

Right - having platform specific features for other reasons like
lack
of time is a slippery slope in my opinion. We should not get into
such
a habit or Windows will quickly become a second class platform as
far
as PostgreSQL features are concerned.

Especially since there is no lack of time - the functionality is
there, it just looks (significantly) different.

Really? Any link to relevant docs or something?

When doing some research in this field, the only option I was able to
come up
with was combining gettimeofday() with the timing functionality, and do
something
like this:

1) call gettimeofday() at thread start, giving a common unix
timestamp
2) measure the time from the thread start using the conters (for each
transaction)
3) combine those values

This might of course give up to a second difference compared to the
actual time
(because of the gettimeofday precision), but IMHO that's fine.

An even simpler option would omit the (1), so the timestamps would
start at 0.

Or is there a better way?

Tomas

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Andrew Dunstan
andrew@dunslane.net
In reply to: Tomas Vondra (#18)
Re: review: pgbench - aggregation of info written into log

On 01/17/2013 06:11 AM, Tomas Vondra wrote:

Dne 17.01.2013 11:16, Magnus Hagander napsal:

On Thu, Jan 17, 2013 at 11:09 AM, Dave Page <dpage@pgadmin.org> wrote:

On Thu, Jan 17, 2013 at 9:36 AM, Magnus Hagander
<magnus@hagander.net> wrote:

We have committed platform-specific features before, but generally
only when it's not *possible* to do them for all platforms. For
example the posix_fadvise stuff isn't available on Windows at all, so
there isn't much we can do there.

Right - having platform specific features for other reasons like lack
of time is a slippery slope in my opinion. We should not get into such
a habit or Windows will quickly become a second class platform as far
as PostgreSQL features are concerned.

Especially since there is no lack of time - the functionality is
there, it just looks (significantly) different.

Really? Any link to relevant docs or something?

When doing some research in this field, the only option I was able to
come up
with was combining gettimeofday() with the timing functionality, and
do something
like this:

1) call gettimeofday() at thread start, giving a common unix timestamp
2) measure the time from the thread start using the conters (for
each transaction)
3) combine those values

This might of course give up to a second difference compared to the
actual time
(because of the gettimeofday precision), but IMHO that's fine.

The link I posted showed a technique which essentially uses edge
detection on gettimeofday() to get an accurate correspondence between
the time of day and the performance counter, following which you can
supposedly calculate the time of day with a high degree of accuracy just
from the performance counter. You should be able to do this just once,
at program start. If you don't care that much about the initial
precision then your procedure should work fine, I think.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Andrew Dunstan
andrew@dunslane.net
In reply to: Tomas Vondra (#17)
Re: review: pgbench - aggregation of info written into log

On 01/17/2013 06:04 AM, Tomas Vondra wrote:

The problem is I have access to absolutely no Windows machines,
not mentioning the development tools (and that I have no clue about it).

I vaguely remember there were people on this list doing Windows
development
on a virtual machine or something. Any interest in working on this /
giving
me some help?

One of the items on my very long TODO list (see stuff elsewhere about
committers not doing enough reviewing and committing) is to create a
package that can easily be run to set Windows Postgres development
environments for MSVC, Mingw and Cygwin on Amazon, GoGrid etc.

Once I get that done I'll be a lot less sympathetic to "I don't have
access to Windows" pleas.

Windows does run in a VM very well, but if you're going to set it up on
your own VM environment, (e.h. VirtualBox or KVM/qemu) you need your own
legal copy to install there. If you don't already have one, that will
set you back about $140.00 (for w7 Pro) in the USA. Note that that's
significantly better than the situation with OSX, which you can't run at
all except on Apple hardware.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Dave Page
dpage@pgadmin.org
In reply to: Andrew Dunstan (#20)
#22Magnus Hagander
magnus@hagander.net
In reply to: Andrew Dunstan (#20)
#23Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Magnus Hagander (#22)
#24Robert Haas
robertmhaas@gmail.com
In reply to: Tatsuo Ishii (#23)
#25Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Robert Haas (#24)
#26Craig Ringer
craig@2ndquadrant.com
In reply to: Magnus Hagander (#22)
#27Noah Misch
noah@leadboat.com
In reply to: Craig Ringer (#26)
#28Craig Ringer
craig@2ndquadrant.com
In reply to: Noah Misch (#27)
#29Dave Page
dpage@pgadmin.org
In reply to: Craig Ringer (#28)
#30Andrew Dunstan
andrew@dunslane.net
In reply to: Dave Page (#29)
#31Dave Page
dpage@pgadmin.org
In reply to: Andrew Dunstan (#30)
#32Magnus Hagander
magnus@hagander.net
In reply to: Dave Page (#31)
#33Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#25)
#34Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tatsuo Ishii (#33)