PATCH: pgbench - random sampling of transactions written into log
Hi,
attached is a patch that adds support for random sampling in pgbench when
it's executed with the "-l" flag. You can do for example this:
$ pgbench -l -T 120 -R 1 db
and then only 1% of the transactions will be written into the log file. If you
omit the flag, all the transactions are written (i.e. it's backward
compatible).
Recently I've been using pgbench on hardware that can handle a lot of
transactions (various flavors of SSDs, lots of RAM, ...), and in such
cases the log files may get pretty big (even several gigabytes for a
single 1-hour run). A reasonable random sample (along with the stats
provided by pgbench itself) is often more than enough in such cases.
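(For readers post-processing such a sampled log: the scale-up is simple. A minimal Python sketch - illustrative only, not part of the patch - of recovering the real tps from a sampled log:)

```python
# Estimate the actual tps from a sampled pgbench transaction log.
# With -R 1 (a 1% sample), sample_rate would be 0.01.
def estimate_tps(logged_txns, duration_secs, sample_rate):
    """Scale the sampled transaction count back up to the full run."""
    return logged_txns / sample_rate / duration_secs

# e.g. 120 logged lines over 60 seconds at a 1% sample ~ 200 tps overall
print(estimate_tps(120, 60, 0.01))
```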
kind regards
Tomas
Attachments:
pgbench-sampling.diff (text/plain)
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 00cab73..e849b2b 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -145,6 +145,9 @@ char *index_tablespace = NULL;
#define naccounts 100000
bool use_log; /* log transaction latencies to a file */
+bool use_log_sampling; /* sample the log randomly */
+int nsample_rate = 100; /* default log sampling rate */
+
bool is_connect; /* establish connection for each transaction */
bool is_latencies; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
@@ -364,6 +367,7 @@ usage(void)
" -f FILENAME read transaction script from FILENAME\n"
" -j NUM number of threads (default: 1)\n"
" -l write transaction times to log file\n"
+ " -R NUM log sampling rate in pct (default: 100)\n"
" -M simple|extended|prepared\n"
" protocol for submitting queries to server (default: simple)\n"
" -n do not run VACUUM before tests\n"
@@ -877,21 +881,25 @@ top:
instr_time diff;
double usec;
- INSTR_TIME_SET_CURRENT(now);
- diff = now;
- INSTR_TIME_SUBTRACT(diff, st->txn_begin);
- usec = (double) INSTR_TIME_GET_MICROSEC(diff);
+ /* either no sampling or is within the sample */
+ if ((! use_log_sampling) || (rand() % 100 < nsample_rate)) {
+
+ INSTR_TIME_SET_CURRENT(now);
+ diff = now;
+ INSTR_TIME_SUBTRACT(diff, st->txn_begin);
+ usec = (double) INSTR_TIME_GET_MICROSEC(diff);
#ifndef WIN32
- /* This is more than we really ought to know about instr_time */
- fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
- st->id, st->cnt, usec, st->use_file,
- (long) now.tv_sec, (long) now.tv_usec);
+ /* This is more than we really ought to know about instr_time */
+ fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
+ st->id, st->cnt, usec, st->use_file,
+ (long) now.tv_sec, (long) now.tv_usec);
#else
- /* On Windows, instr_time doesn't provide a timestamp anyway */
- fprintf(logfile, "%d %d %.0f %d 0 0\n",
- st->id, st->cnt, usec, st->use_file);
+ /* On Windows, instr_time doesn't provide a timestamp anyway */
+ fprintf(logfile, "%d %d %.0f %d 0 0\n",
+ st->id, st->cnt, usec, st->use_file);
#endif
+ }
}
if (commands[st->state]->type == SQL_COMMAND)
@@ -1962,7 +1970,7 @@ main(int argc, char **argv)
state = (CState *) xmalloc(sizeof(CState));
memset(state, 0, sizeof(CState));
- while ((c = getopt_long(argc, argv, "ih:nvp:dSNc:j:Crs:t:T:U:lf:D:F:M:", long_options, &optindex)) != -1)
+ while ((c = getopt_long(argc, argv, "ih:nvp:dSNc:j:Crs:t:T:U:lf:R:D:F:M:", long_options, &optindex)) != -1)
{
switch (c)
{
@@ -2070,6 +2078,15 @@ main(int argc, char **argv)
case 'l':
use_log = true;
break;
+ case 'R':
+ use_log_sampling = true;
+ nsample_rate = atoi(optarg);
+ if (nsample_rate <= 0 || nsample_rate > 100)
+ {
+ fprintf(stderr, "invalid sampling rate: %d\n", nsample_rate);
+ exit(1);
+ }
+ break;
case 'f':
ttype = 3;
filename = optarg;
@@ -2158,6 +2175,12 @@ main(int argc, char **argv)
exit(1);
}
+ /* -R may be used only with -l */
+ if (use_log_sampling && (! use_log)) {
+ fprintf(stderr, "log sampling rate is allowed only when logging transactions\n");
+ exit(1);
+ }
+
/*
* is_latencies only works with multiple threads in thread-based
* implementations, not fork-based ones, because it supposes that the
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 437fcea..962e446 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -317,6 +317,22 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</varlistentry>
<varlistentry>
+ <term><option>-R</option> <replaceable>rate</></term>
+ <listitem>
+ <para>
+ Sampling rate, used when writing data into the log in percent. 100 means all
+ transactions will be logged, 1 means only 1% of the transactions will be logged.
+ Default is 100 (all transactions).
+ </para>
+ <para>
+ Be careful when processing the log file - e.g. when computing tps values, you
+ need to multiply the numbers accordingly (e.g. with 1% sample you'll get 1/100
+ of the actual tps).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-M</option> <replaceable>querymode</></term>
<listitem>
<para>

On Fri, Aug 24, 2012 at 2:16 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Hi,
attached is a patch that adds support for random sampling in pgbench when
it's executed with the "-l" flag. You can do for example this:
$ pgbench -l -T 120 -R 1 db
and then only 1% of the transactions will be written into the log file. If you
omit the flag, all the transactions are written (i.e. it's backward
compatible).
Hi Tomas,
You use the rand() function. Isn't that function not thread-safe?
Or, if it is thread-safe, does it accomplish that with a mutex? That
was a problem with a different rand function used in pgbench that
Robert Haas fixed a while ago, 4af43ee3f165c8e4b332a7e680.
Also, what benefit is had by using modulus on rand(), rather than just
modulus on an incrementing counter?
Could you explain the value of this patch, given your other one that
does aggregation? If both were accepted, I think I would always use
the aggregation one in preference to this one.
Thanks,
Jeff
On 26.8.2012 00:19, Jeff Janes wrote:
On Fri, Aug 24, 2012 at 2:16 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Hi,
attached is a patch that adds support for random sampling in pgbench when
it's executed with the "-l" flag. You can do for example this:
$ pgbench -l -T 120 -R 1 db
and then only 1% of the transactions will be written into the log file. If you
omit the flag, all the transactions are written (i.e. it's backward
compatible).

Hi Tomas,
You use the rand() function. Isn't that function not thread-safe?
Or, if it is thread-safe, does it accomplish that with a mutex? That
was a problem with a different rand function used in pgbench that
Robert Haas fixed a while ago, 4af43ee3f165c8e4b332a7e680.
Hi Jeff,
Aha! Good catch. I've used rand(), which seems to be neither reentrant nor
thread-safe (unless the man page is incorrect). Anyway, using pg_erand48
or getrand seems like an appropriate fix.
Also, what benefit is had by using modulus on rand(), rather than just
modulus on an incrementing counter?
Hmm, I was thinking about that too, but I wasn't really sure how that
would behave with multiple SQL files etc. But now I see the files are
actually chosen randomly, so using a counter seems like a good idea.
Could you explain the value of this patch, given your other one that
does aggregation? If both were accepted, I think I would always use
the aggregation one in preference to this one.
The difference is that the sample contains information that is simply
unavailable in the aggregated output. For example when using multiple
files, you can compute per-file averages from the sample, but the
aggregated output contains just a single line for all files combined.
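(To make the point concrete - a hypothetical Python sketch, using the log line format from the patch's fprintf call: client id, transaction count, latency in microseconds, file number, then the timestamp:)

```python
from collections import defaultdict

def per_file_avg_latency(log_lines):
    """Average latency (usec) per script file from pgbench -l output.

    Each line: client_id txn_cnt latency_usec file_no time_sec time_usec
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in log_lines:
        fields = line.split()
        file_no, usec = int(fields[3]), float(fields[2])
        sums[file_no] += usec
        counts[file_no] += 1
    return {f: sums[f] / counts[f] for f in sums}

lines = ["0 1 1500 0 1345800000 123456",
         "1 1 2500 0 1345800000 234567",
         "0 2 4000 1 1345800001 345678"]
print(per_file_avg_latency(lines))  # file 0 averages 2000.0, file 1 4000.0
```

The aggregated output can't recover this, since it has already collapsed all files into one line.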
Tomas
On 26.8.2012 02:48, Tomas Vondra wrote:
On 26.8.2012 00:19, Jeff Janes wrote:
On Fri, Aug 24, 2012 at 2:16 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Hi,
attached is a patch that adds support for random sampling in pgbench when
it's executed with the "-l" flag. You can do for example this:
$ pgbench -l -T 120 -R 1 db
and then only 1% of the transactions will be written into the log file. If you
omit the flag, all the transactions are written (i.e. it's backward
compatible).

Hi Tomas,
You use the rand() function. Isn't that function not thread-safe?
Or, if it is thread-safe, does it accomplish that with a mutex? That
was a problem with a different rand function used in pgbench that
Robert Haas fixed a while ago, 4af43ee3f165c8e4b332a7e680.

Hi Jeff,
Aha! Good catch. I've used rand(), which seems to be neither reentrant nor
thread-safe (unless the man page is incorrect). Anyway, using pg_erand48
or getrand seems like an appropriate fix.

Also, what benefit is had by using modulus on rand(), rather than just
modulus on an incrementing counter?

Hmm, I was thinking about that too, but I wasn't really sure how that
would behave with multiple SQL files etc. But now I see the files are
actually chosen randomly, so using a counter seems like a good idea.
Attached is an improved patch, with a call to rand() replaced with
getrand().
I was thinking about the counter, but I'm not really sure how to handle
cases like "39%" - I'm not sure a plain (counter % 100 < 39) is a
good sampling method, because it always keeps continuous sequences of
transactions. Maybe there's a clever way to use a counter, but let's
stick to getrand() unless we can prove it's causing issues. Especially
considering that a lot of data won't be written at all with low
sampling rates.
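(The contiguity concern is easy to demonstrate with a standalone simulation - nothing from pgbench itself: a counter-based filter selects the first 39 transactions of every hundred as one unbroken run, while a Bernoulli draw spreads the sample across the whole sequence:)

```python
import random

def longest_run(selected):
    """Length of the longest run of consecutively selected transaction ids."""
    best = cur = 0
    prev = None
    for i in selected:
        cur = cur + 1 if prev is not None and i == prev + 1 else 1
        best = max(best, cur)
        prev = i
    return best

N = 10000
counter_sample = [i for i in range(N) if i % 100 < 39]   # contiguous blocks
random.seed(42)
bernoulli_sample = [i for i in range(N) if random.random() < 0.39]

print(longest_run(counter_sample))    # exactly 39, every hundred transactions
print(longest_run(bernoulli_sample))  # much shorter, randomly placed runs
```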
kind regards
Tomas
Attachments:
pgbench-sampling-v2.diff (text/plain; charset=UTF-8)
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 00cab73..12c1338 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -145,6 +145,9 @@ char *index_tablespace = NULL;
#define naccounts 100000
bool use_log; /* log transaction latencies to a file */
+bool use_log_sampling; /* sample the log randomly */
+int nsample_rate = 100; /* default log sampling rate */
+
bool is_connect; /* establish connection for each transaction */
bool is_latencies; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
@@ -364,6 +367,7 @@ usage(void)
" -f FILENAME read transaction script from FILENAME\n"
" -j NUM number of threads (default: 1)\n"
" -l write transaction times to log file\n"
+ " -R NUM log sampling rate in pct (default: 100)\n"
" -M simple|extended|prepared\n"
" protocol for submitting queries to server (default: simple)\n"
" -n do not run VACUUM before tests\n"
@@ -877,21 +881,25 @@ top:
instr_time diff;
double usec;
- INSTR_TIME_SET_CURRENT(now);
- diff = now;
- INSTR_TIME_SUBTRACT(diff, st->txn_begin);
- usec = (double) INSTR_TIME_GET_MICROSEC(diff);
+ /* either no sampling or is within the sample */
+ if ((! use_log_sampling) || (getrand(thread, 0, 99) < nsample_rate)) {
+
+ INSTR_TIME_SET_CURRENT(now);
+ diff = now;
+ INSTR_TIME_SUBTRACT(diff, st->txn_begin);
+ usec = (double) INSTR_TIME_GET_MICROSEC(diff);
#ifndef WIN32
- /* This is more than we really ought to know about instr_time */
- fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
- st->id, st->cnt, usec, st->use_file,
- (long) now.tv_sec, (long) now.tv_usec);
+ /* This is more than we really ought to know about instr_time */
+ fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
+ st->id, st->cnt, usec, st->use_file,
+ (long) now.tv_sec, (long) now.tv_usec);
#else
- /* On Windows, instr_time doesn't provide a timestamp anyway */
- fprintf(logfile, "%d %d %.0f %d 0 0\n",
- st->id, st->cnt, usec, st->use_file);
+ /* On Windows, instr_time doesn't provide a timestamp anyway */
+ fprintf(logfile, "%d %d %.0f %d 0 0\n",
+ st->id, st->cnt, usec, st->use_file);
#endif
+ }
}
if (commands[st->state]->type == SQL_COMMAND)
@@ -1962,7 +1970,7 @@ main(int argc, char **argv)
state = (CState *) xmalloc(sizeof(CState));
memset(state, 0, sizeof(CState));
- while ((c = getopt_long(argc, argv, "ih:nvp:dSNc:j:Crs:t:T:U:lf:D:F:M:", long_options, &optindex)) != -1)
+ while ((c = getopt_long(argc, argv, "ih:nvp:dSNc:j:Crs:t:T:U:lf:R:D:F:M:", long_options, &optindex)) != -1)
{
switch (c)
{
@@ -2070,6 +2078,15 @@ main(int argc, char **argv)
case 'l':
use_log = true;
break;
+ case 'R':
+ use_log_sampling = true;
+ nsample_rate = atoi(optarg);
+ if (nsample_rate <= 0 || nsample_rate > 100)
+ {
+ fprintf(stderr, "invalid sampling rate: %d\n", nsample_rate);
+ exit(1);
+ }
+ break;
case 'f':
ttype = 3;
filename = optarg;
@@ -2158,6 +2175,12 @@ main(int argc, char **argv)
exit(1);
}
+ /* -R may be used only with -l */
+ if (use_log_sampling && (! use_log)) {
+ fprintf(stderr, "log sampling rate is allowed only when logging transactions\n");
+ exit(1);
+ }
+
/*
* is_latencies only works with multiple threads in thread-based
* implementations, not fork-based ones, because it supposes that the
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 437fcea..962e446 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -317,6 +317,22 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</varlistentry>
<varlistentry>
+ <term><option>-R</option> <replaceable>rate</></term>
+ <listitem>
+ <para>
+ Sampling rate, used when writing data into the log in percent. 100 means all
+ transactions will be logged, 1 means only 1% of the transactions will be logged.
+ Default is 100 (all transactions).
+ </para>
+ <para>
+ Be careful when processing the log file - e.g. when computing tps values, you
+ need to multiply the numbers accordingly (e.g. with 1% sample you'll get 1/100
+ of the actual tps).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-M</option> <replaceable>querymode</></term>
<listitem>
<para>
On Sun, Aug 26, 2012 at 1:04 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Attached is an improved patch, with a call to rand() replaced with
getrand().

I was thinking about the counter, but I'm not really sure how to handle
cases like "39%" - I'm not sure a plain (counter % 100 < 39) is a
good sampling method, because it always keeps continuous sequences of
transactions. Maybe there's a clever way to use a counter, but let's
stick to getrand() unless we can prove it's causing issues. Especially
considering that a lot of data won't be written at all with low
sampling rates.
I like this patch, and I think sticking with a random number is a good
idea. But I have two suggestions. Number one, I think the sampling
rate should be stored as a float, not an integer, because I can easily
imagine wanting a sampling rate that is not an integer percentage -
especially, one that is less than one percent, like half a percent or
a tenth of a percent. Also, I suggest that the command-line option
should be a long option rather than a single character option. That
will be more mnemonic and avoid using up too many single letter
options, of which there is a limited supply. So to sample every
hundredth result, you could do something like this:
pgbench --latency-sample-rate 0.01
Another option I personally think would be useful is an option to
record only those latencies that are above some minimum bound, like
this:
pgbench --latency-only-if-more-than $MICROSECONDS
The problem with recording all the latencies is that it tends to have
a material impact on throughput. Your patch should address that for
the case where you just want to characterize the latency, but it would
also be nice to have a way of recording the outliers.
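(A post-processing sketch of that idea in Python - hypothetical, since the point of the proposed option is to filter before writing: keep only the log lines whose latency field exceeds a bound:)

```python
def latency_outliers(log_lines, min_usec):
    """Keep only pgbench -l lines whose latency (3rd field) exceeds min_usec."""
    return [line for line in log_lines
            if float(line.split()[2]) > min_usec]

lines = ["0 1 1500 0 0 0", "0 2 250000 0 1 0", "1 1 980 0 1 0"]
print(latency_outliers(lines, 10000))  # only the 250000-usec transaction
```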
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 30 August 2012, 17:46, Robert Haas wrote:
On Sun, Aug 26, 2012 at 1:04 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
Attached is an improved patch, with a call to rand() replaced with
getrand().

I was thinking about the counter, but I'm not really sure how to handle
cases like "39%" - I'm not sure a plain (counter % 100 < 39) is a
good sampling method, because it always keeps continuous sequences of
transactions. Maybe there's a clever way to use a counter, but let's
stick to getrand() unless we can prove it's causing issues. Especially
considering that a lot of data won't be written at all with low
sampling rates.

I like this patch, and I think sticking with a random number is a good
idea. But I have two suggestions. Number one, I think the sampling
rate should be stored as a float, not an integer, because I can easily
imagine wanting a sampling rate that is not an integer percentage -
especially, one that is less than one percent, like half a percent or
a tenth of a percent. Also, I suggest that the command-line option
should be a long option rather than a single character option. That
will be more mnemonic and avoid using up too many single letter
options, of which there is a limited supply. So to sample every
hundredth result, you could do something like this:

pgbench --latency-sample-rate 0.01
Right, I was thinking about that too. I'll do that in the next version of
the patch.
Another option I personally think would be useful is an option to
record only those latencies that are above some minimum bound, like
this:

pgbench --latency-only-if-more-than $MICROSECONDS
The problem with recording all the latencies is that it tends to have
a material impact on throughput. Your patch should address that for
the case where you just want to characterize the latency, but it would
also be nice to have a way of recording the outliers.
That sounds like a pretty trivial patch. I've been thinking about yet
another option - histograms (regular or with exponential bins).
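(For illustration, exponential bins are cheap to compute - a Python sketch under the assumption of power-of-two bin edges, not actual proposed pgbench code: latency u falls into bin floor(log2(u)), so bin k covers [2^k, 2^(k+1)) microseconds:)

```python
import math
from collections import Counter

def exp_bin(usec):
    """Bin index for an exponential histogram: bin k covers [2**k, 2**(k+1))."""
    return int(math.floor(math.log2(usec)))

def exp_histogram(latencies_usec):
    """Count latencies per exponential bin."""
    return Counter(exp_bin(u) for u in latencies_usec)

print(exp_bin(1000))   # 512 <= 1000 < 1024, so bin 9
print(exp_histogram([700, 900, 1500, 70000]))
```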
What I'm not sure about is which of these options should be allowed at the
same time - to me, combinations like 'sampling + aggregation' don't make
much sense. Maybe except 'latency-only-if-more-than + aggregation'.
Tomas
On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
That sounds like a pretty trivial patch. I've been thinking about yet
another option - histograms (regular or with exponential bins).
I thought about that, too, but I think high-outliers is a lot more
useful. At least for the kinds of things I worry about.
What I'm not sure about is which of these options should be allowed at the
same time - to me, combinations like 'sampling + aggregation' don't make
much sense. Maybe except 'latency-only-if-more-than + aggregation'.
Maybe, but perhaps not even. I don't have a problem with saying that
the user is allowed to pick at most one method of reducing the output
volume. I think we could say: either sample, or take high outliers,
or aggregate. If we want to make some of those work in combination,
fine, but I don't think it's absolutely required. What WILL be
required is a clear error message telling you what you did wrong if
you use an unsupported combination.
BTW, I think that using any of these options should automatically
enable -l, rather than requiring it to be specified separately.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 30 August 2012, 23:44, Robert Haas wrote:
On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
That sounds like a pretty trivial patch. I've been thinking about yet
another option - histograms (regular or with exponential bins).

I thought about that, too, but I think high-outliers is a lot more
useful. At least for the kinds of things I worry about.

OK, let's fix the current patches first. We can add more options later.

What I'm not sure about is which of these options should be allowed at the
same time - to me, combinations like 'sampling + aggregation' don't make
much sense. Maybe except 'latency-only-if-more-than + aggregation'.

Maybe, but perhaps not even. I don't have a problem with saying that
the user is allowed to pick at most one method of reducing the output
volume. I think we could say: either sample, or take high outliers,
or aggregate. If we want to make some of those work in combination,
fine, but I don't think it's absolutely required. What WILL be
required is a clear error message telling you what you did wrong if
you use an unsupported combination.
I'll allow a single option - we can enable combinations that make sense
later.
BTW, I think that using any of these options should automatically
enable -l, rather than requiring it to be specified separately.
Good idea.
Tomas
On 31.08.2012 00:01, Tomas Vondra wrote:
On 30 Srpen 2012, 23:44, Robert Haas wrote:
On Thu, Aug 30, 2012 at 3:48 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
That sounds like a pretty trivial patch. I've been thinking about yet
another option - histograms (regular or with exponential bins).

I thought about that, too, but I think high-outliers is a lot more
useful. At least for the kinds of things I worry about.

OK, let's fix the current patches first. We can add more options later.
So, here is a fixed version of the patch. I've made these changes:
1) The option is now '--sampling-rate' instead of '-R' and accepts
float arguments. I've decided not to use 0.01 = 1% but use 1 = 1%, so it
accepts values between 0 and 100. I find this more natural.
2) I've removed one of the two new variables, the remaining one is used
both to check whether the sampling is enabled and keep the value.
3) I've decided not to enable "-l" whenever the "--sampling-rate" is
used. Again, I find this a bit less magic behavior.
4) I've fixed some formatting issues - if you notice another one that I
don't see, let me know.
Tomas
Attachments:
pgbench-sampling-v3.diff (text/plain)
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 00cab73..578aeb3 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -130,6 +130,11 @@ int foreign_keys = 0;
int unlogged_tables = 0;
/*
+ * use log sampling (rate => 1 = 100%, 0 = don't use)
+ */
+double use_sample_rate = 0.0;
+
+/*
* tablespace selection
*/
char *tablespace = NULL;
@@ -364,6 +369,8 @@ usage(void)
" -f FILENAME read transaction script from FILENAME\n"
" -j NUM number of threads (default: 1)\n"
" -l write transaction times to log file\n"
+ " --sampling-rate NUM\n"
+ " sampling rate of the log (e.g. 0.01 for 1%% sample)\n"
" -M simple|extended|prepared\n"
" protocol for submitting queries to server (default: simple)\n"
" -n do not run VACUUM before tests\n"
@@ -877,21 +884,26 @@ top:
instr_time diff;
double usec;
- INSTR_TIME_SET_CURRENT(now);
- diff = now;
- INSTR_TIME_SUBTRACT(diff, st->txn_begin);
- usec = (double) INSTR_TIME_GET_MICROSEC(diff);
+ /* either no sampling or is within the sample */
+ if ((use_sample_rate == 0.0) || (pg_erand48(thread->random_state) <= use_sample_rate))
+ {
+
+ INSTR_TIME_SET_CURRENT(now);
+ diff = now;
+ INSTR_TIME_SUBTRACT(diff, st->txn_begin);
+ usec = (double) INSTR_TIME_GET_MICROSEC(diff);
#ifndef WIN32
- /* This is more than we really ought to know about instr_time */
- fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
- st->id, st->cnt, usec, st->use_file,
- (long) now.tv_sec, (long) now.tv_usec);
+ /* This is more than we really ought to know about instr_time */
+ fprintf(logfile, "%d %d %.0f %d %ld %ld\n",
+ st->id, st->cnt, usec, st->use_file,
+ (long) now.tv_sec, (long) now.tv_usec);
#else
- /* On Windows, instr_time doesn't provide a timestamp anyway */
- fprintf(logfile, "%d %d %.0f %d 0 0\n",
- st->id, st->cnt, usec, st->use_file);
+ /* On Windows, instr_time doesn't provide a timestamp anyway */
+ fprintf(logfile, "%d %d %.0f %d 0 0\n",
+ st->id, st->cnt, usec, st->use_file);
#endif
+ }
}
if (commands[st->state]->type == SQL_COMMAND)
@@ -1918,6 +1930,7 @@ main(int argc, char **argv)
{"index-tablespace", required_argument, NULL, 3},
{"tablespace", required_argument, NULL, 2},
{"unlogged-tables", no_argument, &unlogged_tables, 1},
+ {"sampling-rate", required_argument, NULL, 4},
{NULL, 0, NULL, 0}
};
@@ -2123,6 +2136,14 @@ main(int argc, char **argv)
case 3: /* index-tablespace */
index_tablespace = optarg;
break;
+ case 4:
+ use_sample_rate = atof(optarg)/100;
+ if (use_sample_rate <= 0.0 || use_sample_rate > 1.0)
+ {
+ fprintf(stderr, "invalid sampling rate: %f\n", use_sample_rate);
+ exit(1);
+ }
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -2158,6 +2179,12 @@ main(int argc, char **argv)
exit(1);
}
+ /* --sampling-rate may be used only with -l */
+ if (use_sample_rate > 0 && (! use_log)) {
+ fprintf(stderr, "log sampling rate is allowed only when logging transactions\n");
+ exit(1);
+ }
+
/*
* is_latencies only works with multiple threads in thread-based
* implementations, not fork-based ones, because it supposes that the
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 437fcea..67b21aa 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -317,6 +317,22 @@ pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</varlistentry>
<varlistentry>
+ <term><option>--sampling-rate</option> <replaceable>rate</></term>
+ <listitem>
+ <para>
+ Sampling rate, used when writing data into the log in percent. 100 means all
+ transactions will be logged, 1 means only 1% of the transactions will be logged.
+ This may also be a float, e.g. 12.5 means 12.5% will be sampled.
+ </para>
+ <para>
+ Be careful when processing the log file - e.g. when computing tps values, you
+ need to multiply the numbers accordingly (e.g. with 1% sample you'll get 1/100
+ of the actual tps).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-M</option> <replaceable>querymode</></term>
<listitem>
<para>
On 03.09.2012 01:40, Tomas Vondra wrote:
So, here is a fixed version of the patch. I've made these changes:
Committed with some minor kibitzing.
1) The option is now '--sampling-rate' instead of '-R' and accepts float
arguments. I've decided not to use 0.01 = 1% but use 1 = 1%, so it
accepts values between 0 and 100. I find this more natural.
I find a fraction, i.e. 0.01 = 1%, more natural, so I changed it back to
that. I realize this is bike-shedding, but I believe Robert also found
this more natural, as that's what he suggested upthread.
I edited the comments and variable names a little bit, and also added a
small paragraph to the pgbench user manual about this under
"Per-Transaction Logging", in addition to the explanation under
"Benchmarking Options".
- Heikki