Ragged latency log data in multi-threaded pgbench

Started by Greg Smithalmost 16 years ago14 messages
#1Greg Smith
greg@2ndquadrant.com

Just noticed a problem popping up sometimes with the new multi-threaded
pgbench. This is on a Linux RPM build (the alpha4 set) compiled with
'--disable-thread-safety'. Still trying to nail down whether that's a
requirement for this problem to appear or not. I did most of my review
of this feature with it turned on, and haven't been seeing this problem
on other systems that are thread safe. Not sure yet if that's cause and
effect or coincidence yet.

Here's a sample invocation that produces ragged output for me on my one
system:

pgbench -S -T 5 -c 4 -j 4 -l pgbench

The log file produced by this (pgbench_log.<pid>) is supposed to consist
of a series of lines in the following format:

client,trans,latency,filenum,sec,usec

It looks like the switch between clients running on separate workers can
lead to a mix of their respective lines showing up though. Here's a
couple of typical samples, with the bad line in the middle of each set:

1 138 178 0 1268665788 607559
1 139 182 0 1268665788 607751
1 1402 0 2491 0 1268665788 586135
2 1 264 0 1268665788 586463
2 2 192 0 1268665788 586665

1 274 160 0 1268665788 632966
1 275 178 0 1268665788 633154
1 276 184 0 126866578 178 0 1268665788 614015
2 141 190 0 1268665788 614252
2 142 169 0 1268665788 614430

2 274 178 0 1268665788 639218
2 275 175 0 1268665788 639402
2 276 169 0 126866578 171 0 1268665788 626933
0 141 185 0 1268665788 627165
0 142 202 0 1268665788 627377

Looks like sometimes a client is only getting part of its line written
out before getting stomped on by the next one. I think one of the
assumptions being made about how to safely write to this log file may be
broken by the multi-process implementation, which is what you get when
thread-safety is not available.

Since there should only be 6 fields here, I think you can find whether a
given log file has this problem or not like this:

cat pgbench_log.xxxxx | cut -d " " -f 7 | sort | uniq

If anything comes out of that, the latency log file has at least one bad
line in it.

Similarly, this:

cat pgbench_log.xxxxx | cut -d " " -f 1 | sort | uniq

Should only show the client numbers; here there's some first columns
with much bigger numbers too.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Smith (#1)
Re: Ragged latency log data in multi-threaded pgbench

Greg Smith wrote:

Just noticed a problem popping up sometimes with the new
multi-threaded pgbench. This is on a Linux RPM build (the alpha4 set)
compiled with '--disable-thread-safety'. Still trying to nail down
whether that's a requirement for this problem to appear or not. I did
most of my review of this feature with it turned on, and haven't been
seeing this problem on other systems that are thread safe. Not sure
yet if that's cause and effect or coincidence yet.

We had to turn handsprings to prevent this sort of effect with the
logging collector, which was a requirement of being able to implement
CSV logging sanely. So I'm not surprised by this report.

cheers

andrew

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#1)
Re: Ragged latency log data in multi-threaded pgbench

Greg Smith <greg@2ndquadrant.com> writes:

Looks like sometimes a client is only getting part of its line written
out before getting stomped on by the next one. I think one of the
assumptions being made about how to safely write to this log file may be
broken by the multi-process implementation, which is what you get when
thread-safety is not available.

pgbench doesn't make any effort at all to avoid interleaved writes on
that file. I don't think there is anything much that can be done about
it when you are using the forked-processes implementation. It's
probably possible for it to show up on the multi-threads version too,
depending on how hard libc tries to interlock stdio calls.

regards, tom lane

#4Josh Berkus
josh@agliodbs.com
In reply to: Greg Smith (#1)
Re: Ragged latency log data in multi-threaded pgbench

On 3/15/10 8:41 AM, Greg Smith wrote:

Just noticed a problem popping up sometimes with the new multi-threaded
pgbench. This is on a Linux RPM build (the alpha4 set) compiled with
'--disable-thread-safety'. Still trying to nail down whether that's a
requirement for this problem to appear or not. I did most of my review
of this feature with it turned on, and haven't been seeing this problem
on other systems that are thread safe. Not sure yet if that's cause and
effect or coincidence yet.

For my part, telling people that multi-thread pgbench doesn't work
correctly on systems which are not thread-safe seems perfectly OK.

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com

#5Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Greg Smith (#1)
Re: Ragged latency log data in multi-threaded pgbench

Greg Smith <greg@2ndquadrant.com> wrote:

It looks like the switch between clients running on separate workers can
lead to a mix of their respective lines showing up though.

Oops. There might be two solutions for the issue:
1. Use explicit locks. The lock primitive will be pthread_mutex for
multi-threaded implementations or semaphore for multi-threaded ones.
2. Use per-thread log files.
File names would be "pgbench_log.<main-process-id>.<thread-id>".

Which is better, or another idea?

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#6Greg Smith
greg@2ndquadrant.com
In reply to: Takahiro Itagaki (#5)
Re: Ragged latency log data in multi-threaded pgbench

Takahiro Itagaki wrote:

1. Use explicit locks. The lock primitive will be pthread_mutex for
multi-threaded implementations or semaphore for multi-threaded ones.
2. Use per-thread log files.
File names would be "pgbench_log.<main-process-id>.<thread-id>".

I'm concerned that the locking itself will turn into a new pgbench
bottleneck, just as we're clearing the point where it's not for the
first time in a while. And that sounds like it has its own potential
risks/complexity involved.

I could live with per-thread log files. I think my pgbench-tools is the
main consumer of these latency logs floating around right now, I just
pushed a 9.0 update to handle the multiple workers option today that
discovered this). It doesn't make any difference to what I'm doing how
many file I have to process. Just a few lines of extra shell code for
me to pull the rest into the import. That seems like the simplest
solution that's guaranteed to work, just push the problem onto the
client side instead where it's easier to deal with.

Unless someone feels strongly that these have to be interleaved into one
file, based on Andrew's suggestion that this is a hard problem to get
right and Tom's suggestion that this might even extend into the proper
threaded version too, I think a log file per worker is the easiest way
out of this.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Takahiro Itagaki (#5)
Re: Ragged latency log data in multi-threaded pgbench

Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> writes:

Greg Smith <greg@2ndquadrant.com> wrote:

It looks like the switch between clients running on separate workers can
lead to a mix of their respective lines showing up though.

Oops. There might be two solutions for the issue:
1. Use explicit locks. The lock primitive will be pthread_mutex for
multi-threaded implementations or semaphore for multi-threaded ones.
2. Use per-thread log files.
File names would be "pgbench_log.<main-process-id>.<thread-id>".

I think #1 is out of the question, as the synchronization overhead will
do serious damage to the whole point of having a multithreaded pgbench.
#2 might be a reasonable idea.

regards, tom lane

#8Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Tom Lane (#7)
Re: Ragged latency log data in multi-threaded pgbench

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> writes:

Oops. There might be two solutions for the issue:
1. Use explicit locks. The lock primitive will be pthread_mutex for
multi-threaded implementations or semaphore for multi-threaded ones.
2. Use per-thread log files.
File names would be "pgbench_log.<main-process-id>.<thread-id>".

I think #1 is out of the question, as the synchronization overhead will
do serious damage to the whole point of having a multithreaded pgbench.
#2 might be a reasonable idea.

Ok, I'll go for #2.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#9Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Takahiro Itagaki (#8)
1 attachment(s)
Re: Ragged latency log data in multi-threaded pgbench

Takahiro Itagaki <itagaki.takahiro@oss.ntt.co.jp> wrote:

2. Use per-thread log files.
File names would be "pgbench_log.<main-process-id>.<thread-id>".

Here is a patch to implement per-thread log files for pgbench -l.

The log filenames are "pgbench_log.<main-process-id>.<thread-serial-number>"
for each thread, but the first thread (including single-threaded) still uses
"pgbench_log.<main-process-id>" for the name because of compatibility.

Example:
$ pgbench -c16 -j4 -l
$ ls
pgbench_log.2196 pgbench_log.2196.1 pgbench_log.2196.2 pgbench_log.2196.3

Comments and suggenstions welcome.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

Attachments:

pgbench_log_20100316.patchapplication/octet-stream; name=pgbench_log_20100316.patchDownload
diff -cprN head/contrib/pgbench/pgbench.c work/contrib/pgbench/pgbench.c
*** head/contrib/pgbench/pgbench.c	2010-02-26 11:29:55.507518000 +0900
--- work/contrib/pgbench/pgbench.c	2010-03-16 11:48:29.572863239 +0900
*************** int			fillfactor = 100;
*** 131,141 ****
  #define ntellers	10
  #define naccounts	100000
  
- FILE	   *LOGFILE = NULL;
- 
  bool		use_log;			/* log transaction latencies to a file */
! 
! int			is_connect;			/* establish connection for each transaction */
  
  char	   *pghost = "";
  char	   *pgport = "";
--- 131,139 ----
  #define ntellers	10
  #define naccounts	100000
  
  bool		use_log;			/* log transaction latencies to a file */
! bool		is_connect;			/* establish connection for each transaction */
! int			main_pid;			/* main process id used in log filename */
  
  char	   *pghost = "";
  char	   *pgport = "";
*************** typedef struct
*** 183,188 ****
--- 181,187 ----
   */
  typedef struct
  {
+ 	int			tid;			/* thread id */
  	pthread_t	thread;			/* thread handle */
  	CState	   *state;			/* array of CState */
  	int			nstate;			/* length of state[] */
*************** clientDone(CState *st, bool ok)
*** 741,747 ****
  
  /* return false iff client should be disconnected */
  static bool
! doCustom(CState *st, instr_time *conn_time)
  {
  	PGresult   *res;
  	Command   **commands;
--- 740,746 ----
  
  /* return false iff client should be disconnected */
  static bool
! doCustom(CState *st, instr_time *conn_time, FILE *log_file)
  {
  	PGresult   *res;
  	Command   **commands;
*************** top:
*** 778,784 ****
  		/*
  		 * transaction finished: record the time it took in the log
  		 */
! 		if (use_log && commands[st->state + 1] == NULL)
  		{
  			instr_time	now;
  			instr_time	diff;
--- 777,783 ----
  		/*
  		 * transaction finished: record the time it took in the log
  		 */
! 		if (log_file && commands[st->state + 1] == NULL)
  		{
  			instr_time	now;
  			instr_time	diff;
*************** top:
*** 791,802 ****
  
  #ifndef WIN32
  			/* This is more than we really ought to know about instr_time */
! 			fprintf(LOGFILE, "%d %d %.0f %d %ld %ld\n",
  					st->id, st->cnt, usec, st->use_file,
  					(long) now.tv_sec, (long) now.tv_usec);
  #else
  			/* On Windows, instr_time doesn't provide a timestamp anyway */
! 			fprintf(LOGFILE, "%d %d %.0f %d 0 0\n",
  					st->id, st->cnt, usec, st->use_file);
  #endif
  		}
--- 790,801 ----
  
  #ifndef WIN32
  			/* This is more than we really ought to know about instr_time */
! 			fprintf(log_file, "%d %d %.0f %d %ld %ld\n",
  					st->id, st->cnt, usec, st->use_file,
  					(long) now.tv_sec, (long) now.tv_usec);
  #else
  			/* On Windows, instr_time doesn't provide a timestamp anyway */
! 			fprintf(log_file, "%d %d %.0f %d 0 0\n",
  					st->id, st->cnt, usec, st->use_file);
  #endif
  		}
*************** top:
*** 857,863 ****
  		INSTR_TIME_ACCUM_DIFF(*conn_time, end, start);
  	}
  
! 	if (use_log && st->state == 0)
  		INSTR_TIME_SET_CURRENT(st->txn_begin);
  
  	if (commands[st->state]->type == SQL_COMMAND)
--- 856,862 ----
  		INSTR_TIME_ACCUM_DIFF(*conn_time, end, start);
  	}
  
! 	if (log_file && st->state == 0)
  		INSTR_TIME_SET_CURRENT(st->txn_begin);
  
  	if (commands[st->state]->type == SQL_COMMAND)
*************** main(int argc, char **argv)
*** 1833,1839 ****
  				}
  				break;
  			case 'C':
! 				is_connect = 1;
  				break;
  			case 's':
  				scale_given = true;
--- 1832,1838 ----
  				}
  				break;
  			case 'C':
! 				is_connect = true;
  				break;
  			case 's':
  				scale_given = true;
*************** main(int argc, char **argv)
*** 1955,1960 ****
--- 1954,1965 ----
  		exit(1);
  	}
  
+ 	/*
+ 	 * save main process id in the global variable because process id will be
+ 	 * changed after fork.
+ 	 */
+ 	main_pid = (int) getpid();
+ 
  	if (nclients > 1)
  	{
  		state = (CState *) realloc(state, sizeof(CState) * nclients);
*************** main(int argc, char **argv)
*** 1980,1999 ****
  		}
  	}
  
- 	if (use_log)
- 	{
- 		char		logpath[64];
- 
- 		snprintf(logpath, 64, "pgbench_log.%d", (int) getpid());
- 		LOGFILE = fopen(logpath, "w");
- 
- 		if (LOGFILE == NULL)
- 		{
- 			fprintf(stderr, "Couldn't open logfile \"%s\": %s", logpath, strerror(errno));
- 			exit(1);
- 		}
- 	}
- 
  	if (debug)
  	{
  		if (duration <= 0)
--- 1985,1990 ----
*************** main(int argc, char **argv)
*** 2111,2116 ****
--- 2102,2108 ----
  	threads = (TState *) malloc(sizeof(TState) * nthreads);
  	for (i = 0; i < nthreads; i++)
  	{
+ 		threads[i].tid = i;
  		threads[i].state = &state[nclients / nthreads * i];
  		threads[i].nstate = nclients / nthreads;
  		INSTR_TIME_SET_CURRENT(threads[i].start_time);
*************** main(int argc, char **argv)
*** 2159,2166 ****
  	INSTR_TIME_SET_CURRENT(total_time);
  	INSTR_TIME_SUBTRACT(total_time, start_time);
  	printResults(ttype, total_xacts, nclients, nthreads, total_time, conn_total_time);
- 	if (LOGFILE)
- 		fclose(LOGFILE);
  
  	return 0;
  }
--- 2151,2156 ----
*************** threadRun(void *arg)
*** 2171,2176 ****
--- 2161,2167 ----
  	TState	   *thread = (TState *) arg;
  	CState	   *state = thread->state;
  	TResult    *result;
+ 	FILE	   *log_file = NULL;		/* per-thread log file */
  	instr_time	start,
  				end;
  	int			nstate = thread->nstate;
*************** threadRun(void *arg)
*** 2180,2186 ****
  	result = malloc(sizeof(TResult));
  	INSTR_TIME_SET_ZERO(result->conn_time);
  
! 	if (is_connect == 0)
  	{
  		/* make connections to the database */
  		for (i = 0; i < nstate; i++)
--- 2171,2195 ----
  	result = malloc(sizeof(TResult));
  	INSTR_TIME_SET_ZERO(result->conn_time);
  
! 	/* open log file if requested */
! 	if (use_log)
! 	{
! 		char		logpath[64];
! 
! 		if (thread->tid == 0)
! 			snprintf(logpath, 64, "pgbench_log.%d", main_pid);
! 		else
! 			snprintf(logpath, 64, "pgbench_log.%d.%d", main_pid, thread->tid);
! 		log_file = fopen(logpath, "w");
! 
! 		if (log_file == NULL)
! 		{
! 			fprintf(stderr, "Couldn't open logfile \"%s\": %s", logpath, strerror(errno));
! 			goto done;
! 		}
! 	}
! 
! 	if (!is_connect)
  	{
  		/* make connections to the database */
  		for (i = 0; i < nstate; i++)
*************** threadRun(void *arg)
*** 2202,2208 ****
  		int			prev_ecnt = st->ecnt;
  
  		st->use_file = getrand(0, num_files - 1);
! 		if (!doCustom(st, &result->conn_time))
  			remains--;			/* I've aborted */
  
  		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
--- 2211,2217 ----
  		int			prev_ecnt = st->ecnt;
  
  		st->use_file = getrand(0, num_files - 1);
! 		if (!doCustom(st, &result->conn_time, log_file))
  			remains--;			/* I've aborted */
  
  		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
*************** threadRun(void *arg)
*** 2304,2310 ****
  			if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask)
  							|| commands[st->state]->type == META_COMMAND))
  			{
! 				if (!doCustom(st, &result->conn_time))
  					remains--;	/* I've aborted */
  			}
  
--- 2313,2319 ----
  			if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask)
  							|| commands[st->state]->type == META_COMMAND))
  			{
! 				if (!doCustom(st, &result->conn_time, log_file))
  					remains--;	/* I've aborted */
  			}
  
*************** done:
*** 2326,2331 ****
--- 2335,2342 ----
  		result->xacts += state[i].cnt;
  	INSTR_TIME_SET_CURRENT(end);
  	INSTR_TIME_ACCUM_DIFF(result->conn_time, end, start);
+ 	if (log_file)
+ 		fclose(log_file);
  	return result;
  }
  
diff -cprN head/doc/src/sgml/pgbench.sgml work/doc/src/sgml/pgbench.sgml
*** head/doc/src/sgml/pgbench.sgml	2009-12-16 09:17:22.613136000 +0900
--- work/doc/src/sgml/pgbench.sgml	2010-03-16 11:59:11.389926000 +0900
*************** END;
*** 551,556 ****
--- 551,564 ----
     taken by each transaction to a logfile.  The logfile will be named
     <filename>pgbench_log.<replaceable>nnn</></filename>, where
     <replaceable>nnn</> is the PID of the pgbench process.
+    If <literal>-j</> option is 2 or higher, each worker threads use a
+    dedicated log file. Those log files except the first thread will be named
+    <filename>pgbench_log.<replaceable>nnn</>.<replaceable>mmm</></filename>,
+    where <replaceable>mmm</> is a sequential number of the threads starting
+    with 1.
+   </para>
+ 
+   <para>
     The format of the log is:
  
     <programlisting>
#10Greg Smith
greg@2ndquadrant.com
In reply to: Takahiro Itagaki (#9)
1 attachment(s)
Re: Ragged latency log data in multi-threaded pgbench

Takahiro Itagaki wrote:

The log filenames are "pgbench_log.<main-process-id>.<thread-serial-number>"
for each thread, but the first thread (including single-threaded) still uses
"pgbench_log.<main-process-id>" for the name because of compatibility.

Attached is an updated version that I think is ready to commit. Only
changes are docs--I rewrote those to improve the wording some. The code
looked and tested fine to me. I just added support for the new format
to pgbench-tools and am back to happily running large batches of tests
using it again.

I confirmed a few things:

-On my CentOS system, the original problem is masked if you have
"--enable-thread-safety" on; the multi-threaded output shows up without
any broken lines into the single file. As I suspected it's only the
multi-process implementation that shows the issue here. Since Tom
points out that's luck rather than something that should be relied upon,
I don't think that actually changes what to do here, it just explains
why this wasn't obvious in earlier testing--normally I have thread
safety on nowadays.

-Patch corrects the problem. I took a build without thread safety on,
demonstrated the issue with its pgbench. Apply the patch, rebuild just
pgbench, run again; new multiple log files have no issue.

-It's easy to convert existing scripts to utilize the new multiple log
format. Right now the current idiom you're forced into using when
running pgbench scripts is to track the PID it's run as, then use
something like:

mv pgbench_log.${PID} pgbench.log

To convert to a stable filename for later processing. Now you just use
something like this instead:

cat pgbench_log.${PID}* > pgbench.log
rm -f pgbench_log.${PID}*

And that works fine.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us

Attachments:

pgbench_log_20100318.patchtext/x-patch; name=pgbench_log_20100318.patchDownload
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 0019db4..28a8c84 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -131,11 +131,9 @@ int			fillfactor = 100;
 #define ntellers	10
 #define naccounts	100000
 
-FILE	   *LOGFILE = NULL;
-
 bool		use_log;			/* log transaction latencies to a file */
-
-int			is_connect;			/* establish connection for each transaction */
+bool		is_connect;			/* establish connection for each transaction */
+int			main_pid;			/* main process id used in log filename */
 
 char	   *pghost = "";
 char	   *pgport = "";
@@ -183,6 +181,7 @@ typedef struct
  */
 typedef struct
 {
+	int			tid;			/* thread id */
 	pthread_t	thread;			/* thread handle */
 	CState	   *state;			/* array of CState */
 	int			nstate;			/* length of state[] */
@@ -741,7 +740,7 @@ clientDone(CState *st, bool ok)
 
 /* return false iff client should be disconnected */
 static bool
-doCustom(CState *st, instr_time *conn_time)
+doCustom(CState *st, instr_time *conn_time, FILE *log_file)
 {
 	PGresult   *res;
 	Command   **commands;
@@ -778,7 +777,7 @@ top:
 		/*
 		 * transaction finished: record the time it took in the log
 		 */
-		if (use_log && commands[st->state + 1] == NULL)
+		if (log_file && commands[st->state + 1] == NULL)
 		{
 			instr_time	now;
 			instr_time	diff;
@@ -791,12 +790,12 @@ top:
 
 #ifndef WIN32
 			/* This is more than we really ought to know about instr_time */
-			fprintf(LOGFILE, "%d %d %.0f %d %ld %ld\n",
+			fprintf(log_file, "%d %d %.0f %d %ld %ld\n",
 					st->id, st->cnt, usec, st->use_file,
 					(long) now.tv_sec, (long) now.tv_usec);
 #else
 			/* On Windows, instr_time doesn't provide a timestamp anyway */
-			fprintf(LOGFILE, "%d %d %.0f %d 0 0\n",
+			fprintf(log_file, "%d %d %.0f %d 0 0\n",
 					st->id, st->cnt, usec, st->use_file);
 #endif
 		}
@@ -857,7 +856,7 @@ top:
 		INSTR_TIME_ACCUM_DIFF(*conn_time, end, start);
 	}
 
-	if (use_log && st->state == 0)
+	if (log_file && st->state == 0)
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
 	if (commands[st->state]->type == SQL_COMMAND)
@@ -1833,7 +1832,7 @@ main(int argc, char **argv)
 				}
 				break;
 			case 'C':
-				is_connect = 1;
+				is_connect = true;
 				break;
 			case 's':
 				scale_given = true;
@@ -1955,6 +1954,12 @@ main(int argc, char **argv)
 		exit(1);
 	}
 
+	/*
+	 * save main process id in the global variable because process id will be
+	 * changed after fork.
+	 */
+	main_pid = (int) getpid();
+
 	if (nclients > 1)
 	{
 		state = (CState *) realloc(state, sizeof(CState) * nclients);
@@ -1980,20 +1985,6 @@ main(int argc, char **argv)
 		}
 	}
 
-	if (use_log)
-	{
-		char		logpath[64];
-
-		snprintf(logpath, 64, "pgbench_log.%d", (int) getpid());
-		LOGFILE = fopen(logpath, "w");
-
-		if (LOGFILE == NULL)
-		{
-			fprintf(stderr, "Couldn't open logfile \"%s\": %s", logpath, strerror(errno));
-			exit(1);
-		}
-	}
-
 	if (debug)
 	{
 		if (duration <= 0)
@@ -2111,6 +2102,7 @@ main(int argc, char **argv)
 	threads = (TState *) malloc(sizeof(TState) * nthreads);
 	for (i = 0; i < nthreads; i++)
 	{
+		threads[i].tid = i;
 		threads[i].state = &state[nclients / nthreads * i];
 		threads[i].nstate = nclients / nthreads;
 		INSTR_TIME_SET_CURRENT(threads[i].start_time);
@@ -2159,8 +2151,6 @@ main(int argc, char **argv)
 	INSTR_TIME_SET_CURRENT(total_time);
 	INSTR_TIME_SUBTRACT(total_time, start_time);
 	printResults(ttype, total_xacts, nclients, nthreads, total_time, conn_total_time);
-	if (LOGFILE)
-		fclose(LOGFILE);
 
 	return 0;
 }
@@ -2171,6 +2161,7 @@ threadRun(void *arg)
 	TState	   *thread = (TState *) arg;
 	CState	   *state = thread->state;
 	TResult    *result;
+	FILE	   *log_file = NULL;		/* per-thread log file */
 	instr_time	start,
 				end;
 	int			nstate = thread->nstate;
@@ -2180,7 +2171,25 @@ threadRun(void *arg)
 	result = malloc(sizeof(TResult));
 	INSTR_TIME_SET_ZERO(result->conn_time);
 
-	if (is_connect == 0)
+	/* open log file if requested */
+	if (use_log)
+	{
+		char		logpath[64];
+
+		if (thread->tid == 0)
+			snprintf(logpath, 64, "pgbench_log.%d", main_pid);
+		else
+			snprintf(logpath, 64, "pgbench_log.%d.%d", main_pid, thread->tid);
+		log_file = fopen(logpath, "w");
+
+		if (log_file == NULL)
+		{
+			fprintf(stderr, "Couldn't open logfile \"%s\": %s", logpath, strerror(errno));
+			goto done;
+		}
+	}
+
+	if (!is_connect)
 	{
 		/* make connections to the database */
 		for (i = 0; i < nstate; i++)
@@ -2202,7 +2211,7 @@ threadRun(void *arg)
 		int			prev_ecnt = st->ecnt;
 
 		st->use_file = getrand(0, num_files - 1);
-		if (!doCustom(st, &result->conn_time))
+		if (!doCustom(st, &result->conn_time, log_file))
 			remains--;			/* I've aborted */
 
 		if (st->ecnt > prev_ecnt && commands[st->state]->type == META_COMMAND)
@@ -2304,7 +2313,7 @@ threadRun(void *arg)
 			if (st->con && (FD_ISSET(PQsocket(st->con), &input_mask)
 							|| commands[st->state]->type == META_COMMAND))
 			{
-				if (!doCustom(st, &result->conn_time))
+				if (!doCustom(st, &result->conn_time, log_file))
 					remains--;	/* I've aborted */
 			}
 
@@ -2326,6 +2335,8 @@ done:
 		result->xacts += state[i].cnt;
 	INSTR_TIME_SET_CURRENT(end);
 	INSTR_TIME_ACCUM_DIFF(result->conn_time, end, start);
+	if (log_file)
+		fclose(log_file);
 	return result;
 }
 
diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml
index 72fb0e5..a141395 100644
--- a/doc/src/sgml/pgbench.sgml
+++ b/doc/src/sgml/pgbench.sgml
@@ -551,6 +551,16 @@ END;
    taken by each transaction to a logfile.  The logfile will be named
    <filename>pgbench_log.<replaceable>nnn</></filename>, where
    <replaceable>nnn</> is the PID of the pgbench process.
+   If the <literal>-j</> option is 2 or higher, creating multiple worker
+   threads, each will have its own log file. The first worker will use the
+   the same name for its log file as in the standard single worker case.  
+   The additional log files for the other workers will be named
+   <filename>pgbench_log.<replaceable>nnn</>.<replaceable>mmm</></filename>,
+   where <replaceable>mmm</> is a sequential number for each worker starting
+   with 1.
+  </para>
+
+  <para>
    The format of the log is:
 
    <programlisting>
#11Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Greg Smith (#10)
Re: Ragged latency log data in multi-threaded pgbench

Greg Smith <greg@2ndquadrant.com> wrote:

Attached is an updated version that I think is ready to commit. Only
changes are docs--I rewrote those to improve the wording some.

Thanks for the correction. Applied.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#12Greg Smith
greg@2ndquadrant.com
In reply to: Takahiro Itagaki (#11)
Re: Ragged latency log data in multi-threaded pgbench

Takahiro Itagaki wrote:

Greg Smith <greg@2ndquadrant.com> wrote:

Attached is an updated version that I think is ready to commit. Only
changes are docs--I rewrote those to improve the wording some.

Thanks for the correction. Applied.

By the way: the pgbench.sgml that you committed looks like it passed
through a system that added a CR to every line in it. Probably not the
way you intended to commit that.

So far I've done over 40 hours of pgbench runtime worth of testing (>500
runs) using the patched version without any issues, the code itself
continues to act fine.

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us

#13Takahiro Itagaki
itagaki.takahiro@oss.ntt.co.jp
In reply to: Greg Smith (#12)
Re: Ragged latency log data in multi-threaded pgbench

Greg Smith <greg@2ndquadrant.com> wrote:

By the way: the pgbench.sgml that you committed looks like it passed
through a system that added a CR to every line in it. Probably not the
way you intended to commit that.

Oops, fixed. Thanks.

Regards,
---
Takahiro Itagaki
NTT Open Source Software Center

#14Magnus Hagander
magnus@hagander.net
In reply to: Takahiro Itagaki (#13)
Re: Ragged latency log data in multi-threaded pgbench

On Tue, Mar 23, 2010 at 05:10, Takahiro Itagaki
<itagaki.takahiro@oss.ntt.co.jp> wrote:

Greg Smith <greg@2ndquadrant.com> wrote:

By the way: the pgbench.sgml that you committed looks like it passed
through a system that added a CR to every line in it.  Probably not the
way you intended to commit that.

Oops, fixed. Thanks.

My guess is that this happened because you committed from Windows?

I learned the hard way that this is a bad idea. (Luckily I learned it
on other CVS projects, before I started committing to PostgreSQL).
There are settings to make it not do that, but they are not reliable.
I'd strongly suggest that you always just do a cvs diff on windows and
then use a staging machine running linux or bsd or something to apply
it through. And then you *always* run those patches through something
like "fromdos". It's a bunch of extra steps, but it's really the only
way to do it reliably.

If that's not at all what happened, then well, it's still good advice
I think, even if it doesn't apply :-)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/