Poorly-thought-out handling of double variables in pgbench

Started by Tom Laneover 9 years ago7 messages

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

While testing 7a622b273 I happened to notice this:

\set x greatest(3, 2, 4.9)
create table mytab (x numeric);
insert into mytab values(:x);

results in this table:

x
----------------------
4.900000000000000355
(1 row)

The reason for that is that the result of a "double" calculation
is coerced to text like this:

sprintf(res, "%.18e", result.u.dval);

apparently on the theory that this will result in being able to convert it
back to double with no loss of low-order bits. But of course the last two
or three digits of such output are pretty much garbage to the naked eye.
Then what gets interpolated as the variable value is something like
'4.900000000000000355e+00'.

I think this is a very poor decision; it's going to cause surprises and
probably bug reports. Moreover, if we were testing this behavior in the
buildfarm (which we are not, but only because the test coverage for
pgbench is so shoddy), we would be seeing failures, because those
low-order digits are going to be platform dependent.

The only value of doing it like this would be if people chained multiple
floating-point calculations in successive pgbench \set commands and needed
full precision to be retained from one \set to the next ... but how likely
is that?

A minimum-change fix would be to print only DBL_DIG digits here. A better
answer, perhaps, would be to store double-valued variables in double
format to begin with, coercing to text only when and if the value is
interpolated into a string. That's probably a bigger change than we
want to be putting in right now, though I'm a bit tempted to go try it.

Thoughts?

BTW, just to add insult to injury, the debug() function prints double
values with "%.f", which evidently had even less thought put into it.
That one should definitely be %g with DBL_DIG precision.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Fabien COELHO

coelho@cri.ensmp.fr

over 9 years ago

In reply to: Tom Lane (#1)

Re: Poorly-thought-out handling of double variables in pgbench

Hello Tom,

While testing 7a622b273 I happened to notice this:

\set x greatest(3, 2, 4.9)
create table mytab (x numeric);
insert into mytab values(:x);
x
----------------------
4.900000000000000355

The reason for that is that the result of a "double" calculation
is coerced to text like this:

sprintf(res, "%.18e", result.u.dval);

apparently on the theory that this will result in being able to convert it
back to double with no loss of low-order bits.

Yep. I did that for this very reason.

ISTM that the alternative was to reimplement the variable stuff to add &
manage types, but that induces significant changes, and such change are
likely to be rejected, or at least to raise long debates and questions,
make the patch harder to check... so I just avoided them.

But of course the last two or three digits of such output are pretty
much garbage to the naked eye.

Sure, it is binary to decimal conversion noise.

Then what gets interpolated as the variable value is something like
'4.900000000000000355e+00'.

I think this is a very poor decision; it's going to cause surprises and
probably bug reports.

Moreover, if we were testing this behavior in the buildfarm (which we
are not, but only because the test coverage for pgbench is so shoddy),

I think that "none" is the right word?

we would be seeing failures, because those low-order digits are going to
be platform dependent.

Possibly.

The only value of doing it like this would be if people chained multiple
floating-point calculations in successive pgbench \set commands and needed
full precision to be retained from one \set to the next ... but how likely
is that?

The constraint for me is that a stored double should not lose precision.
Any solution which achieves this goal is fine with me.

A minimum-change fix would be to print only DBL_DIG digits here.

Probably 15, but there is some rounding implied I think (?). I'm not that
sure of DBL_DIG+1, so I put 18 to be safer. I did not envision that
someone would store that in a NUMERIC:-) Your example would work fine with
a DOUBLE PRECISION attribute.

A better answer, perhaps, would be to store double-valued variables in
double format to begin with, coercing to text only when and if the value
is interpolated into a string.

Yep, but that was yet more changes for a limited benefit and would have
increase the probability that the patch would have been rejected.

That's probably a bigger change than we want to be putting in right now,
though I'm a bit tempted to go try it.

Thoughts?

My 0.02ï¿œ:

I definitely agree that the text variable solution is pretty ugly, but it
was the minimum change solution, and I do not have much time available.

I do not think that rounding values is desirable, on principle.

Now doubles are only really there because random_*() functions need one
double argument, otherwise only integers make much sense as keys to select
tuples in a performance bench. So this is not a key feature, just a side
effect of having more realistic non uniform random generators and
expressions. In an early submission I did not bother to include double
variables because I cannot see much actual use to them, and there were
implementation issues given how integer variables were managed.

Also I think that the above NUMERIC/DOUBLE case is a little contrived, so
I would wait for someone to actually complain with a good use case before
spending any significant time to change how variables are stored in
pgbench.

I would expect a long review process for such a patch if I submitted it,
because there are details and the benefit is quite limited.

BTW, just to add insult to injury, the debug() function prints double
values with "%.f", which evidently had even less thought put into it.

AFAICR the thought I put into it is that it is definitely useful to be
able to get a simple trace for debugging purpose. The precision of this
debug output is not very important, so I probably just put the simplest
possible format.

That one should definitely be %g with DBL_DIG precision.

That would be fine as well.

--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Fabien COELHO (#2)

1 attachment(s)

Re: Poorly-thought-out handling of double variables in pgbench

Fabien COELHO <coelho@cri.ensmp.fr> writes:

A better answer, perhaps, would be to store double-valued variables in
double format to begin with, coercing to text only when and if the value
is interpolated into a string.

Yep, but that was yet more changes for a limited benefit and would have
increase the probability that the patch would have been rejected.

That's probably a bigger change than we want to be putting in right now,
though I'm a bit tempted to go try it.

I definitely agree that the text variable solution is pretty ugly, but it
was the minimum change solution, and I do not have much time available.

Well, I felt like doing some minor hacking, so I went and adjusted the
code to work this way. I'm pretty happy with the result, what do you
think?

regards, tom lane

Attachments:

pgbench-real-numeric-variables.patchtext/x-diff; charset=us-ascii; name=pgbench-real-numeric-variables.patchDownload

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 2a9063a..a484165 100644
*** a/src/bin/pgbench/pgbench.c
--- b/src/bin/pgbench/pgbench.c
***************
*** 38,43 ****
--- 38,44 ----
  #include "portability/instr_time.h"
  
  #include <ctype.h>
+ #include <float.h>
  #include <limits.h>
  #include <math.h>
  #include <signal.h>
*************** const char *progname;
*** 185,195 ****
  
  volatile bool timer_exceeded = false;	/* flag from signal handler */
  
! /* variable definitions */
  typedef struct
  {
! 	char	   *name;			/* variable name */
! 	char	   *value;			/* its value */
  } Variable;
  
  #define MAX_SCRIPTS		128		/* max number of SQL scripts allowed */
--- 186,205 ----
  
  volatile bool timer_exceeded = false;	/* flag from signal handler */
  
! /*
!  * Variable definitions.  If a variable has a string value, "value" is that
!  * value, is_numeric is false, and num_value is undefined.  If the value is
!  * known to be numeric, is_numeric is true and num_value contains the value
!  * (in any permitted numeric variant).  In this case "value" contains the
!  * string equivalent of the number, if we've had occasion to compute that,
!  * or NULL if we haven't.
!  */
  typedef struct
  {
! 	char	   *name;			/* variable's name */
! 	char	   *value;			/* its value in string form, if known */
! 	bool		is_numeric;		/* is numeric value known? */
! 	PgBenchValue num_value;		/* variable's value in numeric form */
  } Variable;
  
  #define MAX_SCRIPTS		128		/* max number of SQL scripts allowed */
*************** typedef struct
*** 237,243 ****
  	bool		throttling;		/* whether nap is for throttling */
  	bool		is_throttled;	/* whether transaction throttling is done */
  	Variable   *variables;		/* array of variable definitions */
! 	int			nvariables;
  	int64		txn_scheduled;	/* scheduled start time of transaction (usec) */
  	instr_time	txn_begin;		/* used for measuring schedule lag times */
  	instr_time	stmt_begin;		/* used for measuring statement latencies */
--- 247,254 ----
  	bool		throttling;		/* whether nap is for throttling */
  	bool		is_throttled;	/* whether transaction throttling is done */
  	Variable   *variables;		/* array of variable definitions */
! 	int			nvariables;		/* number of variables */
! 	bool		vars_sorted;	/* are variables sorted by name? */
  	int64		txn_scheduled;	/* scheduled start time of transaction (usec) */
  	instr_time	txn_begin;		/* used for measuring schedule lag times */
  	instr_time	stmt_begin;		/* used for measuring statement latencies */
*************** static const BuiltinScript builtin_scrip
*** 363,368 ****
--- 374,381 ----
  
  
  /* Function prototypes */
+ static void setIntValue(PgBenchValue *pv, int64 ival);
+ static void setDoubleValue(PgBenchValue *pv, double dval);
  static bool evaluateExpr(TState *, CState *, PgBenchExpr *, PgBenchValue *);
  static void doLog(TState *thread, CState *st, instr_time *now,
  	  StatsData *agg, bool skipped, double latency, double lag);
*************** discard_response(CState *state)
*** 836,868 ****
  	} while (res);
  }
  
  static int
! compareVariables(const void *v1, const void *v2)
  {
  	return strcmp(((const Variable *) v1)->name,
  				  ((const Variable *) v2)->name);
  }
  
! static char *
! getVariable(CState *st, char *name)
  {
! 	Variable	key,
! 			   *var;
  
  	/* On some versions of Solaris, bsearch of zero items dumps core */
  	if (st->nvariables <= 0)
  		return NULL;
  
  	key.name = name;
! 	var = (Variable *) bsearch((void *) &key,
! 							   (void *) st->variables,
! 							   st->nvariables,
! 							   sizeof(Variable),
! 							   compareVariables);
! 	if (var != NULL)
! 		return var->value;
  	else
! 		return NULL;
  }
  
  /* check whether the name consists of alphabets, numerals and underscores. */
--- 849,945 ----
  	} while (res);
  }
  
+ /* qsort comparator for Variable array */
  static int
! compareVariableNames(const void *v1, const void *v2)
  {
  	return strcmp(((const Variable *) v1)->name,
  				  ((const Variable *) v2)->name);
  }
  
! /* Locate a variable by name; returns NULL if unknown */
! static Variable *
! lookupVariable(CState *st, char *name)
  {
! 	Variable	key;
  
  	/* On some versions of Solaris, bsearch of zero items dumps core */
  	if (st->nvariables <= 0)
  		return NULL;
  
+ 	/* Sort if we have to */
+ 	if (!st->vars_sorted)
+ 	{
+ 		qsort((void *) st->variables, st->nvariables, sizeof(Variable),
+ 			  compareVariableNames);
+ 		st->vars_sorted = true;
+ 	}
+ 
+ 	/* Now we can search */
  	key.name = name;
! 	return (Variable *) bsearch((void *) &key,
! 								(void *) st->variables,
! 								st->nvariables,
! 								sizeof(Variable),
! 								compareVariableNames);
! }
! 
! /* Get the value of a variable, in string form; returns NULL if unknown */
! static char *
! getVariable(CState *st, char *name)
! {
! 	Variable   *var;
! 	char		stringform[64];
! 
! 	var = lookupVariable(st, name);
! 	if (var == NULL)
! 		return NULL;			/* not found */
! 
! 	if (var->value)
! 		return var->value;		/* we have it in string form */
! 
! 	/* We need to produce a string equivalent of the numeric value */
! 	Assert(var->is_numeric);
! 	if (var->num_value.type == PGBT_INT)
! 		snprintf(stringform, sizeof(stringform),
! 				 INT64_FORMAT, var->num_value.u.ival);
  	else
! 	{
! 		Assert(var->num_value.type == PGBT_DOUBLE);
! 		snprintf(stringform, sizeof(stringform),
! 				 "%.*g", DBL_DIG, var->num_value.u.dval);
! 	}
! 	var->value = pg_strdup(stringform);
! 	return var->value;
! }
! 
! /* Try to convert variable to numeric form; return false on failure */
! static bool
! makeVariableNumeric(Variable *var)
! {
! 	if (var->is_numeric)
! 		return true;			/* no work */
! 
! 	if (is_an_int(var->value))
! 	{
! 		setIntValue(&var->num_value, strtoint64(var->value));
! 		var->is_numeric = true;
! 	}
! 	else /* type should be double */
! 	{
! 		double dv;
! 
! 		if (sscanf(var->value, "%lf", &dv) != 1)
! 		{
! 			fprintf(stderr,
! 					"malformed variable \"%s\" value: \"%s\"\n",
! 					var->name, var->value);
! 			return false;
! 		}
! 		setDoubleValue(&var->num_value, dv);
! 		var->is_numeric = true;
! 	}
! 	return true;
  }
  
  /* check whether the name consists of alphabets, numerals and underscores. */
*************** isLegalVariableName(const char *name)
*** 877,902 ****
  			return false;
  	}
  
! 	return true;
  }
  
! static int
! putVariable(CState *st, const char *context, char *name, char *value)
  {
! 	Variable	key,
! 			   *var;
! 
! 	key.name = name;
! 	/* On some versions of Solaris, bsearch of zero items dumps core */
! 	if (st->nvariables > 0)
! 		var = (Variable *) bsearch((void *) &key,
! 								   (void *) st->variables,
! 								   st->nvariables,
! 								   sizeof(Variable),
! 								   compareVariables);
! 	else
! 		var = NULL;
  
  	if (var == NULL)
  	{
  		Variable   *newvars;
--- 954,973 ----
  			return false;
  	}
  
! 	return (i > 0);				/* must be non-empty */
  }
  
! /*
!  * Lookup a variable by name, creating it if need be.
!  * Caller is expected to assign a value to the variable.
!  * Returns NULL on failure (bad name).
!  */
! static Variable *
! lookupCreateVariable(CState *st, const char *context, char *name)
  {
! 	Variable   *var;
  
+ 	var = lookupVariable(st, name);
  	if (var == NULL)
  	{
  		Variable   *newvars;
*************** putVariable(CState *st, const char *cont
*** 909,917 ****
  		{
  			fprintf(stderr, "%s: invalid variable name: \"%s\"\n",
  					context, name);
! 			return false;
  		}
  
  		if (st->variables)
  			newvars = (Variable *) pg_realloc(st->variables,
  									(st->nvariables + 1) * sizeof(Variable));
--- 980,989 ----
  		{
  			fprintf(stderr, "%s: invalid variable name: \"%s\"\n",
  					context, name);
! 			return NULL;
  		}
  
+ 		/* Create variable at the end of the array */
  		if (st->variables)
  			newvars = (Variable *) pg_realloc(st->variables,
  									(st->nvariables + 1) * sizeof(Variable));
*************** putVariable(CState *st, const char *cont
*** 923,949 ****
  		var = &newvars[st->nvariables];
  
  		var->name = pg_strdup(name);
! 		var->value = pg_strdup(value);
  
  		st->nvariables++;
! 
! 		qsort((void *) st->variables, st->nvariables, sizeof(Variable),
! 			  compareVariables);
  	}
- 	else
- 	{
- 		char	   *val;
  
! 		/* dup then free, in case value is pointing at this variable */
! 		val = pg_strdup(value);
  
  		free(var->value);
! 		var->value = val;
! 	}
  
  	return true;
  }
  
  static char *
  parseVariable(const char *sql, int *eaten)
  {
--- 995,1066 ----
  		var = &newvars[st->nvariables];
  
  		var->name = pg_strdup(name);
! 		var->value = NULL;
! 		/* caller is expected to initialize remaining fields */
  
  		st->nvariables++;
! 		/* we don't re-sort the array till we have to */
! 		st->vars_sorted = false;
  	}
  
! 	return var;
! }
  
+ /* Assign a string value to a variable, creating it if need be */
+ /* Returns false on failure (bad name) */
+ static bool
+ putVariable(CState *st, const char *context, char *name, const char *value)
+ {
+ 	Variable   *var;
+ 	char	   *val;
+ 
+ 	var = lookupCreateVariable(st, context, name);
+ 	if (!var)
+ 		return false;
+ 
+ 	/* dup then free, in case value is pointing at this variable */
+ 	val = pg_strdup(value);
+ 
+ 	if (var->value)
  		free(var->value);
! 	var->value = val;
! 	var->is_numeric = false;
! 
! 	return true;
! }
! 
! /* Assign a numeric value to a variable, creating it if need be */
! /* Returns false on failure (bad name) */
! static bool
! putVariableNumber(CState *st, const char *context, char *name,
! 				  const PgBenchValue *value)
! {
! 	Variable   *var;
! 
! 	var = lookupCreateVariable(st, context, name);
! 	if (!var)
! 		return false;
! 
! 	if (var->value)
! 		free(var->value);
! 	var->value = NULL;
! 	var->is_numeric = true;
! 	var->num_value = *value;
  
  	return true;
  }
  
+ /* Assign an integer value to a variable, creating it if need be */
+ /* Returns false on failure (bad name) */
+ static bool
+ putVariableInt(CState *st, const char *context, char *name, int64 value)
+ {
+ 	PgBenchValue val;
+ 
+ 	setIntValue(&val, value);
+ 	return putVariableNumber(st, context, name, &val);
+ }
+ 
  static char *
  parseVariable(const char *sql, int *eaten)
  {
*************** evalFunc(TState *thread, CState *st,
*** 1260,1266 ****
  				else
  				{
  					Assert(varg->type == PGBT_DOUBLE);
! 					fprintf(stderr, "double %f\n", varg->u.dval);
  				}
  
  				*retval = *varg;
--- 1377,1383 ----
  				else
  				{
  					Assert(varg->type == PGBT_DOUBLE);
! 					fprintf(stderr, "double %.*g\n", DBL_DIG, varg->u.dval);
  				}
  
  				*retval = *varg;
*************** evaluateExpr(TState *thread, CState *st,
*** 1454,1485 ****
  
  		case ENODE_VARIABLE:
  			{
! 				char	   *var;
  
! 				if ((var = getVariable(st, expr->u.variable.varname)) == NULL)
  				{
  					fprintf(stderr, "undefined variable \"%s\"\n",
  							expr->u.variable.varname);
  					return false;
  				}
  
! 				if (is_an_int(var))
! 				{
! 					setIntValue(retval, strtoint64(var));
! 				}
! 				else /* type should be double */
! 				{
! 					double dv;
! 					if (sscanf(var, "%lf", &dv) != 1)
! 					{
! 						fprintf(stderr,
! 								"malformed variable \"%s\" value: \"%s\"\n",
! 								expr->u.variable.varname, var);
! 						return false;
! 					}
! 					setDoubleValue(retval, dv);
! 				}
  
  				return true;
  			}
  
--- 1571,1589 ----
  
  		case ENODE_VARIABLE:
  			{
! 				Variable   *var;
  
! 				if ((var = lookupVariable(st, expr->u.variable.varname)) == NULL)
  				{
  					fprintf(stderr, "undefined variable \"%s\"\n",
  							expr->u.variable.varname);
  					return false;
  				}
  
! 				if (!makeVariableNumeric(var))
! 					return false;
  
+ 				*retval = var->num_value;
  				return true;
  			}
  
*************** runShellCommand(CState *st, char *variab
*** 1596,1603 ****
  				argv[0], res);
  		return false;
  	}
! 	snprintf(res, sizeof(res), "%d", retval);
! 	if (!putVariable(st, "setshell", variable, res))
  		return false;
  
  #ifdef DEBUG
--- 1700,1706 ----
  				argv[0], res);
  		return false;
  	}
! 	if (!putVariableInt(st, "setshell", variable, retval))
  		return false;
  
  #ifdef DEBUG
*************** top:
*** 1973,1979 ****
  
  		if (pg_strcasecmp(argv[0], "set") == 0)
  		{
- 			char		res[64];
  			PgBenchExpr *expr = commands[st->state]->expr;
  			PgBenchValue	result;
  
--- 2076,2081 ----
*************** top:
*** 1983,1997 ****
  				return true;
  			}
  
! 			if (result.type == PGBT_INT)
! 				sprintf(res, INT64_FORMAT, result.u.ival);
! 			else
! 			{
! 				Assert(result.type == PGBT_DOUBLE);
! 				sprintf(res, "%.18e", result.u.dval);
! 			}
! 
! 			if (!putVariable(st, argv[0], argv[1], res))
  			{
  				st->ecnt++;
  				return true;
--- 2085,2091 ----
  				return true;
  			}
  
! 			if (!putVariableNumber(st, argv[0], argv[1], &result))
  			{
  				st->ecnt++;
  				return true;
*************** main(int argc, char **argv)
*** 3325,3332 ****
  	PGresult   *res;
  	char	   *env;
  
- 	char		val[64];
- 
  	progname = get_progname(argv[0]);
  
  	if (argc > 1)
--- 3419,3424 ----
*************** main(int argc, char **argv)
*** 3767,3774 ****
  			state[i].id = i;
  			for (j = 0; j < state[0].nvariables; j++)
  			{
! 				if (!putVariable(&state[i], "startup", state[0].variables[j].name, state[0].variables[j].value))
! 					exit(1);
  			}
  		}
  	}
--- 3859,3878 ----
  			state[i].id = i;
  			for (j = 0; j < state[0].nvariables; j++)
  			{
! 				Variable   *var = &state[0].variables[j];
! 
! 				if (var->is_numeric)
! 				{
! 					if (!putVariableNumber(&state[i], "startup",
! 									 var->name, &var->num_value))
! 						exit(1);
! 				}
! 				else
! 				{
! 					if (!putVariable(&state[i], "startup",
! 									 var->name, var->value))
! 						exit(1);
! 				}
  			}
  		}
  	}
*************** main(int argc, char **argv)
*** 3834,3845 ****
  	 * :scale variables normally get -s or database scale, but don't override
  	 * an explicit -D switch
  	 */
! 	if (getVariable(&state[0], "scale") == NULL)
  	{
- 		snprintf(val, sizeof(val), "%d", scale);
  		for (i = 0; i < nclients; i++)
  		{
! 			if (!putVariable(&state[i], "startup", "scale", val))
  				exit(1);
  		}
  	}
--- 3938,3948 ----
  	 * :scale variables normally get -s or database scale, but don't override
  	 * an explicit -D switch
  	 */
! 	if (lookupVariable(&state[0], "scale") == NULL)
  	{
  		for (i = 0; i < nclients; i++)
  		{
! 			if (!putVariableInt(&state[i], "startup", "scale", scale))
  				exit(1);
  		}
  	}
*************** main(int argc, char **argv)
*** 3848,3859 ****
  	 * Define a :client_id variable that is unique per connection. But don't
  	 * override an explicit -D switch.
  	 */
! 	if (getVariable(&state[0], "client_id") == NULL)
  	{
  		for (i = 0; i < nclients; i++)
  		{
! 			snprintf(val, sizeof(val), "%d", i);
! 			if (!putVariable(&state[i], "startup", "client_id", val))
  				exit(1);
  		}
  	}
--- 3951,3961 ----
  	 * Define a :client_id variable that is unique per connection. But don't
  	 * override an explicit -D switch.
  	 */
! 	if (lookupVariable(&state[0], "client_id") == NULL)
  	{
  		for (i = 0; i < nclients; i++)
  		{
! 			if (!putVariableInt(&state[i], "startup", "client_id", i))
  				exit(1);
  		}
  	}

Fabien COELHO

coelho@cri.ensmp.fr

over 9 years ago

In reply to: Tom Lane (#3)

1 attachment(s)

Re: Poorly-thought-out handling of double variables in pgbench

Hello Tom,

That's probably a bigger change than we want to be putting in right now,
though I'm a bit tempted to go try it.

I definitely agree that the text variable solution is pretty ugly, but it
was the minimum change solution, and I do not have much time available.

Well, I felt like doing some minor hacking, so I went and adjusted the
code to work this way. I'm pretty happy with the result, what do you
think?

This is a definite improvement.

I like the lazyness between string & numeric forms, and for sorting, that
is what was needed doing to have something clean.

Applied on head, it works for me.

While testing the patch I found a minor preexisting (mine...) bug: when
string-scanning doubles, whether the whole string is consumed or not is
not checked. This means that -D x=0one is interpreted as double 0.

I came up with the attached check, but maybe there is a cleaner way to do
that.

--
Fabien.

Attachments:

pgbench-scan-double.patchtext/x-diff; name=pgbench-scan-double.patchDownload

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index a484165..9673640 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -928,8 +928,10 @@ makeVariableNumeric(Variable *var)
 	else /* type should be double */
 	{
 		double dv;
+		int consumed;
 
-		if (sscanf(var->value, "%lf", &dv) != 1)
+		if (sscanf(var->value, "%lf%n", &dv, &consumed) != 1 ||
+			consumed != strlen(var->value))
 		{
 			fprintf(stderr,
 					"malformed variable \"%s\" value: \"%s\"\n",

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Fabien COELHO (#4)

Re: Poorly-thought-out handling of double variables in pgbench

Fabien COELHO <coelho@cri.ensmp.fr> writes:

This is a definite improvement.

I like the lazyness between string & numeric forms, and for sorting, that
is what was needed doing to have something clean.

Applied on head, it works for me.

OK, pushed.

While testing the patch I found a minor preexisting (mine...) bug: when
string-scanning doubles, whether the whole string is consumed or not is
not checked. This means that -D x=0one is interpreted as double 0.

I came up with the attached check, but maybe there is a cleaner way to do
that.

I like the way the zic code does it:

else /* type should be double */
{
double dv;

! 		if (sscanf(var->value, "%lf", &dv) != 1)
  		{
  			fprintf(stderr,
  					"malformed variable \"%s\" value: \"%s\"\n",
--- 928,936 ----
  	else /* type should be double */
  	{
  		double dv;
+ 		char xs;

! if (sscanf(var->value, "%lf%c", &dv, &xs) != 1)
{
fprintf(stderr,
"malformed variable \"%s\" value: \"%s\"\n",

This relies on the idea that if there is any trailing junk, one character
will get assigned into "xs" and then sscanf will say it filled two fields.
Otherwise, it'll report filling only one field (or none, if there's
no number either). This requires only a minimal amount of code and no
extra strlen() calculations.

Neither way allows for trailing whitespace, though. I don't see that
that's important here.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Fabien COELHO

coelho@cri.ensmp.fr

over 9 years ago

In reply to: Tom Lane (#5)

Re: Poorly-thought-out handling of double variables in pgbench

While testing the patch I found a minor preexisting (mine...) bug: when
string-scanning doubles, whether the whole string is consumed or not is
not checked. This means that -D x=0one is interpreted as double 0.

I came up with the attached check, but maybe there is a cleaner way to do
that.

I like the way the zic code does it:

+ char xs;
! if (sscanf(var->value, "%lf%c", &dv, &xs) != 1)

Indeed... I cannot say that it is cleaner such, but at least it is short
and it avoids the strlen call.

Neither way allows for trailing whitespace, though. I don't see that
that's important here.

No, this is not an issue here for variables specified from the command
line. Consider pushing this fix?

--
Fabien.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Fabien COELHO (#6)

Re: Poorly-thought-out handling of double variables in pgbench

Fabien COELHO <coelho@cri.ensmp.fr> writes:

Neither way allows for trailing whitespace, though. I don't see that
that's important here.

No, this is not an issue here for variables specified from the command
line. Consider pushing this fix?

Done already.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers