float output precision questions

Started by Pedro M. Ferreiraover 23 years ago33 messageshackers
Jump to latest
#1Pedro M. Ferreira
pfrazao@ualg.pt

Hi All,

I wrote a Matlab interface to PostgreSQL based on libpq. It is working
fine, supports multiple connections, supports all data types and arrays.
It is a C program to do the interface and some Matlab wrapper functions
around it to do the job at application level.

Matlab has an ODBC toolbox but we dont want to buy it since the
interface does the job and we have been using PostgreSQL for long time.
We want to use PostgreSQl to store numeric data from simulation,
computation and data acquisition sources. Basically a big bunch of float
numbers.

There is still one problem, regarding float output formatting in querys
and dumps, which essential for this type of application.

If I have a float8 field (named real8 below) in a table and insert the
following,

insert into test(real8) values (4503599627370496);
INSERT 21192 1

A select produces,

select real8 from test;
real8
---------------------
4.5035996273705e+15
(1 row)

This is the string I would get from libpq's PQgetvalue(). PQftype()
correctly returns float8. pg_dump produces the same result. If I convert
PQgetvalue() to a C double I wont get the same number I inserted. If I
do a restore from a dump I also wont have the same number. I can see
that the original number is correctly stored because,

select to_char(real8,'9999999999999999999.99999') from test;
to_char
----------------------
4503599627370496
(1 row)

This way PQftype wont tell this is a float8.

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

Matlab has a toolbox fucntion, claiming maximum precision, to convert
from its double type (PostgreSQL float8) to string which does a
sprintf('%25.18g',number).

Would it be possible to have a a parameter which could be changed by a
SET command, in order to control output precision ? I searched the docs
but could not find this.

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#2Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Pedro M. Ferreira (#1)
Re: float output precision questions

Sorry. I forgot to thank for any help from all of you in the previous
message. Thanks! :)

Just one more thing:

I now I can go to the source and change the code which converts floats
to strings, and have my problem solved. But this wont be general. Others
might need this kind of application.

I could post this interface to postgres interfaces site. Do you think
this is ok ?

Thanks again !

Pedro M. Ferreira

Pedro Miguel Frazao Fernandes Ferreira wrote:

Hi All,

I wrote a Matlab interface to PostgreSQL based on libpq. It is working
fine, supports multiple connections, supports all data types and arrays.
It is a C program to do the interface and some Matlab wrapper functions
around it to do the job at application level.

Matlab has an ODBC toolbox but we dont want to buy it since the
interface does the job and we have been using PostgreSQL for long time.
We want to use PostgreSQl to store numeric data from simulation,
computation and data acquisition sources. Basically a big bunch of float
numbers.

There is still one problem, regarding float output formatting in querys
and dumps, which essential for this type of application.

If I have a float8 field (named real8 below) in a table and insert the
following,

insert into test(real8) values (4503599627370496);
INSERT 21192 1

A select produces,

select real8 from test;
real8
---------------------
4.5035996273705e+15
(1 row)

This is the string I would get from libpq's PQgetvalue(). PQftype()
correctly returns float8. pg_dump produces the same result. If I convert
PQgetvalue() to a C double I wont get the same number I inserted. If I
do a restore from a dump I also wont have the same number. I can see
that the original number is correctly stored because,

select to_char(real8,'9999999999999999999.99999') from test;
to_char
----------------------
4503599627370496
(1 row)

This way PQftype wont tell this is a float8.

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

Matlab has a toolbox fucntion, claiming maximum precision, to convert
from its double type (PostgreSQL float8) to string which does a
sprintf('%25.18g',number).

Would it be possible to have a a parameter which could be changed by a
SET command, in order to control output precision ? I searched the docs
but could not find this.

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#3Bruce Momjian
bruce@momjian.us
In reply to: Pedro M. Ferreira (#1)
Re: float output precision questions

TODO has:

o Add SET REAL_FORMAT and SET DOUBLE_PRECISION_FORMAT
using printf args

so we have not implemented it yet.

---------------------------------------------------------------------------

Pedro Miguel Frazao Fernandes Ferreira wrote:

Hi All,

I wrote a Matlab interface to PostgreSQL based on libpq. It is working
fine, supports multiple connections, supports all data types and arrays.
It is a C program to do the interface and some Matlab wrapper functions
around it to do the job at application level.

Matlab has an ODBC toolbox but we dont want to buy it since the
interface does the job and we have been using PostgreSQL for long time.
We want to use PostgreSQl to store numeric data from simulation,
computation and data acquisition sources. Basically a big bunch of float
numbers.

There is still one problem, regarding float output formatting in querys
and dumps, which essential for this type of application.

If I have a float8 field (named real8 below) in a table and insert the
following,

insert into test(real8) values (4503599627370496);
INSERT 21192 1

A select produces,

select real8 from test;
real8
---------------------
4.5035996273705e+15
(1 row)

This is the string I would get from libpq's PQgetvalue(). PQftype()
correctly returns float8. pg_dump produces the same result. If I convert
PQgetvalue() to a C double I wont get the same number I inserted. If I
do a restore from a dump I also wont have the same number. I can see
that the original number is correctly stored because,

select to_char(real8,'9999999999999999999.99999') from test;
to_char
----------------------
4503599627370496
(1 row)

This way PQftype wont tell this is a float8.

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

Matlab has a toolbox fucntion, claiming maximum precision, to convert
from its double type (PostgreSQL float8) to string which does a
sprintf('%25.18g',number).

Would it be possible to have a a parameter which could be changed by a
SET command, in order to control output precision ? I searched the docs
but could not find this.

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#4Peter Eisentraut
peter_e@gmx.net
In reply to: Pedro M. Ferreira (#1)
Re: float output precision questions

Pedro Miguel Frazao Fernandes Ferreira writes:

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

There isn't a way right now, but it's planned to be able to dump
floating-point numbers in some binary form (like printf("%A")) to be able
to restore them exactly. Not sure how that would satisfy the needs of
client interfaces, though.

Matlab has a toolbox fucntion, claiming maximum precision, to convert
from its double type (PostgreSQL float8) to string which does a
sprintf('%25.18g',number).

Do we have some mathematical guarantee that this is sufficient and
necessary? If so, then it might do.

--
Peter Eisentraut peter_e@gmx.net

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: float output precision questions

Bruce Momjian <pgman@candle.pha.pa.us> writes:

TODO has:
o Add SET REAL_FORMAT and SET DOUBLE_PRECISION_FORMAT
using printf args
so we have not implemented it yet.

IIRC, the last time it was discussed there was disagreement about how
it should work; check the pghackers archives for details. The feature
probably won't go anywhere until those issues are resolved.

regards, tom lane

#6Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Peter Eisentraut (#4)
Re: float output precision questions

On Tue, 29 Oct 2002, Peter Eisentraut wrote:

Pedro Miguel Frazao Fernandes Ferreira writes:

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

How do you define maximum precision and "same"? With simple test programs
in C, using two digits more than DBL_DIG for printf specifier, it's easy
for me to find numbers that "change" decimal string representation in the
decimal representation -> double -> decimal representation conversion(*).
The final double you get from the second conversion should be the same as
the first, but is that what you need or do you need a stronger guarantee
than that?

#7Bruno Wolff III
bruno@wolff.to
In reply to: Peter Eisentraut (#4)
Re: float output precision questions

On Tue, Oct 29, 2002 at 23:19:05 +0100,
Peter Eisentraut <peter_e@gmx.net> wrote:

There isn't a way right now, but it's planned to be able to dump
floating-point numbers in some binary form (like printf("%A")) to be able
to restore them exactly. Not sure how that would satisfy the needs of
client interfaces, though.

Why not print it as a floating binary number instead of a floating decimal
number? I would think that would give you better portability than a system
specific binary representation.

#8Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Peter Eisentraut (#4)
Re: float output precision questions

Peter Eisentraut wrote:

Matlab has a toolbox fucntion, claiming maximum precision, to convert
from its double type (PostgreSQL float8) to string which does a
sprintf('%25.18g',number).

Do we have some mathematical guarantee that this is sufficient and
necessary? If so, then it might do.

It is necessary if you want to do this type of (huge amount of number
storage) application:

[Some client] (insert) [PostgreSQL] (query) [Some client]
(double number a)-------->(float8 number)------->(double number b)

In order for a=b this is necessary. With current float8 output this is
not allways true.

Here is the help for that particular Matlab function:

NUM2MSTR Convert number to string in maximum precision.
S = NUM2MSTR(N) converts real numbers of input
matrix N to string output vector S, in
maximum precision.

See also NUM2STR.

If you want I can try to contact the guys who coded this function to know
if this is sufficient.

Thanks,
Pedro M. Ferreira
--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#9Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#6)
Re: float output precision questions

Stephan Szabo wrote:

On Tue, 29 Oct 2002, Peter Eisentraut wrote:

Pedro Miguel Frazao Fernandes Ferreira writes:

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

How do you define maximum precision and "same"? With simple test programs
in C, using two digits more than DBL_DIG for printf specifier, it's easy
for me to find numbers that "change" decimal string representation in the
decimal representation -> double -> decimal representation conversion(*).
The final double you get from the second conversion should be the same as
the first, but is that what you need or do you need a stronger guarantee
than that?

When I say "same" I am talking about having a number 'stored' in double
format in some client, inserting it in PostgreSQL float8 field and get
it to the client as it was before:

[Some client] (insert) [PostgreSQL] (query) [Some client]
(double number a)-------->(float8 number)------->(double number b)

"same" is so that a==b is true.
With current float8 output this is not allways true.

I believe this should allways be true for numbers which are originally
stored in double format.

Thanks,
Pedro M. Ferreira

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#10Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Pedro M. Ferreira (#1)
Re: float output precision questions

Bruno Wolff III wrote:

On Tue, Oct 29, 2002 at 23:19:05 +0100,
Peter Eisentraut <peter_e@gmx.net> wrote:

There isn't a way right now, but it's planned to be able to dump
floating-point numbers in some binary form (like printf("%A")) to be able
to restore them exactly. Not sure how that would satisfy the needs of
client interfaces, though.

Why not print it as a floating binary number instead of a floating decimal
number? I would think that would give you better portability than a system
specific binary representation.

Having a way to get the binary storage representation for float numbers
would be good and would suffice within the same float number standard.

Thanks,
Pedro M. Ferreira

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#11Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Pedro M. Ferreira (#9)
Re: float output precision questions

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

Stephan Szabo wrote:

On Tue, 29 Oct 2002, Peter Eisentraut wrote:

Pedro Miguel Frazao Fernandes Ferreira writes:

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

How do you define maximum precision and "same"? With simple test programs
in C, using two digits more than DBL_DIG for printf specifier, it's easy
for me to find numbers that "change" decimal string representation in the
decimal representation -> double -> decimal representation conversion(*).
The final double you get from the second conversion should be the same as
the first, but is that what you need or do you need a stronger guarantee
than that?

When I say "same" I am talking about having a number 'stored' in double
format in some client, inserting it in PostgreSQL float8 field and get
it to the client as it was before:

[Some client] (insert) [PostgreSQL] (query) [Some client]
(double number a)-------->(float8 number)------->(double number b)

"same" is so that a==b is true.
With current float8 output this is not allways true.

I believe this should allways be true for numbers which are originally
stored in double format.

The problem is that there are two competing needs here. One is the above,
the other other is that you get something that has the same decimal
representation (within the float's ability to store the number). Right now
the system does the latter since for most people, that seems to be the
guarantee they want.

This would probably make sense as an option, so why don't you look at the
past discussions and see if you can come up with a solution that keeps
everyone happy (and preferably implement it, but...) :)

#12Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#11)
Re: float output precision questions

Stephan Szabo wrote:

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

Stephan Szabo wrote:

On Tue, 29 Oct 2002, Peter Eisentraut wrote:

Pedro Miguel Frazao Fernandes Ferreira writes:

Is there a way to set query output precision to maximum precision ?
For the type of application I mentioned this is crucial. People want to
get the 'same' numbers, from querys or dumps, as they inserted them.

How do you define maximum precision and "same"? With simple test programs
in C, using two digits more than DBL_DIG for printf specifier, it's easy
for me to find numbers that "change" decimal string representation in the
decimal representation -> double -> decimal representation conversion(*).
The final double you get from the second conversion should be the same as
the first, but is that what you need or do you need a stronger guarantee
than that?

When I say "same" I am talking about having a number 'stored' in double
format in some client, inserting it in PostgreSQL float8 field and get
it to the client as it was before:

[Some client] (insert) [PostgreSQL] (query) [Some client]
(double number a)-------->(float8 number)------->(double number b)

"same" is so that a==b is true.
With current float8 output this is not allways true.

I believe this should allways be true for numbers which are originally
stored in double format.

The problem is that there are two competing needs here. One is the above,
the other other is that you get something that has the same decimal
representation (within the float's ability to store the number). Right now
the system does the latter since for most people, that seems to be the
guarantee they want.

Look at this example:

1.79769313486231571e+308 is the largest floating point number
representable by a C double in x86.

In C this is possible:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
double v;
char a[30];

v=1.79769313486231571e+308;

printf(" Stored double number: %25.18g\n",v);
sprintf(a,"%25.18g",v);
printf(" Converted to string: %s\n",a);
v=atof(a);
printf("Converted from string to double: %25.18g\n",v);
}

Using standard PostgreSQL query output it would not be possible to get
this number, which has representation as a double.

I fetched the PostgreSQL source from Debian, changed
src/backend/utils/adt/float.c to do sprintf(ascii, "%25.18g", num)
instead of sprintf(ascii, "%.*g", DBL_DIG, num), compiled and installed.
Now I can get the number as is.

I understand that if people insert a value of 1.1 in a double, they want
to get 1.1 without knowing that in fact the stored number is
1.10000000000000009. But do you understand that if some people insert,
for example, a value of 1.79769313486231571e+308 they dont want to get
1.79769313486232e+308 which does not compare equal (in Matlab or C) to
the first ? This is a bug.

This would probably make sense as an option, so why don't you look at the
past discussions and see if you can come up with a solution that keeps
everyone happy (and preferably implement it, but...) :)

but ???

I have a sugestion:

To have parameters, say DOUBLE_FORMAT and FLOAT_FORMAT, which could have
option values of 'SHORT' and 'LONG'.
Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, "%25.18g", num).

Other way would be to have number parameters to be used in the sprintf
calls, in place of 25 and 18, in the format string.

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Bruno Wolff III (#7)
Re: float output precision questions

Bruno Wolff III writes:

On Tue, Oct 29, 2002 at 23:19:05 +0100,
Peter Eisentraut <peter_e@gmx.net> wrote:

There isn't a way right now, but it's planned to be able to dump
floating-point numbers in some binary form (like printf("%A")) to be able
to restore them exactly. Not sure how that would satisfy the needs of
client interfaces, though.

Why not print it as a floating binary number instead of a floating decimal
number? I would think that would give you better portability than a system
specific binary representation.

The printf("%A") output is not system-specific.

--
Peter Eisentraut peter_e@gmx.net

#14Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Pedro M. Ferreira (#12)
Re: float output precision questions

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

In C this is possible:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
double v;
char a[30];

v=1.79769313486231571e+308;

printf(" Stored double number: %25.18g\n",v);
sprintf(a,"%25.18g",v);
printf(" Converted to string: %s\n",a);
v=atof(a);
printf("Converted from string to double: %25.18g\n",v);
}

AFAICT, this is not guaranteed to give you the same representation that
you're using in the initializer however.

Using standard PostgreSQL query output it would not be possible to get
this number, which has representation as a double.

I fetched the PostgreSQL source from Debian, changed
src/backend/utils/adt/float.c to do sprintf(ascii, "%25.18g", num)
instead of sprintf(ascii, "%.*g", DBL_DIG, num), compiled and installed.
Now I can get the number as is.

I understand that if people insert a value of 1.1 in a double, they want
to get 1.1 without knowing that in fact the stored number is
1.10000000000000009. But do you understand that if some people insert,
for example, a value of 1.79769313486231571e+308 they dont want to get
1.79769313486232e+308 which does not compare equal (in Matlab or C) to
the first ? This is a bug.

I disagree to some extent. I'm not sure it's meaningful to expect that
(what if the database and the client are on different architectures) in
general. In any case, you're effectively going from decimal
representation to double to decimal representation (the string you used
to insert it -> internal representation -> string used to output it) and
that's only guaranteed to be correct up to DBL_DIG digits as far as I can
tell. I think it'd be nice to have an option to get more digits for those
sorts of applications, however.

This would probably make sense as an option, so why don't you look at the
past discussions and see if you can come up with a solution that keeps
everyone happy (and preferably implement it, but...) :)

but ???

, but I realize that you might not be interested in doing such. (I figured
the last part was implied)

I have a sugestion:

To have parameters, say DOUBLE_FORMAT and FLOAT_FORMAT, which could have
option values of 'SHORT' and 'LONG'.
Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, "%25.18g", num).

Other way would be to have number parameters to be used in the sprintf
calls, in place of 25 and 18, in the format string.

From what Tom said, something similar was suggested and there were issues
brought up. I don't know what they were, since I wasn't personally
terribly interested, but it should be in the archives. If there were any
concerns, you'll probably need to deal with those as well.

#15Bruno Wolff III
bruno@wolff.to
In reply to: Peter Eisentraut (#13)
Re: float output precision questions

On Wed, Oct 30, 2002 at 19:27:57 +0100,
Peter Eisentraut <peter_e@gmx.net> wrote:

The printf("%A") output is not system-specific.

Just out of curiosity, can you tell me a web page or keywords to use
in a search to see what that format does? I tried using google, but
searching for printf with and "A" format didn't go to well. I only
found one correct reference and it didn't explain what %A did.
Thanks.

#16Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#14)
Re: float output precision questions

Stephan Szabo wrote:

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

I understand that if people insert a value of 1.1 in a double, they want
to get 1.1 without knowing that in fact the stored number is
1.10000000000000009. But do you understand that if some people insert,
for example, a value of 1.79769313486231571e+308 they dont want to get
1.79769313486232e+308 which does not compare equal (in Matlab or C) to
the first ? This is a bug.

I disagree to some extent. I'm not sure it's meaningful to expect that
(what if the database and the client are on different architectures) in
general. In any case, you're effectively going from decimal
representation to double to decimal representation (the string you used
to insert it -> internal representation -> string used to output it) and
that's only guaranteed to be correct up to DBL_DIG digits as far as I can
tell. I think it'd be nice to have an option to get more digits for those
sorts of applications, however.

In the previous email example, in C, I was going from decimal to double
and so on, but this is not the case when I do some simulation. In this
case it will allways be from Matlab double to PostgreSQL float8 and from
libpq PQgetvalue() string to Matlab double. The example was just a x86
number example where (got the string from Matlab double) query output
would fail.

This would probably make sense as an option, so why don't you look at the
past discussions and see if you can come up with a solution that keeps
everyone happy (and preferably implement it, but...) :)

but ???

, but I realize that you might not be interested in doing such. (I figured
the last part was implied)

ok! :)

No problem. I have seen the GUC thing in the source (guc.c etc...) and it
does not look too dificult. It has lots of examples in the code itself.
What I am saying is that I can do it if pg-people agree on the (some)
way to do it.

To have parameters, say DOUBLE_FORMAT and FLOAT_FORMAT, which could have
option values of 'SHORT' and 'LONG'.
Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, "%25.18g", num).

Other way would be to have number parameters to be used in the sprintf
calls, in place of 25 and 18, in the format string.

From what Tom said, something similar was suggested and there were issues

brought up. I don't know what they were, since I wasn't personally
terribly interested, but it should be in the archives. If there were any
concerns, you'll probably need to deal with those as well.

I looked at some of these emails and it seemed to me that the problem
was that Tom did'nt want a parameter that would force people to know
about printf number formatting. I think the first solution above (the
SHORT and LONG way) is simple, maintains usual output as default and
enables 'maximum' precision at request.

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#17Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Pedro M. Ferreira (#16)
Re: float output precision questions

On Wed, 30 Oct 2002, Pedro M. Ferreira wrote:

Stephan Szabo wrote:

On Wed, 30 Oct 2002, Pedro Miguel Frazao Fernandes Ferreira wrote:

I understand that if people insert a value of 1.1 in a double, they want
to get 1.1 without knowing that in fact the stored number is
1.10000000000000009. But do you understand that if some people insert,
for example, a value of 1.79769313486231571e+308 they dont want to get
1.79769313486232e+308 which does not compare equal (in Matlab or C) to
the first ? This is a bug.

I disagree to some extent. I'm not sure it's meaningful to expect that
(what if the database and the client are on different architectures) in
general. In any case, you're effectively going from decimal
representation to double to decimal representation (the string you used
to insert it -> internal representation -> string used to output it) and
that's only guaranteed to be correct up to DBL_DIG digits as far as I can
tell. I think it'd be nice to have an option to get more digits for those
sorts of applications, however.

In the previous email example, in C, I was going from decimal to double
and so on, but this is not the case when I do some simulation. In this
case it will allways be from Matlab double to PostgreSQL float8 and from
libpq PQgetvalue() string to Matlab double. The example was just a x86
number example where (got the string from Matlab double) query output
would fail.

I was just responding to it being a bug. I don't think that expecting
a float->db->float (double->db->double) giving the same double is always
safe when you assume that the PostgreSQL machine might be on a system with
different guarantees about precision. In practice, it's probably not a
big deal.

, but I realize that you might not be interested in doing such. (I figured
the last part was implied)

ok! :)

No problem. I have seen the GUC thing in the source (guc.c etc...) and it
does not look too dificult. It has lots of examples in the code itself.
What I am saying is that I can do it if pg-people agree on the (some)
way to do it.

Yeah, I didn't think it'd be hard, but sometimes people are unable or
unwilling to do C code for things.

I looked at some of these emails and it seemed to me that the problem
was that Tom did'nt want a parameter that would force people to know
about printf number formatting. I think the first solution above (the
SHORT and LONG way) is simple, maintains usual output as default and
enables 'maximum' precision at request.

That seems reasonable then, Tom'll probably give any other objections he
might have if he has any.

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#17)
Re: float output precision questions

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

On Wed, 30 Oct 2002, Pedro M. Ferreira wrote:

I looked at some of these emails and it seemed to me that the problem
was that Tom did'nt want a parameter that would force people to know
about printf number formatting. I think the first solution above (the
SHORT and LONG way) is simple, maintains usual output as default and
enables 'maximum' precision at request.

That seems reasonable then, Tom'll probably give any other objections he
might have if he has any.

My recollection is that other people (perhaps Peter?) were the ones
objecting before. However I'd be somewhat unhappy with the proposal
as given:

Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, "%25.18g", num).

since this seems to me to hardwire inappropriate assumptions about the
number of significant digits in a double. (Yes, I know practically
everyone uses IEEE floats these days. But it's inappropriate for PG
to assume that.)

AFAICT the real issue here is that binary float representations will
have a fractional decimal digit of precision beyond what DBL_DIG claims.
I think I could support adding an option that switches between the
current output format:
sprintf(ascii, "%.*g", DBL_DIG, num);
and:
sprintf(ascii, "%.*g", DBL_DIG+1, num);
and similarly for float4. Given carefully written float I/O routines,
reading the latter output should reproduce the originally stored value.
(And if the I/O routines are not carefully written, you probably lose
anyway.) I don't see a need for allowing more flexibility than that.

Comments?

regards, tom lane

#19Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Tom Lane (#18)
Re: float output precision questions

On Wed, 30 Oct 2002, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

On Wed, 30 Oct 2002, Pedro M. Ferreira wrote:

I looked at some of these emails and it seemed to me that the problem
was that Tom did'nt want a parameter that would force people to know
about printf number formatting. I think the first solution above (the
SHORT and LONG way) is simple, maintains usual output as default and
enables 'maximum' precision at request.

That seems reasonable then, Tom'll probably give any other objections he
might have if he has any.

My recollection is that other people (perhaps Peter?) were the ones
objecting before. However I'd be somewhat unhappy with the proposal
as given:

Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, "%25.18g", num).

since this seems to me to hardwire inappropriate assumptions about the
number of significant digits in a double. (Yes, I know practically
everyone uses IEEE floats these days. But it's inappropriate for PG
to assume that.)

True (which I actually was trying to get at in my messages as well). I'll
admit to having not read the precise proposal. It's really pretty outside
what I work with in any case.

AFAICT the real issue here is that binary float representations will
have a fractional decimal digit of precision beyond what DBL_DIG claims.
I think I could support adding an option that switches between the
current output format:
sprintf(ascii, "%.*g", DBL_DIG, num);
and:
sprintf(ascii, "%.*g", DBL_DIG+1, num);
and similarly for float4. Given carefully written float I/O routines,
reading the latter output should reproduce the originally stored value.
(And if the I/O routines are not carefully written, you probably lose
anyway.) I don't see a need for allowing more flexibility than that.

Well, on my system, it doesn't look like doing the above sprintfs will
actually work for all numbers. I did a simple program using an arbitrary
big number and the DBL_DIG+1 output when stuck into another double
actually was a different double value. DBL_DIG+2 worked on my system,
but...

#20Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#17)
Re: float output precision questions

Tom Lane wrote:

My recollection is that other people (perhaps Peter?) were the ones
objecting before. However I'd be somewhat unhappy with the proposal
as given:

Option 'SHORT' would be default and produce the standard sprintf(ascii,...
Option 'LONG' would produce sprintf(ascii, "%25.18g", num).

since this seems to me to hardwire inappropriate assumptions about the
number of significant digits in a double. (Yes, I know practically
everyone uses IEEE floats these days. But it's inappropriate for PG
to assume that.)

I understand this. Unfortunately I only have IEEE compliant stuff.

AFAICT the real issue here is that binary float representations will
have a fractional decimal digit of precision beyond what DBL_DIG claims.

In fact, for some numbers I have been testing with, the double
representation can distinguish up to DBL_BIG+2.

I think I could support adding an option that switches between the
current output format:
sprintf(ascii, "%.*g", DBL_DIG, num);
and:
sprintf(ascii, "%.*g", DBL_DIG+1, num);

Easy to find numbers with double representation which would need DBL_BIG+2.

and similarly for float4. Given carefully written float I/O routines,
reading the latter output should reproduce the originally stored value.

For some numbers it does not. Not true as I said above.

(And if the I/O routines are not carefully written, you probably lose
anyway.) I don't see a need for allowing more flexibility than that.

Tests like a==b will fail for some numbers with DBL_BIG+1.
Its like I said before, the guys from matlab (in x86 IEEE float) go to
DBL_BIG+3 to have 'maximum precision'.

Comments?

Yes. I think there are several options.
I checked the sprintf(ascii, "%A", num) output format and all the
numbers that would fail because of DBL_DIG=15 are ok. After insertion on
a table and conversion to double after a query, comparison a==b holds.
AFAICT "%A" is system independent.

I would (if I may) propose the following:

Have two parameters, say DOUBLE_OUTPUT and EXTRA_DIGITS. DOUBLE_OUTPUT
would select from decimal output or normalized output. EXTRA_DIGITS
would add the required extra digits, from 0 (default) to 3, when output
is decimal.

EXTRA_DIGITS:
in the range [0:3]. 0 as defualt.

DOUBLE_OUTPUT:

'DECIMAL': sprintf(ascii, "%.*g", DBL_DIG+EXTRA_DIGITS, num); (default)
'NORMALIZED': sprintf(ascii, "%A", num);

The same could be done for floats (float4).

This way PG does not assume anything (DOUBLE_OUTPUT as 'NORMALIZED'), it
does not hardwire 'inappropriate' assumptions about the number of
significant digits in a double (default EXTRA_DIGITS=0), and it gives
flexibility (EXTRA_DIGITS!=0) if needed.
I think this is functional and reasonable.

Regards,
Pedro M. Ferreira

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

--
----------------------------------------------------------------------
Pedro Miguel Frazao Fernandes Ferreira
Universidade do Algarve
Faculdade de Ciencias e Tecnologia
Campus de Gambelas
8000-117 Faro
Portugal
Tel./Fax: (+351) 289 800950 / 289 819403
http://w3.ualg.pt/~pfrazao

#21Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#17)
#22Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Pedro M. Ferreira (#21)
#23Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Zeugswetter Andreas SB SD (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#19)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pedro M. Ferreira (#23)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pedro M. Ferreira (#20)
#27Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#17)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pedro M. Ferreira (#20)
#29Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#19)
#30Pedro M. Ferreira
pfrazao@ualg.pt
In reply to: Stephan Szabo (#17)
#31Bruno Wolff III
bruno@wolff.to
In reply to: Tom Lane (#28)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruno Wolff III (#31)
#33Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#28)