somebody could explain this?

Started by Cristian Prietoabout 20 years ago15 messages

Cristian Prieto

cristian@clickdiario.com

about 20 years ago

Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
little about the postgresql internals and the way some kind of SPs could be
written in c language; I found something really weird and I cannot explain
to me this behavior:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(test);

Datum
repeat_item(PG_FUNCTION_ARGS)
{
int num_times;

num_times = PG_GETARG_FLOAT8(0) * 100;
PG_RETURN_INT32(num_times);
}

Inside psql this happens:

# Create or replace function test(float) returns integer as 'test.so'
language 'c' stable;

select test(0.1);
Returns 10

Select test(0.11);
Returns 11

Select test(0.12)
Returns 11

Select test(0.13)
Returns 13

Select test(0.14)
Returns 14

Select test(0.15)
Returns 14

What Is happening here?

nagy@ecircle-ag.com

about 20 years ago

In reply to: Cristian Prieto (#1)

Re: somebody could explain this?

Cristian,

I bet it's related to some rounding issue and the fact that floating
formats are approximative even for small integers.
Probably 12 ands up being slightly less in floating format (something
like 11.999...), and the cast to integer is truncating it.
Not 100% sure though... read up on your API, I'm not a C programmer :-)

HTH,
Csaba.

Show quoted text

On Fri, 2005-11-04 at 17:16, Cristian Prieto wrote:

Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
little about the postgresql internals and the way some kind of SPs could be
written in c language; I found something really weird and I cannot explain
to me this behavior:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(test);

Datum
repeat_item(PG_FUNCTION_ARGS)
{
int num_times;

num_times = PG_GETARG_FLOAT8(0) * 100;
PG_RETURN_INT32(num_times);
}

Inside psql this happens:

# Create or replace function test(float) returns integer as 'test.so'
language 'c' stable;

select test(0.1);
Returns 10

Select test(0.11);
Returns 11

Select test(0.12)
Returns 11

Select test(0.13)
Returns 13

Select test(0.14)
Returns 14

Select test(0.15)
Returns 14

What Is happening here?

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

andrew@dunslane.net

about 20 years ago

In reply to: Cristian Prieto (#1)

Re: somebody could explain this?

Classic behaviour, which has nothing to do with postgres. Try the
program below to see the same effect. You probably should be using a
rounding function to see what you seem to expect.

cheers

andrew

#include <stdio.h>

main()
{
double x[] = { 0.1, 0.11, 0.12, 0.13, 0.14, 0.15 };
int i,n;
for (i=0 ; i < 6; i++)
{
n = x[i] * 100;
printf("%d\n",n);
}

}

Cristian Prieto wrote:

Show quoted text

Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
little about the postgresql internals and the way some kind of SPs could be
written in c language; I found something really weird and I cannot explain
to me this behavior:

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(test);

Datum
repeat_item(PG_FUNCTION_ARGS)
{
int num_times;

num_times = PG_GETARG_FLOAT8(0) * 100;
PG_RETURN_INT32(num_times);
}

Inside psql this happens:

# Create or replace function test(float) returns integer as 'test.so'
language 'c' stable;

select test(0.1);
Returns 10

Select test(0.11);
Returns 11

Select test(0.12)
Returns 11

Select test(0.13)
Returns 13

Select test(0.14)
Returns 14

Select test(0.15)
Returns 14

What Is happening here?

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

Martijn van Oosterhout

kleptog@svana.org

about 20 years ago

In reply to: Cristian Prieto (#1)

Re: somebody could explain this?

On Fri, Nov 04, 2005 at 10:16:50AM -0600, Cristian Prieto wrote:

Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
little about the postgresql internals and the way some kind of SPs could be
written in c language; I found something really weird and I cannot explain
to me this behavior:

What's happening here is that the multiplication, being floating point,
has some accumulated error such that when you multiply it by 100 and
convert it to an int, it hits the cutoff.

Note: converting a float to an int tends to invoke trunc() or something
similar. trunc(14.999999999999) = 14. You need to decide whether maybe
you want round() instead. Or decide the actual cutoff yourself. I
sometimes fix this by adding 0.00001 to numbers before converting to
integer, to avoid these issues.

Floating points numbers are accurate but not precise.

Hope this helps,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.

tgl@sss.pgh.pa.us

about 20 years ago

In reply to: Cristian Prieto (#1)

Re: somebody could explain this?

"Cristian Prieto" <cristian@clickdiario.com> writes:

Datum
repeat_item(PG_FUNCTION_ARGS)
{
int num_times;

num_times = PG_GETARG_FLOAT8(0) * 100;
PG_RETURN_INT32(num_times);
}

# Create or replace function test(float) returns integer as 'test.so'
language 'c' stable;

"float" is FLOAT4 not FLOAT8 ... your function is picking up some
garbage data.

There might also be some roundoff issues to think about --- I'd be
inclined to add an rint() call, or at least add 0.5 before truncating to
integer.

regards, tom lane

otto.hirr@olabinc.com

about 20 years ago

In reply to: Martijn van Oosterhout (#4)

Re: somebody could explain this?

What's happening here is that the multiplication, being
floating point,
has some accumulated error such that when you multiply it by 100 and
convert it to an int, it hits the cutoff.

Trivia...
I heard a story many years ago that landed a programmer in prison...
He worked on the program that calculated interest that was to be
deposited into an account. Instead of rounding or truncating the
amount beyond what the bank wanted to use, he saved the small fractional
dollars (really micro-pennies) into an account. Soon he had several
hundred thousand in the account - there is a lot of multiplication...
He got caught.

nagy@ecircle-ag.com

about 20 years ago

In reply to: Martijn van Oosterhout (#4)

Re: [OT] somebody could explain this?

[snip]

Floating points numbers are accurate but not precise.

OK, now this one beats me... what's the difference between "accurate"
and "exact" ? I thought both mean something like "correct", but precise
refers to some action and accurate applies to a situation or
description...

I'm actually curios what it means. Merriam-Webster refers for both to
"correct" as a synonym.

Cheers,
Csaba.

Peter Eisentraut

peter_e@gmx.net

about 20 years ago

In reply to: Csaba Nagy (#7)

Re: [OT] somebody could explain this?

Csaba Nagy wrote:

[snip]

Floating points numbers are accurate but not precise.

OK, now this one beats me... what's the difference between "accurate"
and "exact" ? I thought both mean something like "correct", but
precise refers to some action and accurate applies to a situation or
description...

Accurate means close to the real value, precise means having a lot of
detail.

So 3 is more accurate than 4 as a representation of "Pi", but both are
not very precise.

5.32290753057207250735 is a very precise representation of "Pi" but
totally inaccurate.

This also means that the statement at the top is wrong. It should be
the other way around.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Bruno Wolff III

bruno@wolff.to

about 20 years ago

In reply to: Csaba Nagy (#7)

Re: [OT] somebody could explain this?

On Fri, Nov 04, 2005 at 18:30:56 +0100,
Csaba Nagy <nagy@ecircle-ag.com> wrote:

[snip]

Floating points numbers are accurate but not precise.

OK, now this one beats me... what's the difference between "accurate"
and "exact" ? I thought both mean something like "correct", but precise
refers to some action and accurate applies to a situation or
description...

I'm actually curios what it means. Merriam-Webster refers for both to
"correct" as a synonym.

My memory from science courses a long time ago, is that precision refers
to how much information you have (e.g. the number of digits in a number)
and accuracy refers to how close your information is to reality.
Using a precision that was much higher than justified by accuracy used to
get points deducted from lab report grades.

In mathematics "precise" has a somewhat different meaning, but isn't a synonym
for "accurate" in that context.

Kevin.Grittner@wicourts.gov

about 20 years ago

In reply to: Bruno Wolff III (#9)

Re: [OT] somebody could explain this?

I think the crucial point is that the common IEEE floating point
formats are unable to store an EXACT representation of common
decimal fractions (such as .1) -- they can only store an
APPROXIMATION.

Peter Eisentraut <peter_e@gmx.net> >>>

Csaba Nagy wrote:

[snip]

Floating points numbers are accurate but not precise.

OK, now this one beats me... what's the difference between "accurate"
and "exact" ? I thought both mean something like "correct", but
precise refers to some action and accurate applies to a situation or
description...

Accurate means close to the real value, precise means having a lot of
detail.

So 3 is more accurate than 4 as a representation of "Pi", but both are
not very precise.

5.32290753057207250735 is a very precise representation of "Pi" but
totally inaccurate.

This also means that the statement at the top is wrong. It should be
the other way around.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Import Notes

Resolved by subject fallback

otto.hirr@olabinc.com

about 20 years ago

In reply to: Kevin Grittner (#10)

Re: [OT] somebody could explain this?

Lets start with an agreed upon expert, Knuth.
The art of computer programming. Vol2,Seminumerical Algorithms.Ed2.
pg682: Precision: The number of digits in a representation.
pg212: Section: Accuracy of floating point numbers.
"A rough (but reasonably useful) way to express the behavior of
floating point arithmetic can be based upon the concept of
"significant figures" or relative error."

The float point rep is built of a exponent and fractional part.
(Note mantissa means "a worthless addition"(pg199)

Most notably, the IEEE rep, either single or double, most certainly
has the ability to store the EXACT value for 0.1.

The issue is how to REPRESENT an exact value in the computer.
Some values may be represented exactly. Some values may not.
The tricky part is then operations on these values. We may have
exact representation of two values, then applying an operation
that results in a theoretical value that can only have an
approximate representation. Accuracy deals with the difference
between the theoretical exact value and our ability to
represent it.

Note that Knuth's observation about add/sub v. mul/div:

"Roughly speaking, the operations of floating point multiplication
and division do not magnify the relative error by very much; but
floating point subtraction of nearly equal quantities (and floating
point addition, u+v, where u is nearly equal to -v) can very greatly
increase the relative error."

Surprised?

Things to consider:
Is the value I want to represent have, shall we say, a limit to
the number of digits required to fully define its value,
eg 0.5, 0.9987

Is the value I want to represent have no limit to the number of
digits, eg pi, etc.

Does my representation provide a value that is within acceptable
limits of error, between the rep and actual value.

!Are the operations that I am performing introducing greater
amounts of relative error!

Break out the old computer book/math book and review...
Lets not assume or re-invent the wheel.

..Otto

Show quoted text

I think the crucial point is that the common IEEE floating point
formats are unable to store an EXACT representation of common
decimal fractions (such as .1) -- they can only store an
APPROXIMATION.

Peter Eisentraut <peter_e@gmx.net> >>>

Csaba Nagy wrote:

[snip]

Floating points numbers are accurate but not precise.

OK, now this one beats me... what's the difference between

"accurate"

and "exact" ? I thought both mean something like "correct", but
precise refers to some action and accurate applies to a situation or
description...

Accurate means close to the real value, precise means having a lot of
detail.

So 3 is more accurate than 4 as a representation of "Pi", but
both are
not very precise.

5.32290753057207250735 is a very precise representation of "Pi" but
totally inaccurate.

This also means that the statement at the top is wrong. It should be
the other way around.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

---------------------------(end of
broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

---------------------------(end of
broadcast)---------------------------
TIP 6: explain analyze is your friend

tgl@sss.pgh.pa.us

about 20 years ago

In reply to: Otto Hirr (#11)

Re: [OT] somebody could explain this?

"Otto Hirr" <otto.hirr@olabinc.com> writes:

Most notably, the IEEE rep, either single or double, most certainly
has the ability to store the EXACT value for 0.1.

Oh really?

regards, tom lane

Kevin.Grittner@wicourts.gov

about 20 years ago

In reply to: Tom Lane (#12)

Re: [OT] somebody could explain this?

No, the IEEE formats can not store .1 exactly. How close it
comes depends on the rest of the number. For single and
double precision, respectively, the IEEE representations fall
at about:

0.100000001490116119384765625
0.1000000000000000055511151231257827021181583404541015625

Libraries must do some rounding of these approximations when
formatting for display, so the error is rarely seen, except when
conversions to other data types are done via truncation. The
problem values from the previous example were actually stored
as something close to the following.

0.11999999999999999555910790149937383830547332763671875
0.1499999999999999944488848768742172978818416595458984375

Whole numbers, on the other hand, are always accurate unless
they exceed the number of significant digits for the format.

-Kevin

"Otto Hirr" <otto.hirr@olabinc.com> >>>

Most notably, the IEEE rep, either single or double, most certainly
has the ability to store the EXACT value for 0.1.

Import Notes

Resolved by subject fallback

gsstark@mit.edu

about 20 years ago

In reply to: Otto Hirr (#6)

Re: somebody could explain this?

"Otto Hirr" <otto.hirr@olabinc.com> writes:

Trivia...
I heard a story many years ago that landed a programmer in prison...
He worked on the program that calculated interest that was to be
deposited into an account. Instead of rounding or truncating the
amount beyond what the bank wanted to use, he saved the small fractional
dollars (really micro-pennies) into an account. Soon he had several
hundred thousand in the account - there is a lot of multiplication...
He got caught.

Uh yeah, that story would be the plot to Superman II (and Office Space too).

--
greg

otto.hirr@olabinc.com

about 20 years ago

In reply to: Tom Lane (#12)

Re: [OT] somebody could explain this?

yes, MAJOR goof on my part. My brain cells were not firing quite right :(
For those really interested, here are some resources:
http://www.cs.wisc.edu/~cs354-1/cs354/karen.notes/reps.flpt.html
http://cch.loria.fr/documentation/IEEE754/ACM/goldberg.pdf
http://babbage.cs.qc.edu/courses/cs341/IEEE-754references.html

Show quoted text

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org
[mailto:pgsql-hackers-owner@postgresql.org]On Behalf Of Tom Lane
Sent: Friday, November 04, 2005 11:52 AM
To: otto.hirr@olabinc.com
Cc: pgsql-hackers@postgresql.org
Subject: Re: [OT] somebody could explain this?

"Otto Hirr" <otto.hirr@olabinc.com> writes:

Most notably, the IEEE rep, either single or double, most certainly
has the ability to store the EXACT value for 0.1.

Oh really?

regards, tom lane

---------------------------(end of
broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings