Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Started by Shachar Shemeshabout 19 years ago41 messageshackers

shachar@shemesh.biz

about 19 years ago

Hi guys of the pgsql-hackers list.

I've received a bug report on the OLE DB list, which I suspect is
actually a server bug. The correspondence so far is listed further on,
but, in a nutshell, user runs an OLE DB client on windows (OLE DB uses
the binary interface), and server version 8.1.9 on Windows, and all is
fine. When the server is switched to 8.1.8 on Debian Etch on ARM9,
float8 type is not received properly by OLE DB.

Since OLE DB doesn't really care what version the server is running, the
chances of this being a server side bug are really high. I don't know
ARM9 well enough to comment on floating point format there.

Julian Heeb wrote:

Shachar Shemesh schrieb:

Julian Heeb wrote:

Hello

Our acounting software can use the PostgreSQL OLE DB driver to access
a postgreSQL database. With the pg server installed on windows,
everything works fine.

I moved now the database to a postgreSQL server on a linux server, but
now every floating point number gets wrongly interpreted by the
acounting software, either by replacing it with a 0 or a very large
number (e.g. xxxE+308). Only the floating point numbers are affected,
integer or characters are correct. pgAdmin shows even the fp numbers
correctly, so I guess it has something to do with the pgoledb driver.

Can someone give me a hint, how to solve the problem?

It's hard to give a precise answer. Let's try a couple of venues.

First of all, what platform is the Linux server? Is that an Intel, or
something else?

It is an ARM9 platform with Debian Etch (Linkstation Pro Fileserver with
Freelink).

Also, what is the precise type of the floating point var on the server?
Can you give the SQL line that generated the table?

The table has been generated by the following SQL line. The problem
occures at the double precision fields.

I have some bad news. This is the comment in the Postgresql source code.
This seems to be a core problem at the server side of things:

/* --------------------------------
* pq_sendfloat8 - append a float8 to a StringInfo buffer
*
* The point of this routine is to localize knowledge of the external
binary
* representation of float8, which is a component of several datatypes.
*
* We currently assume that float8 should be byte-swapped in the same way
* as int8. This rule is not perfect but it gives us portability across
* most IEEE-float-using architectures.
* --------------------------------
*/

Could it be that ARM9 is not IEEE float standard? Can anyone from the
"hackers" list give any insight into this? The function for the data
type import on the client side seems to be in order (switch the byte
order around, and assume it's a valid "double" C type).

Shachar

Import Notes

Reply to msg id not found: 464EE563.4080808@yahoo.deReference msg id not found: 20070518201332.66837.qmail@web27705.mail.ukl.yahoo.comReference msg id not found: 464E139D.9000906@shemesh.bizReference msg id not found: 464EE563.4080808@yahoo.de

Stefan Kaltenbrunner

stefan@kaltenbrunner.cc

about 19 years ago

In reply to: Shachar Shemesh (#1)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh wrote:

Hi guys of the pgsql-hackers list.

I've received a bug report on the OLE DB list, which I suspect is
actually a server bug. The correspondence so far is listed further on,
but, in a nutshell, user runs an OLE DB client on windows (OLE DB uses
the binary interface), and server version 8.1.9 on Windows, and all is
fine. When the server is switched to 8.1.8 on Debian Etch on ARM9,
float8 type is not received properly by OLE DB.

if 8.1.8 is built based on the debian packages it is probably compiled
with --enable-integer-datetimes.
Is the oledb client library able to cope with that ?

Stefan

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Shachar Shemesh (#1)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh <shachar@shemesh.biz> writes:

I've received a bug report on the OLE DB list, which I suspect is
actually a server bug. The correspondence so far is listed further on,
but, in a nutshell, user runs an OLE DB client on windows (OLE DB uses
the binary interface), and server version 8.1.9 on Windows, and all is
fine. When the server is switched to 8.1.8 on Debian Etch on ARM9,
float8 type is not received properly by OLE DB.

Perhaps OLE is trying to use binary instead of text transmission of
data? It's not a server bug if ARM has a different FP format than
the client hardware; it's the client's responsibility to either use
text format or be prepared to cope with whatever the binary format is.

regards, tom lane

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Stefan Kaltenbrunner (#2)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Stefan Kaltenbrunner wrote:

Shachar Shemesh wrote:

Hi guys of the pgsql-hackers list.

I've received a bug report on the OLE DB list, which I suspect is
actually a server bug. The correspondence so far is listed further on,
but, in a nutshell, user runs an OLE DB client on windows (OLE DB uses
the binary interface), and server version 8.1.9 on Windows, and all is
fine. When the server is switched to 8.1.8 on Debian Etch on ARM9,
float8 type is not received properly by OLE DB.

if 8.1.8 is built based on the debian packages it is probably compiled
with --enable-integer-datetimes.
Is the oledb client library able to cope with that ?

I'm calling "GetPgStatus(&stat, _bstr_t("integer_datetimes") );" to find
out what the server representation is, and cope accordingly. This only
doesn't work on 7.4 servers, but otherwise I'm fine.

either way, this is off topic to this problem.

From further reading, it seems that ARM indeed uses its own
representation for IEEE floats. I'll try to poll my sources, try and
find out what the %(!@#&$ this format actually is (google was no help),
and try and formulate a patch for PG to export it in IEEE despite the
discrepancies.

Stefan

Shachar

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Tom Lane (#3)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Tom Lane wrote:

Shachar Shemesh <shachar@shemesh.biz> writes:

I've received a bug report on the OLE DB list, which I suspect is
actually a server bug. The correspondence so far is listed further on,
but, in a nutshell, user runs an OLE DB client on windows (OLE DB uses
the binary interface), and server version 8.1.9 on Windows, and all is
fine. When the server is switched to 8.1.8 on Debian Etch on ARM9,
float8 type is not received properly by OLE DB.

Perhaps OLE is trying to use binary instead of text transmission of
data?

Of course it does. That's what the OLE DB specs say. Said so in my
original email.

It's not a server bug if ARM has a different FP format than
the client hardware;

No. The server can use, internally, whatever it wants.

it's the client's responsibility to either use
text format or be prepared to cope with whatever the binary format is.

I agree 100%.

I'll reiterate - the problem is not that PG is exporting the internal
ARM FP format. The problem is that the server is exporting the internal
ARM FP format when the server is ARM, and the IEEE format when the
server is Intel. It's not the format, it's the inconsistency.

I can (and I do) handle, in PgOleDb, binary format that are vastly
different than those that I need (anyone said "timestamps"?). Handling a
format that is inconsistent across same version backends merely because
of platform, now that's a server bug if I ever saw one.

regards, tom lane

Like I said elsewhere, I'm willing to write a patch to "pq_sendfloat8"
(and probably "pq_getmsgfloat8" too) to make sure it does the conversion
on ARM platforms. Hell, I think I can even write it portable enough so
that it will work on all non-IEEE platforms (I'm not sure yet, but I
have a general idea). What I'm hoping for, however, is that if I do, you
(or another comitter) will get it in.

Shachar

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Shachar Shemesh (#5)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh <shachar@shemesh.biz> writes:

I'll reiterate - the problem is not that PG is exporting the internal
ARM FP format. The problem is that the server is exporting the internal
ARM FP format when the server is ARM, and the IEEE format when the
server is Intel. It's not the format, it's the inconsistency.

This is not a bug, it's intentional. While IEEE-spec floats are
reasonably interchangeable these days (modulo the endianness issue),
other FP formats tend to have different ranges, special cases, etc.
If we try to force them to IEEE spec we may have problems with overflow,
loss of precision, who knows what.

Like I said elsewhere, I'm willing to write a patch to "pq_sendfloat8"
(and probably "pq_getmsgfloat8" too) to make sure it does the conversion
on ARM platforms. Hell, I think I can even write it portable enough so
that it will work on all non-IEEE platforms

Really? Will it be faster and more reliable than conversion to text?
(In this context "reliable" means "can reproduce the original datum
exactly when transmitted back".)

regards, tom lane

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Tom Lane (#6)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Tom Lane wrote:

Shachar Shemesh <shachar@shemesh.biz> writes:

I'll reiterate - the problem is not that PG is exporting the internal
ARM FP format. The problem is that the server is exporting the internal
ARM FP format when the server is ARM, and the IEEE format when the
server is Intel. It's not the format, it's the inconsistency.

This is not a bug, it's intentional. While IEEE-spec floats are
reasonably interchangeable these days (modulo the endianness issue),
other FP formats tend to have different ranges, special cases, etc.
If we try to force them to IEEE spec we may have problems with overflow,
loss of precision, who knows what.

Yes, but if we do not then we have a data interchange library that is
useless for data interchange. I think overflow and precision loss is
preferable.

Please remember that I'm only trying to help Postgresql here. I have a
spec to work with on the outside. I'm more than willing to do what's
necessary (see the STRRRANGE date conversion code) in order to adapt
whatever PG throws my way to the no less strange representation expected
of me. That's what I do as a driver hacker.

Sometimes, the specs don't help me. Windows' notion of "timezone free
timestamps" is nothing short of a disgrace, and some of the hacks that
are needed around that issues are, well, hacks. I don't come complaining
here, because this has nothing to do with PG. It's bad design on the
other end of the two ends that a driver has to make meet.

But sometimes, like now, PG puts me in an impossible position. You are
essentially telling me "you will get the numbers in an unknown format,
you will not have any way of knowing whether you got them in a strange
format or not, nor will you have any docs on what that format is going
to be". That is no way to treat your driver developers.

Like I said elsewhere, I'm willing to write a patch to "pq_sendfloat8"
(and probably "pq_getmsgfloat8" too) to make sure it does the conversion
on ARM platforms. Hell, I think I can even write it portable enough so
that it will work on all non-IEEE platforms

Really? Will it be faster

Absolutely. Do you honestly believe that turning a 64bit binary number
into a 40 something byte decimal number will be quicker than turning a
64 bit binary number into another 64 bit number? For one thing, I really
doubt that my technique will require division, modulo or, in fact, any
math operations at all. It will likely be done with a few bit shifting
and that's it.

I also find it strange, though, that you berate me for using binary
rather than text format, and then complain about speed. That's what
makes OLE DB faster than ODBC - binary interface.

and more reliable than conversion to text?

Well, depends on how you define "more reliable". If you define it to
mean "exactly represents what happens in the server internals", then the
answer is "no". If you define it to mean "make more sense to the client,
and have better chances of producing results that more closely
approximate the right number than the current code", then the answer is
a definite yes.

(In this context "reliable" means "can reproduce the original datum
exactly when transmitted back".)

Who cares? If you are using the same function for binary communication
inside the server and for communications to the clients (or, for that
matter, another server), then there is something wrong in your design.
What are the "send" functions used for, beside server to client
communication, anyways? You are asking me to treat the binary data as an
opaque. Well, I'll counter with a question - what good is that to me?

Please note that the current code is useless for communicating binary
data between two servers, even if they are guaranteed to be of the same
version! How much less reliable can you get?

Please, give your own interface designers something to work with. Your
attitude essentially leaves me out in the cold.

regards, tom lane

Shachar

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

about 19 years ago

In reply to: Shachar Shemesh (#7)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh wrote:

Tom Lane wrote:

Shachar Shemesh <shachar@shemesh.biz> writes:

I'll reiterate - the problem is not that PG is exporting the internal
ARM FP format. The problem is that the server is exporting the internal
ARM FP format when the server is ARM, and the IEEE format when the
server is Intel. It's not the format, it's the inconsistency.

This is not a bug, it's intentional. While IEEE-spec floats are
reasonably interchangeable these days (modulo the endianness issue),
other FP formats tend to have different ranges, special cases, etc.
If we try to force them to IEEE spec we may have problems with overflow,
loss of precision, who knows what.

Yes, but if we do not then we have a data interchange library that is
useless for data interchange. I think overflow and precision loss is
preferable.

I agree in principle that the wire protocol should be
platform-independent. At the very least, if the server always sends in
native format, the client needs to know which format it's receiving and
be able to convert between all formats. Switching to a common format on
all platforms would of course make life easier for client library
developers.

But sometimes, like now, PG puts me in an impossible position. You are
essentially telling me "you will get the numbers in an unknown format,
you will not have any way of knowing whether you got them in a strange
format or not, nor will you have any docs on what that format is going
to be". That is no way to treat your driver developers.

You seem to be ignoring the fact that the text format is
platform-independent. That's what for example JDBC uses, and I'd imagine
other interfaces as well. Is it not possible to use text format in OLE
DB, for floating points?

(In this context "reliable" means "can reproduce the original datum
exactly when transmitted back".)

Who cares? If you are using the same function for binary communication
inside the server and for communications to the clients (or, for that
matter, another server), then there is something wrong in your design.

What are the "send" functions used for, beside server to client
communication, anyways? You are asking me to treat the binary data as an
opaque. Well, I'll counter with a question - what good is that to me?

Imagine an application that does this:

Issue query "SELECT foofloat FROM footable", and store the value to a
variable
Issue "INSERT INTO footable2 VALUES (?)", and send back the same value.

Don't you think footable and footable2 should now have the same value?
If "SELECT * FROM footable, footable2 WHERE foofloat=foofloat" returns 0
rows, I'd be pissed.

It's possible that the conversion within the driver loses some
precision, depending on the data types supported by the language and
platform, but the wire protocol should at least give the driver a chance
to get it right.

Please note that the current code is useless for communicating binary
data between two servers, even if they are guaranteed to be of the same
version! How much less reliable can you get?

The current code is not used for communicating between two servers. And
it works fine as long as the client and the server are on the same platform.

Don't get me wrong, I agree that the binary format is broken as it is,
but the cure mustn't be worse than the disease.

Please, give your own interface designers something to work with. Your
attitude essentially leaves me out in the cold.

Design a wire protocol that
1. Doesn't lose information on any platform
2. Is more efficient than text format

and I'm pretty sure it'll be accepted.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Heikki Linnakangas (#8)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Heikki Linnakangas <heikki@enterprisedb.com> writes:

I agree in principle that the wire protocol should be
platform-independent.

The *TEXT* format is for that. The problem here is that Shachar is
insisting on using binary format in a context where it is inappropriate.
Binary format has other goals that are not always compatible with 100%
platform independence --- that's unfortunate, sure, but it's reality.

regards, tom lane

#10

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Tom Lane (#9)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Tom Lane wrote:

Binary format has other goals that are not always compatible with 100%
platform independence --- that's unfortunate, sure, but it's reality.

Maybe the misunderstanding is mine. What are the goals for the binary
format?

Shachar

#11

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Heikki Linnakangas (#8)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Heikki Linnakangas wrote:

But sometimes, like now, PG puts me in an impossible position. You are
essentially telling me "you will get the numbers in an unknown format,
you will not have any way of knowing whether you got them in a strange
format or not, nor will you have any docs on what that format is going
to be". That is no way to treat your driver developers.

You seem to be ignoring the fact that the text format is
platform-independent. That's what for example JDBC uses, and I'd
imagine other interfaces as well. Is it not possible to use text
format in OLE DB, for floating points?

It is impossible to use text format for just floating point. I often
don't know in advance what type the result is going to be.

I can switch EVERYTHING to text mode, but as the other end of the
interface requires me to get things out in binary format, I get the
following absurd path:
PG stores in binary
PG translates to text
PG sends to driver
driver translates to binary

As long as I get consistent binary format, I prefer to translate between
binary formats than between text and binary.

What are the "send" functions used for, beside server to client
communication, anyways? You are asking me to treat the binary data as an
opaque. Well, I'll counter with a question - what good is that to me?

Imagine an application that does this:

Issue query "SELECT foofloat FROM footable", and store the value to a
variable
Issue "INSERT INTO footable2 VALUES (?)", and send back the same value.

Why would I want to do that over "insert into footable2 select foofloat
from footable"?

I know, even if it makes no sense you'd want it to work. All I'm saying
is that something has got to give, and there is no reason to assume that
your usage is more likely than mine.

For that reason, without looking into the ARM float implementation, it
is just as likely that it contains LESS precision than the IEEE one.
Would that change the objection?

Don't you think footable and footable2 should now have the same value?
If "SELECT * FROM footable, footable2 WHERE foofloat=foofloat" returns
0 rows, I'd be pissed.

It's possible that the conversion within the driver loses some
precision, depending on the data types supported by the language and
platform, but the wire protocol should at least give the driver a
chance to get it right.

I'm not sure why there is a difference here. How is that above example
different than the exact same example written in ADODB (and thus passing
driver conversion)?

I'll take this one step further. Does the text representation never lose
precision? Ever?

What if I send a patch that sends a 64bit float as 128bit number,
containing two integers, one for mantissa and one for exponent. This
format is guaranteed to never lose precision, but is wire compatible
across platforms. Would that be considered a good solution?

The current code is not used for communicating between two servers.
And it works fine as long as the client and the server are on the same
platform.

So does the driver. And yet, a user complained! In my dictionary, this
means that "as long as the client and server are on the same platform"
is not a good enough requirement.

Don't get me wrong, I agree that the binary format is broken as it is,
but the cure mustn't be worse than the disease.

Please, give your own interface designers something to work with. Your
attitude essentially leaves me out in the cold.

Design a wire protocol that
1. Doesn't lose information on any platform
2. Is more efficient than text format

and I'm pretty sure it'll be accepted.

I just offered one. I would hate it myself, and it would mean that pre
8.3 (or whenever it is that it will go in) will have a different
representation than post the change, but it will live up to your requests.

Shachar

#12

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Shachar Shemesh (#10)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh <shachar@shemesh.biz> writes:

Tom Lane wrote:

Binary format has other goals that are not always compatible with 100%
platform independence --- that's unfortunate, sure, but it's reality.

Maybe the misunderstanding is mine. What are the goals for the binary
format?

Well, the one that I'm concerned about at the moment is that dump/reload
using COPY BINARY ought to be 100% guaranteed to reproduce the original
datum.

Obviously, if you are transporting the dump across platforms then that
may be an impossibility. In that case you use a text dump and accept
that you get an approximation. But there should be a way to ensure that
you can have a lossless dump of whatever strange FP format your server
may use, as long as you are restoring onto a similar machine.

If there is a guaranteed-reversible transform between the ARM FP format
and IEEE format, then I'd be interested in hacking things the way you
suggest --- but what I suspect is that such a conversion must lose
either range or precision. There are only so many bits in a double.

regards, tom lane

#13

Andrew Dunstan

andrew@dunslane.net

about 19 years ago

In reply to: Shachar Shemesh (#5)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh wrote:

Perhaps OLE is trying to use binary instead of text transmission of
data?

Of course it does. That's what the OLE DB specs say. Said so in my
original email.

Why the heck do the OLE DB specs care about the internals of the
client-server prototocol? It is documented fairly clearly that text is
the only portable way to transfer data.

Perhaps we need to expand this sentence in the docs: "Keep in mind that
binary representations for complex data types may change across server
versions;"

The COPY docs are probably more correct: "The BINARY key word causes all
data to be stored/read as binary format rather than as text. It is
somewhat faster than the normal text mode, but a binary-format file is
less portable across machine architectures and PostgreSQL versions."

I do recall someone telling me that text mode transfer could actually be
faster than binary, somewhat to their (and my) surprise.

cheers

andrew

#14

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Shachar Shemesh (#11)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh <shachar@shemesh.biz> writes:

Heikki Linnakangas wrote:

Is it not possible to use text
format in OLE DB, for floating points?

It is impossible to use text format for just floating point. I often
don't know in advance what type the result is going to be.

Sure it's "possible". Send a Parse command, ask for Describe Statement
output, then specify the column formats as desired in Bind. Now this
does imply an extra server round trip, which might be annoying if your
client code doesn't have another reason to need to peek at Describe
output.

An idea that's been in the back of my mind for awhile is to provide some
way to let the client say things like "I want float and timestamp
results in text and everything else in binary", so that one setup step
at the start of the session avoids the need for the extra round trips.
Haven't got a detailed proposal at the moment though.

What if I send a patch that sends a 64bit float as 128bit number,
containing two integers, one for mantissa and one for exponent. This
format is guaranteed to never lose precision, but is wire compatible
across platforms. Would that be considered a good solution?

No, not unless you can make the case why this handles NaNs and
denormalized numbers compatibly across platforms...

regards, tom lane

#15

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Andrew Dunstan (#13)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Andrew Dunstan <andrew@dunslane.net> writes:

I do recall someone telling me that text mode transfer could actually be
faster than binary, somewhat to their (and my) surprise.

Seems a bit improbable --- what was their test case?

The only such situation that comes to mind is that some values are
smaller as text than binary (eg "2" is shorter as text than as any
binary numeric format), so in a situation where number of bytes sent
dominates all other costs, text would win. But of course there are also
many values that're smaller in binary format, so I'd think this would
only happen in restricted test cases.

regards, tom lane

#16

Andrew Dunstan

andrew@dunslane.net

about 19 years ago

In reply to: Tom Lane (#15)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I do recall someone telling me that text mode transfer could actually be
faster than binary, somewhat to their (and my) surprise.

Seems a bit improbable --- what was their test case?

No idea - this was idle chat on IRC I think. I am similarly skeptical.
After all, we just had a discussion about improving performance of PLs
by avoiding use of the input/output functions in some cases.

cheers

andrew

#17

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Tom Lane (#14)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Tom Lane wrote:

Sure it's "possible". Send a Parse command, ask for Describe Statement
output, then specify the column formats as desired in Bind. Now this
does imply an extra server round trip, which might be annoying if your
client code doesn't have another reason to need to peek at Describe
output.

No, it's far worse than annoying. It kills my performance. OLE DB does
have a mechanism for explicit "prepare", and I certainly COULD do it
only for those cases, but it is meaningless.

What if I send a patch that sends a 64bit float as 128bit number,
containing two integers, one for mantissa and one for exponent. This
format is guaranteed to never lose precision, but is wire compatible
across platforms. Would that be considered a good solution?

No, not unless you can make the case why this handles NaNs and
denormalized numbers compatibly across platforms...

NaNs and infinite (plus and minus) should not be a problem. I'm not sure
what denormalized numbers are. If you mean (switching to base 10 for a
second) that 2*10^3 vs. 20*10^2, then I would have to ask why you want
them treated differently. What is the scenario in which you would want
to tell them apart? Likewise, would you really want to tell +0 and -0
apart? If I have an export/import round trip that turns -0 into +0, is
that really a problem?

regards, tom lane

Shachar

#18

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Tom Lane (#12)

Re: [Oledb-dev] Re: double precision error with pg linux server, but not with windows pg server

Tom Lane wrote:

Obviously, if you are transporting the dump across platforms then that
may be an impossibility. In that case you use a text dump and accept
that you get an approximation.

ן¿½That's something that I've been meaning to ask about, but you all
seemed so sure of yourself. What you are essentially saying is that no
one aside from the server itself is allowed to get full precision. That
seems like a strange decision to make.

But there should be a way to ensure that
you can have a lossless dump of whatever strange FP format your server
may use, as long as you are restoring onto a similar machine.

Personally (and I know my opinion "doesn't count"), I find the "similar
machine" requirement a bit hard. It is usually accepted for HA or live
load balancing, but is really inappropriate for backups or data
transfers. Just my humble opinion.

If there is a guaranteed-reversible transform between the ARM FP format
and IEEE format, then I'd be interested in hacking things the way you
suggest

I find it highly likely that there will be. After all, the ARM format
was not designed to be better packed than IEEE, just easier to hardware
implement in an efficient way. However

--- but what I suspect is that such a conversion must lose
either range or precision.  There are only so many bits in a double.

Like I said elsewhere, a 64bit FP only has 64bits, but there is nothing
constraining us to export a 64bit number to 64bits.

regards, tom lane

What I'm worried about in that regard is about other platforms that PG
may be running on. Like I said above, I'm fairly sure (will get the
specs and make sure) that there shouldn't be a problem in exporting ARM
FP into 64bit IEEE with no loss at all. This says nothing, however,
about other platforms. Unless we map all cases, we had better choose an
export format that is capable of extension.

Shachar

#19

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Andrew Dunstan (#13)

Re: [Oledb-dev] Re: double precision error with pg linux server, but not with windows pg server

Andrew Dunstan wrote:

Why the heck do the OLE DB specs care about the internals of the
client-server prototocol? It is documented fairly clearly that text is
the only portable way to transfer data.

Is it?

Perhaps we need to expand this sentence in the docs: "Keep in mind that
binary representations for complex data types may change across server
versions;"

Where is that in the docs. It does not appear in the page discussing
PQLIB and binary data transfers
(http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html).

Even the original sentence does not describe the problem we're seeing
here. It does not mention cross platform incompatibility.

The COPY docs are probably more correct: "The BINARY key word causes all
data to be stored/read as binary format rather than as text. It is
somewhat faster than the normal text mode, but a binary-format file is
less portable across machine architectures and PostgreSQL versions."

Again, to me this sentence spells "you have a problem, we're not going
to help you out, deal with it". This is especially true if what Tom said
was true, that the text format does not maintain total precision. You
are essentially telling the user "you cannot move your data reliably
even between servers of the same version".

Shachar

#20

Tom Lane

tgl@sss.pgh.pa.us

about 19 years ago

In reply to: Shachar Shemesh (#17)

Re: Re: [Oledb-dev] double precision error with pg linux server, but not with windows pg server

Shachar Shemesh <shachar@shemesh.biz> writes:

Tom Lane wrote:

No, not unless you can make the case why this handles NaNs and
denormalized numbers compatibly across platforms...

NaNs and infinite (plus and minus) should not be a problem.

Really? Need I point out that these concepts, let alone their
representation, are not standardized in non-IEEE float implementations?

I'm not sure what denormalized numbers are.

You should find out before proposing representation replacements for
floats.

What is the scenario in which you would want to tell them apart?
Likewise, would you really want to tell +0 and -0 apart?

There are competent authorities --- for instance, the guys who created
the IEEE float standard --- who think it's worth distinguishing them.
IIRC (it's been fifteen or so years since I did any serious numerical
analysis) the arguments in favor have mostly to do with preserving
maximal accuracy for intermediate results in a series of calculations.
So maybe you could claim that these arguments are not so relevant to
storage in a database. But personally I don't think it's the province
of a database to decide that it need not accurately preserve the data
it's given to store.

regards, tom lane

#21

Andrew Dunstan

andrew@dunslane.net

about 19 years ago

In reply to: Shachar Shemesh (#19)

#22

Shachar Shemesh

shachar@shemesh.biz

about 19 years ago

In reply to: Tom Lane (#20)

#23