Lossless transmission of double precision floating point

Started by Michael J. Baars, 6 months ago, 3 messages
#1 Michael J. Baars
mjbaars1977.pgsql@gmail.com
1 attachment(s)

Hi,

I receive data from a data provider on a daily basis, and noticed how they
use fixed type floating point in text mode, to transmit data. As you might
know, this type of transmission is not lossless.

Because the PostgreSQL binary format is not very portable across different
database providers, I came up with two solutions to this problem.

1) Pass an optional custom binary format loader to the COPY command.
2) Lossless transmission in text mode through byte arrays.

1) You already seem to be working on that, but are other database providers
too?
2) I attached some code, but the COPY command refuses to accept custom
implicit typecasts. Correct? Adding dedicated casting columns seems to be
the accepted workaround.

This is not a problem that needs to be solved immediately, but in the long
run, a COPY command that does accept custom implicit typecasts would be
very much appreciated.

Kind regards,
Mischa.

Attachments:

hexcast.tar.xz (application/x-xz)

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael J. Baars (#1)
Re: Lossless transmission of double precision floating point

"Michael J. Baars" <mjbaars1977.pgsql@gmail.com> writes:

> I receive data from a data provider on a daily basis, and noticed how they
> use fixed type floating point in text mode, to transmit data. As you might
> know, this type of transmission is not lossless.

It is if you use enough decimal digits. Are you sure you're not
inventing something to worry about?
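
The "enough decimal digits" point can be checked directly. The sketch below is an illustration only, not code from the thread: it assumes IEEE 754 binary64 doubles, for which 17 significant decimal digits identify every finite value uniquely. The helper names `to_text` and `same_bits` are made up for this example.

```python
import math
import random
import struct

def to_text(x: float) -> str:
    # 17 significant digits are enough to identify any finite binary64 value.
    return format(x, ".17g")

def same_bits(a: float, b: float) -> bool:
    # Compare the raw 8-byte encodings so that -0.0 and 0.0 count as different.
    return struct.pack(">d", a) == struct.pack(">d", b)

# A few awkward values plus a batch of pseudo-random ones (fixed seed).
samples = [0.1, 1 / 3, math.pi, 5e-324, 1.7976931348623157e308, -0.0]
rng = random.Random(0)
samples += [rng.uniform(-1e6, 1e6) for _ in range(1000)]

# Text with 17 significant digits survives the trip back to a double.
round_trips = all(same_bits(float(to_text(x)), x) for x in samples)

# Fixed notation with a handful of decimals does not.
lossy = float(format(1 / 3, ".6f")) != 1 / 3

print(round_trips, lossy)
```

The loss in the second check comes from the digit count, not from text mode as such: a fixed number of decimals after the point throws away significant digits, most visibly for very small values.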

regards, tom lane

#3 Michael J. Baars
mjbaars1977.pgsql@gmail.com
In reply to: Tom Lane (#2)
Re: Lossless transmission of double precision floating point

On Fri, Jul 18, 2025 at 4:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "Michael J. Baars" <mjbaars1977.pgsql@gmail.com> writes:
>
>> I receive data from a data provider on a daily basis, and noticed how they
>> use fixed type floating point in text mode, to transmit data. As you might
>> know, this type of transmission is not lossless.
>
> It is if you use enough decimal digits. Are you sure you're not
> inventing something to worry about?

The graphs speak for themselves. They do have a problem with accuracy
in the lower digits. Transmitting doubles as byte arrays takes fewer
characters (especially for very small numbers) and is 100% accurate.
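
As a rough illustration of the byte-array idea (this is not the code in the hexcast attachment), a double can be sent as the hex text of its 8 IEEE 754 bytes. The function names and the big-endian byte order are assumptions of this sketch; both sides would have to agree on the byte order.

```python
import struct

def double_to_hex(x: float) -> str:
    # Pack as big-endian binary64, then render the 8 bytes as 16 hex characters.
    return struct.pack(">d", x).hex()

def hex_to_double(s: str) -> float:
    # Inverse: parse 16 hex characters back into the same 8 bytes.
    return struct.unpack(">d", bytes.fromhex(s))[0]

encoded = double_to_hex(1.0)
print(encoded)                    # 3ff0000000000000
print(hex_to_double(encoded))     # 1.0
print(len(double_to_hex(5e-324))) # 16
```

Every value takes exactly 16 characters in this encoding, whereas a round-trippable decimal rendering can need up to 17 significant digits plus sign, decimal point, and exponent.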

The current solution would be to combine the data stream and its
inverse into one, depending on the two numbers being above or below
1.0. But that is a pretty ugly client side solution.
