Java : Postgres double precession issue with different data format text and binary

Started by Rahul Uniyalalmost 2 years ago5 messages
#1Rahul Uniyal
rahul.uniyal00@gmail.com

Hello Team,

Hope everyone is doing well here.

I am writing this email to understand an issue I'm facing when fetching data in our Java application. We are using PostgreSQL JDBC Driver version 42.6.0.

Issue:

We are encountering an issue where the double precision data type in PostgreSQL is giving some intermittent results when fetching data. For example, in the database the value is 40, but sometimes this value is fetched as 40.0. Similarly, for a value of 0.0005, it is being fetched as 0.00050, resulting in extra trailing zeros.

While debugging, it seems like this issue is caused by the different data formats, such as Text and Binary. There is some logic in the PgResultSet class that converts values based on this data format.

Example:

Below is an example where we are getting different data formats for the same table:

Text Format: [Field(id,FLOAT8,8,T), Field(client_id,FLOAT8,8,T), Field(create_ts,TIMESTAMP,8,T), ...]

Binary Format: [Field(id,FLOAT8,8,B), Field(client_id,FLOAT8,8,B), ...] (notice some format changes)

We are not sure why different formats are coming for the same table.

Schema:

Below is the schema for the table used:

SQL

CREATE TABLE IF NOT EXISTS SUBMISSION_QUEUE(
ID DOUBLE PRECISION,
CLIENT_ID DOUBLE PRECISION,
OCODE VARCHAR(20) NOT NULL,
PAYLOAD_TYPE VARCHAR(20),
REPOSITORY VARCHAR(16),
SUB_REPOSITORY VARCHAR(20),
FORCE_GENERATION_FLAG BOOLEAN,
IS_JMX_CALL BOOLEAN,
INSTANCE_ID DOUBLE PRECISION,
CREATE_TS TIMESTAMP(6) NOT NULL,
);

Request:

Team, would it be possible to give some insight on this issue? Any help would be greatly appreciated.

Thanks,

#2Chapman Flack
jcflack@acm.org
In reply to: Rahul Uniyal (#1)
Re: Java : Postgres double precession issue with different data format text and binary

Hi,

On 03/16/24 11:10, Rahul Uniyal wrote:

We are encountering an issue where the double precision data type
in PostgreSQL is giving some intermittent results when fetching data.
For example, in the database the value is 40, but sometimes this value
is fetched as 40.0. Similarly, for a value of 0.0005, it is being
fetched as 0.00050, resulting in extra trailing zeros.

As a first observation, the column names in your schema suggest that
these columns are being used as IDs of some kind, for which a float type
would be an unusual choice. Unless something in your situation requires it,
you might consider changing to integer types for IDs.

That said, you may have found something interesting in how JDBC handles
the float8 type in text vs. binary format, but comparing the results
of conversion to decimal string is not the most direct way to
investigate it.

It would be clearer to compare the raw bits of the values.

For example, with SELECT float8send(ID) FROM SUBMISSION_QUEUE,
you should see \x4044000000000000 if ID is 40, and you should see
\x3f40624dd2f1a9fc if ID is 0.0005.

Likewise, on the Java side,
Long.toHexString(Double.doubleToLongBits(id)) should also show you
4044000000000000 for the value 40, and 3f40624dd2f1a9fc for the
value 0.0005.

If you end up finding that the text/binary transmission format
sometimes causes the Java value not to have the same bits as the
PostgreSQL value, that information could be of interest on the
pgsql-jdbc list.

Regards,
Chapman Flack

#3Rahul Uniyal
rahul.uniyal00@gmail.com
In reply to: Chapman Flack (#2)
Re: Java : Postgres double precession issue with different data format text and binary

Hello Chapman,

Thanks for the reply and suggestion.

Below are my observations when i was debugging the code of postgres-jdbc driver for double precision data type.

1- When the value in DB is 40 and fetched value is also 40
A - In the QueryExecuterImpl class method - receiveFields() , we create Fields metadata

private Field[] receiveFields() throws IOException {
pgStream.receiveInteger4(); // MESSAGE SIZE
int size = pgStream.receiveInteger2();
Field[] fields = new Field[size];

if (LOGGER.isLoggable(Level.FINEST)) {
LOGGER.log(Level.FINEST, " <=BE RowDescription({0})", size);
}

for (int i = 0; i < fields.length; i++) {
String columnLabel = pgStream.receiveCanonicalString();
int tableOid = pgStream.receiveInteger4();
short positionInTable = (short) pgStream.receiveInteger2();
int typeOid = pgStream.receiveInteger4();
int typeLength = pgStream.receiveInteger2();
int typeModifier = pgStream.receiveInteger4();
int formatType = pgStream.receiveInteger2();
fields[i] = new Field(columnLabel,
typeOid, typeLength, typeModifier, tableOid, positionInTable);
fields[i].setFormat(formatType);

LOGGER.log(Level.FINEST, " {0}", fields[i]);
}

return fields;
}

Output of this method is - [Field(id,FLOAT8,8,T), Field(client_id,FLOAT8,8,T), Field(create_ts,TIMESTAMP,8,T), Field(force_generation_flag,VARCHAR,65535,T), Field(instance_id,FLOAT8,8,T), Field(is_jmx_call,VARCHAR,65535,T), Field(ocode,VARCHAR,65535,T), Field(payload_type,VARCHAR,65535,T), Field(repository,VARCHAR,65535,T), Field(sub_repository,VARCHAR,65535,T)]

B- Then in the class PgResultSet , it calls the method
public java.math.@Nullable BigDecimal getBigDecimal(@Positive int columnIndex) throws SQLException {
return getBigDecimal(columnIndex, -1);
}
and then it calls the method
@Pure
private @Nullable Number getNumeric(
int columnIndex, int scale, boolean allowNaN) throws SQLException {
byte[] value = getRawValue(columnIndex);
if (value == null) {
return null;
}

if (isBinary(columnIndex)) {
int sqlType = getSQLType(columnIndex);
if (sqlType != Types.NUMERIC && sqlType != Types.DECIMAL) {
Object obj = internalGetObject(columnIndex, fields[columnIndex - 1]);
if (obj == null) {
return null;
}
if (obj instanceof Long || obj instanceof Integer || obj instanceof Byte) {
BigDecimal res = BigDecimal.valueOf(((Number) obj).longValue());
res = scaleBigDecimal(res, scale);
return res;
}
return toBigDecimal(trimMoney(String.valueOf(obj)), scale);
} else {
Number num = ByteConverter.numeric(value);
if (allowNaN && Double.isNaN(num.doubleValue())) {
return Double.NaN;
}

return num;
}
}
Since the column format is text and not binary it converts the value to BigDecimal and give back the value as 40 .

2- When the value in DB is 40 and fetched value is 40.0 (trailing zero)
In this case the field metadata is -

[Field(id,FLOAT8,8,B), Field(client_id,FLOAT8,8,B), Field(ocode,VARCHAR,65535,T), Field(payload_type,VARCHAR,65535,T), Field(repository,VARCHAR,65535,T), Field(sub_repository,VARCHAR,65535,T), Field(force_generation_flag,VARCHAR,65535,T), Field(is_jmx_call,VARCHAR,65535,T), Field(instance_id,FLOAT8,8,B), Field(create_ts,TIMESTAMP,8,B)]

Now since the format is Binary Type hence in PgResultSet class and in Numeric method condition isBinary(columnIndex) is true.
and it returns DOUBLE from there result in 40.0

Now i am not sure for the same table and same column why we have two different format and this issue is intermittent.

Thanks,

Rahul

Show quoted text

On 19-Mar-2024, at 1:02 AM, Rahul Uniyal <rahul.uniyal00@gmail.com> wrote:

#4Chapman Flack
jcflack@acm.org
In reply to: Rahul Uniyal (#3)
Re: Java : Postgres double precession issue with different data format text and binary

Hi Rahul,

On 03/18/24 15:52, Rahul Uniyal wrote:

Since the column format is text and not binary it converts the value
to BigDecimal and give back the value as 40 .
...
Now since the format is Binary ... it returns DOUBLE from there
result in 40.0

Now i am not sure for the same table and same column why we have two
different format and this issue is intermittent.

I don't see, in this message or your earlier one, which public
ResultSet API method your Java client code is calling. It sounds as if
you are simply calling getObject, the flavor without a second parameter
narrowing the type, and you are finding that the object returned is
sometimes of class Double and sometimes of class BigDecimal. Is that
accurate? That would seem to be the nub of the issue.

You seem to have found that the class of the returned object is
influenced by whether text or binary format was used on the wire.
I will guess that would be worth reporting to the PGJDBC devs, using
the pgsql-jdbc list.

The question of why the driver might sometimes use one wire format
and sometimes the other seems secondary. There may be some technical
explanation, but it would not be very interesting except as an
implementation detail, if it did not have this visible effect of
changing the returned object's class.

For the time being, I assume that if your Java code calls a more
specific method, such as getObject(..., BigDecimal.class) or
getObject(..., Double.class), or simply getDouble, you will get
results of the desired class whatever wire format is used.

The issue of the wire format influencing what class of object
getObject returns (when a specific class hasn't been requested)
is probably worth raising on pgsql-jdbc.

Regards,
-Chap

#5Chapman Flack
jcflack@acm.org
In reply to: Rahul Uniyal (#3)
Re: Java : Postgres double precession issue with different data format text and binary

On 03/18/24 23:05, Rahul Uniyal wrote:

This is the public method of PgResultSet class .
Apart from which format type it is ,
getBigDecimal public method will get call

That is interesting. That's why it's important to include the basics
when making a report, including which public API method your code
is calling.

As you call getBigDecimal, you always will get a BigDecimal (or an
exception). But, as you have shown, the implementation detail of
the on-the-wire format influences how the BigDecimal is constructed.

When the wire format is text, you get a BigDecimal constructed
directly from the on-the-wire string value.

When the wire format is binary (and the type is not NUMERIC or
DECIMAL), there first is an object constructed by internalGetObject,
and (if that is not an instance of Long, Integer, or Byte), it gets
converted to a BigDecimal this way:

return toBigDecimal(trimMoney(String.valueOf(obj)), scale);

Assuming the object returned by internalGetObject is a Double, the
reason for the divergent results is the String.valueOf in that
conversion. BigDecimal and Double have different conventions for
how they are rendered as a String; the representation of a Double
will always have a decimal place:

jshell> new BigDecimal("40")
$1 ==> 40
jshell> Double.valueOf("40")
$2 ==> 40.0

If you construct a BigDecimal directly from the Double, you get
the one you expect:

jshell> new BigDecimal($2)
$3 ==> 40

But if you take the String value of the Double first, the extra
decimal place shown in the Double's String value results in a
BigDecimal with scale of 1 rather than 0:

jshell> new BigDecimal(String.valueOf($2))
$4 ==> 40.0

jshell> $3.scale(); $4.scale()
$5 ==> 0
$6 ==> 1

This is very probably worth reporting on pgsql-jdbc.

I also wonder why the trimMoney() is there. The method
seems to exist to deal with a string that could have $
in it, or parentheses instead of a negative sign, but
I am unsure what kind of object could be obtained from
internalGetObject that would require such treatment.

Unrelated to your case, I wonder also about the handling
of NUMERIC or DECIMAL when the value is NaN. The
ByteConverter.numeric method is declared to return Number,
and documented to return either a BigDecimal or Double.NaN.
Likewise, PgResultSet.getNumeric can return Double.NaN
in that case. But getBigDecimal contains an unconditional
cast to BigDecimal, which seems like a ClassCastException
waiting to happen for the NaN case. (getBigDecimal clearly
can't return Double.NaN, but maybe a more-specific exception
would be more helpful than "Double cannot be cast to BigInteger"?)

I've added pgsql-jdbc to the To: for this message, but it will
probably be delayed somewhat in moderation, as I am not
subscribed to that list.

Regards,
-Chap