Non-deterministic behavior with floating point in parallel mode

Started by Ruben Buchatskiyalmost 9 years ago3 messages
#1Ruben Buchatskiy
ruben@ispras.ru

Hi hackers,

We have found that in parallel mode result of queries is non-deterministic
when the types of the attributes in table are double precision
(floating-point).

Our example is based on TPC-H, but some NUMERIC columns type was changed to
DOUBLE PRECISION;

When running without parallelism

tpch=# set max_parallel_workers_per_gather to 0;
SET
tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date
'1998-12-01' - interval '116 days';
sum
------------------
448157055361.319
(1 row)

output is always the same.

But in parallel mode

tpch=# set max_parallel_workers_per_gather to 1;
SET
tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date
'1998-12-01' - interval '116 days';
sum
------------------
448157055361.341
(1 row)

tpch=# select sum(l_extendedprice) from lineitem where l_shipdate <= date
'1998-12-01' - interval '116 days';
sum
-----------------
448157055361.348
(1 row)

result differs between runs.

That is because floating-point addition is not necessarily associative.
That is, (a + b) + c is not necessarily equal to a + (b + c).
In parallel mode, the order in which the attribute values are added
(summed) changes between runs, which leads to non-deterministic results.

Is this desirable behavior?

--

*Best Regards,**Ruben.* <ruben@ispras.ru>
ISP RAS.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ruben Buchatskiy (#1)
Re: Non-deterministic behavior with floating point in parallel mode

Ruben Buchatskiy <ruben@ispras.ru> writes:

We have found that in parallel mode result of queries is non-deterministic
when the types of the attributes in table are double precision
(floating-point).

Yeah ...

That is because floating-point addition is not necessarily associative.

Right, exactly.

Is this desirable behavior?

It's not especially the fault of parallelism. Any change in the order in
which the SUM visits the rows could cause a similar change in the results.
IOW, you are being overoptimistic about how deterministic this result
is any of the time.

Use numeric, not float, if you can't tolerate this sort of behavior.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Tom Lane (#2)
Re: Non-deterministic behavior with floating point in parallel mode

On Fri, Feb 3, 2017 at 3:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ruben Buchatskiy <ruben@ispras.ru> writes:

We have found that in parallel mode result of queries is non-deterministic
when the types of the attributes in table are double precision
(floating-point).

Yeah ...

That is because floating-point addition is not necessarily associative.

Right, exactly.

Is this desirable behavior?

It's not especially the fault of parallelism. Any change in the order in
which the SUM visits the rows could cause a similar change in the results.
IOW, you are being overoptimistic about how deterministic this result
is any of the time.

For example, I just did the following while also running the same
query in another session to provoke synchronize_seqscans (in a
REPEATABLE READ transaction for added absurdity):

tpch=# set max_parallel_workers_per_gather to 0;
SET

tpch=# select sum(l_extendedprice::double precision) from lineitem;
sum
------------------
229577310901.211
(1 row)

tpch=# select sum(l_extendedprice::double precision) from lineitem;
sum
------------------
229577310901.198
(1 row)

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers