BUG #15307: Low numerical precision of (Co-) Variance

Started by PG Bug reporting formover 7 years ago12 messages
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 15307
Logged by: Erich Schubert
Email address: erich@debian.org
PostgreSQL version: 9.6.6
Operating system: Linux
Description:

Numerical precision of variance computations in PostgreSQL is too low.

SELECT VAR_SAMP(x::float8), COVAR_SAMP(x, x), VAR_SAMP(x)
FROM (SELECT 1000000.01 as x UNION SELECT 999999.99 as x) AS x

The first two give the low-precision answer 0.000244140625 instead of
0.0002. Interestingly enough, VAR_SAMP(x) is okay - I guess that postgres
may autodetect a fixed-precision decimal here, or use some high precision
code.

If you add just another digit (10 million +- 1 cent), the bad functions even
return a variance of 0:

SELECT VAR_SAMP(x::float8), COVAR_SAMP(x, x), VAR_SAMP(x)
FROM (SELECT 10000000.01 as x UNION SELECT 9999999.99 as x) AS x

The reason is catastrophic cancellation. Apparently, the covariance is
computed using the E[XY]-E[X]E[Y] approach, which suffers from low numerical
precision.

While I report this against version 9.6.6 (since I used sqlfiddle), this
clearly is present still in Git:

https://github.com/postgres/postgres/blob/6bf0bc842bd75877e31727eb559c6a69e237f831/src/backend/utils/adt/float.c#L2606
https://github.com/postgres/postgres/blob/6bf0bc842bd75877e31727eb559c6a69e237f831/src/backend/utils/adt/float.c#L2969

it should also affect the general "numeric" version (i.e. all versions of
variance/covariance/standard deviation use the known-bad equation), but for
integers it usually will be fine as long as the sum-of-squares can be
stored:
https://github.com/postgres/postgres/blob/6bf0bc842bd75877e31727eb559c6a69e237f831/src/backend/utils/adt/numeric.c#L4883

There are a number of methods to get a reasonable precision. The simplest to
implement is a two pass approach: first compute the mean of each variable,
then compute E[(X-meanX)*(Y-meanY)] in a second pass. This will usually give
the best precision. Faster (single-pass) methods can be found in literature
from the 1970s, and also Donald Knuth's "The Art of Computer Programming".
In particular Young and Cramer's version performed quite well (and
surprisingly much better than Welford's; supposedly due to CPU pipelining).
Single-pass and parallelization-friendly approaches (interesting to use in
particular in distributed databases, but also to avoid IO cost) with good
accuracy are studied in:

Erich Schubert, and Michael Gertz. Numerically Stable Parallel Computation

of (Co-)Variance. In: Proceedings of the 30th International Conference on
Scientific and Statistical Database Management (SSDBM), Bolzano-Bozen,
Italy. 2018, 10:1–10:12. DOI: 10.1145/3221269.3223036.
https://dl.acm.org/citation.cfm?doid=3221269.3223036

The problem has also been observed in other SQL databases (MS - others like
MySQL have implemented a numeric stable single-pass approach), see e.g.:

Kamat, Niranjan, and Arnab Nandi. "A Closer Look at Variance

Implementations In Modern Database Systems." ACM SIGMOD Record 45.4 (2017):
28-33. DOI: 10.1145/3092931.3092936.
https://dl.acm.org/citation.cfm?id=3092936

Sorry, I do not know the PostgreSQL internals (aggregation...) well enough
to provide a patch.

#2Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #15307: Low numerical precision of (Co-) Variance

On 1 August 2018 at 14:35, PG Bug reporting form <noreply@postgresql.org> wrote:

Numerical precision of variance computations in PostgreSQL is too low.

SELECT VAR_SAMP(x::float8), COVAR_SAMP(x, x), VAR_SAMP(x)
FROM (SELECT 1000000.01 as x UNION SELECT 999999.99 as x) AS x

The first two give the low-precision answer 0.000244140625 instead of
0.0002. Interestingly enough, VAR_SAMP(x) is okay ...

For a number of those statistical aggregates, PostgreSQL provides 2
implementations -- one implemented using double precision floating
point arithmetic, which is much faster, but necessarily less accurate
and possibly platform-dependent; and one implemented using the
arbitrary precision numeric datatype, which will return much more
accurate results. For any input datatypes other than floating point,
you will automatically get the latter, which is what you're seeing
with var_samp(x), when you're not explicitly casting the input to
float8.

However, all-the 2-argument aggregates such as corr() and
covar_pop/samp() currently only have floating point implementations,
and suffer from the problem you report, which I agree, is not great.
If we can easily improve the accuracy of those aggregates, then I
think it is worthwhile.

Using a two pass approach isn't really an option, given the way that
aggregates work in PostgreSQL, however, implementing Welford's
algorithm looks to be quite straightforward. I had a quick play and I
found that it fixed the accuracy problem with no noticeable
performance penalty -- there are a few extra cycles in the accumulator
functions, but compared to the overall per-tuple overhead, that
appears to be negligible.

I'll post something shortly.

Regards,
Dean

#3Erich Schubert
erich@debian.org
In reply to: Dean Rasheed (#2)
Re: BUG #15307: Low numerical precision of (Co-) Variance

Hi,

For a number of those statistical aggregates, PostgreSQL provides 2
implementations -- one implemented using double precision floating
point arithmetic, which is much faster, but necessarily less accurate
and possibly platform-dependent; and one implemented using the
arbitrary precision numeric datatype, which will return much more
accurate results. For any input datatypes other than floating point,
you will automatically get the latter, which is what you're seeing
with var_samp(x), when you're not explicitly casting the input to
float8.

Ok, I am surprised that this is possible to do with arbitrary precision
here, as for example with 8 data points, it will eventually involve a
division by 7, which cannot be represented even with arbitrary (finite)
precision floating point. But maybe just this division comes late enough
(after the difference) that it just works before being converted to a
float for the division.

However, all-the 2-argument aggregates such as corr() and
covar_pop/samp() currently only have floating point implementations,
and suffer from the problem you report, which I agree, is not great.
If we can easily improve the accuracy of those aggregates, then I
think it is worthwhile.

Using a two pass approach isn't really an option, given the way that
aggregates work in PostgreSQL, however, implementing Welford's
algorithm looks to be quite straightforward. I had a quick play and I
found that it fixed the accuracy problem with no noticeable
performance penalty -- there are a few extra cycles in the accumulator
functions, but compared to the overall per-tuple overhead, that
appears to be negligible.

Instead of Welford, I recommend to use the Youngs-Cramer approach. It is
almost the same; roughly it is aggregating sum-X and sum((X-meanX)²)
directly (whereas Welford updates mean-X, and sum((X-meanX)^2)). So the
first aggregate is actually unchanged to the current approach.
In my experiments, this was surprisingly much faster when the data was a
double[]; explainable by CPU pipelining (see the Figure in the paper I
linked - the division comes late, and the next step does not depend on
the division, so pipelining can already process the next double without
waiting for the division to finish).

Youngs & Cramer original publication:
https://www.jstor.org/stable/pdf/1267176.pdf

The (non-parallel) implementation of the classic algorithms used in the
SSDBM 2018 paper are (templated only for aggregate precision; no guard
against len=0):

template <typename T = double>
double youngs_cramer_var(const double* const data, const size_t len) {
    T sum = data[0], var = 0;
    for(size_t i=1; i<len;) {
        const size_t oldi = i;
        const double x = data[i++];
        sum += x;
        double tmp = i * x - sum;
        var += tmp * tmp / (i * (double) oldi);
    }
    return var / len;
}

Close to Welfords original article:

template <typename T = double>
double welford_var(const double* const data, const size_t len) {
    T mean = data[0], var = 0;
    for(size_t i=1; i<len; ) {
        const size_t oldi = i;
        const double x = data[i++];
        const double delta = x - mean;
        mean = mean * oldi / (double) i + x / i;
        var += oldi / (double) i * delta * delta;
    }
    return var / len;
}

From Knuth's book, attributed to Welford (but usuing fewer divisons),
likely the most widely known incremental version:

template <typename T = double>
double knuth_var(const double* const data, const size_t len) {
    T mean = data[0], xx = 0;
    for(size_t i=1; i<len;) {
        const double v = data[i++];
        const double delta = v - mean;
        mean += delta / i;
        xx += delta * (v - mean);
    }
    return xx / len;
}

In both Welford and Knuth, the "mean" aggregate depends on the division;
with YC it does not, so I recommend YC.

If Postgres could use parallel aggregation, let me know. I can assist
with the proper aggregation of several partitions (both distributed, or
multithreaded). Using multiple partitions can also slightly improve
precision; but it is mostly interesting to process multiple chunks in
parallel.
If I have some spare time to clean it up some more, I intend to
open-source the entire experimental code from SSDBM18 for reproducibility.

Regards,
Erich

#4Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Erich Schubert (#3)
2 attachment(s)
Re: BUG #15307: Low numerical precision of (Co-) Variance

On 7 August 2018 at 12:58, Erich Schubert <erich@debian.org> wrote:

I am surprised that this is possible to do with arbitrary precision
here, as for example with 8 data points, it will eventually involve a
division by 7, which cannot be represented even with arbitrary (finite)
precision floating point. But maybe just this division comes late enough
(after the difference) that it just works before being converted to a float
for the division.

Right, the division comes at the end in that case.

Instead of Welford, I recommend to use the Youngs-Cramer approach. It is
almost the same; roughly it is aggregating sum-X and sum((X-meanX)²)
directly (whereas Welford updates mean-X, and sum((X-meanX)^2)). So the
first aggregate is actually unchanged to the current approach.
In my experiments, this was surprisingly much faster when the data was a
double[]; explainable by CPU pipelining (see the Figure in the paper I
linked - the division comes late, and the next step does not depend on the
division, so pipelining can already process the next double without waiting
for the division to finish).

Youngs & Cramer original publication:
https://www.jstor.org/stable/pdf/1267176.pdf

OK, thanks for the reference. That was very interesting.

It was actually the Knuth variant of the Welford algorithm that I was
playing with, but I have now tried out the Youngs-Cramer algorithm as
well.

In terms of performance the YC algorithm was maybe 2-3% slower than
Welford, which was roughly on a par with the current schoolbook
algorithm in PostgreSQL, but there was some variability in those
results, depending on the exact query in question. One thing to bear
in mind is that the kind of micro-optimisations that will make a huge
difference when processing in-memory data in tight loops as in your
examples are not nearly as important in the PostgreSQL environment
where there is a significant per-tuple overhead in fetching data,
depending on the enclosing SQL query. So the odd cycle here and there
from a division isn't going to matter much in this context.

Both algorithms eliminate the main loss-of-precision problem that the
current schoolbook algorithm suffers from, which I think has to be the
primary aim of this. The YC paper suggests that the results from their
algorithm might be slightly more accurate (although maybe not enough
that we really care). What I do like about the YC algorithm is that it
guarantees the same result from avg(), whereas the Welford algorithm
will change avg() slightly, possibly making it very slightly less
accurate.

So I concur, the YC algorithm is probably preferable. For the record,
attached are both versions that I tried.

If Postgres could use parallel aggregation, let me know. I can assist with
the proper aggregation of several partitions (both distributed, or
multithreaded). Using multiple partitions can also slightly improve
precision; but it is mostly interesting to process multiple chunks in
parallel.

PostgreSQL isn't multithreaded, but it will parallelise large
aggregates by distributing the computation over a small number of
worker backend processes, and then combining the results. I did a
quick back-of-the-envelope calculation to see how to combine the
results, and in testing it seemed to work correctly, but it would be
worth having a second pair of eyes to validate that.

I'll add this to the next commitfest for consideration.

Regards,
Dean

Attachments:

implement-float-aggs-using-welford.patchtext/x-patch; charset=US-ASCII; name=implement-float-aggs-using-welford.patchDownload
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
new file mode 100644
index df35557..08e38e3
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -2401,8 +2401,21 @@ setseed(PG_FUNCTION_ARGS)
  *		float8_stddev_samp	- produce final result for float STDDEV_SAMP()
  *		float8_stddev_pop	- produce final result for float STDDEV_POP()
  *
+ * The naive schoolbook implementation of these aggregates works by
+ * accumulating sum(X) and sum(X^2).  However, this approach suffers from
+ * large rounding errors in the final computation of quantities like the
+ * population variance (N*sum(X^2) - sum(X)^2) / N^2, since each of the
+ * intermediate terms is potentially very large, while the difference is often
+ * quite small.
+ *
+ * Instead we use Welford's algorithm which works by accumulating Mx=Sum(X)/N
+ * and Sxx=sum((X-Mx)^2), using a numerically stable algorithm to
+ * incrementally update those quantities.  The final computations of each of
+ * the aggregate values is then trivial and gives more accurate results (for
+ * example, the population variance is just Sxx/N).
+ *
  * The transition datatype for all these aggregates is a 3-element array
- * of float8, holding the values N, sum(X), sum(X*X) in that order.
+ * of float8, holding the values N, Mx, Sxx in that order.
  *
  * Note that we represent N as a float to avoid having to build a special
  * datatype.  Given a reasonable floating-point implementation, there should
@@ -2441,6 +2454,15 @@ float8_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
+	float8		N1,
+				Mx1,
+				Sxx1,
+				N2,
+				Mx2,
+				Sxx2,
+				N,
+				Mx,
+				Sxx;
 
 	if (!AggCheckCallContext(fcinfo, NULL))
 		elog(ERROR, "aggregate function called in non-aggregate context");
@@ -2448,9 +2470,51 @@ float8_combine(PG_FUNCTION_ARGS)
 	transvalues1 = check_float8_array(transarray1, "float8_combine", 3);
 	transvalues2 = check_float8_array(transarray2, "float8_combine", 3);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
+	N1 = transvalues1[0];
+	Mx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
+
+	N2 = transvalues2[0];
+	Mx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+
+	/*--------------------
+	 * The transition values combine using a generalization of Welford's
+	 * algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Mx = (N1 * Mx1 + N2 * Mx2) / N
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Mx1 - Mx2)^2 / N
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Mx = Mx2;
+		Sxx = Sxx2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Mx = Mx1;
+		Sxx = Sxx1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Mx = (N1 * Mx1 + N2 * Mx2) / N;
+		check_float8_val(Mx, isinf(Mx1) || isinf(Mx2), true);
+		Sxx = Sxx1 + Sxx2 + (N1 * N2 * (Mx1 - Mx2) * (Mx1 - Mx2)) / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+	}
+
+	transvalues1[0] = N;
+	transvalues1[1] = Mx;
+	transvalues1[2] = Sxx;
 
 	PG_RETURN_ARRAYTYPE_P(transarray1);
 }
@@ -2461,8 +2525,28 @@ float8_accum(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8		newval = PG_GETARG_FLOAT8(1);
 	float8	   *transvalues;
+	float8		N,
+				Mx,
+				Sxx,
+				d1,
+				d2;
 
 	transvalues = check_float8_array(transarray, "float8_accum", 3);
+	N = transvalues[0];
+	Mx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use Welford's algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	d1 = newval - Mx;
+	Mx += d1 / N;
+	check_float8_val(Mx, isinf(transvalues[1]) || isinf(newval), true);
+	d2 = newval - Mx;
+	Sxx += d1 * d2;
+	check_float8_val(Sxx, isinf(transvalues[2]) || isinf(newval), true);
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2471,9 +2555,9 @@ float8_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Mx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2482,9 +2566,9 @@ float8_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Mx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2502,8 +2586,28 @@ float4_accum(PG_FUNCTION_ARGS)
 	/* do computations as float8 */
 	float8		newval = PG_GETARG_FLOAT4(1);
 	float8	   *transvalues;
+	float8		N,
+				Mx,
+				Sxx,
+				d1,
+				d2;
 
 	transvalues = check_float8_array(transarray, "float4_accum", 3);
+	N = transvalues[0];
+	Mx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use Welford's algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	d1 = newval - Mx;
+	Mx += d1 / N;
+	check_float8_val(Mx, isinf(transvalues[1]) || isinf(newval), true);
+	d2 = newval - Mx;
+	Sxx += d1 * d2;
+	check_float8_val(Sxx, isinf(transvalues[2]) || isinf(newval), true);
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2512,9 +2616,9 @@ float4_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Mx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2523,9 +2627,9 @@ float4_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Mx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2541,18 +2645,18 @@ float8_avg(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Mx;
 
 	transvalues = check_float8_array(transarray, "float8_avg", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	/* ignore sumX2 */
+	Mx = transvalues[1];
+	/* ignore Sxx */
 
 	/* SQL defines AVG of no values to be NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Mx);
 }
 
 Datum
@@ -2561,27 +2665,22 @@ float8_var_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Mx */
+	Sxx = transvalues[2];
 
 	/* Population variance is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxx / N);
 }
 
 Datum
@@ -2590,27 +2689,22 @@ float8_var_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Mx */
+	Sxx = transvalues[2];
 
 	/* Sample variance is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxx / (N - 1.0));
 }
 
 Datum
@@ -2619,27 +2713,22 @@ float8_stddev_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Mx */
+	Sxx = transvalues[2];
 
 	/* Population stddev is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * N)));
+	PG_RETURN_FLOAT8(sqrt(Sxx / N));
 }
 
 Datum
@@ -2648,27 +2737,22 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Mx */
+	Sxx = transvalues[2];
 
 	/* Sample stddev is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * (N - 1.0))));
+	PG_RETURN_FLOAT8(sqrt(Sxx / (N - 1.0)));
 }
 
 /*
@@ -2676,9 +2760,14 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
  *		SQL2003 BINARY AGGREGATES
  *		=========================
  *
+ * As with the preceding aggregates, we use Welford's algorithm to reduce
+ * rounding errors in the aggregate final functions.
+ *
  * The transition datatype for all these aggregates is a 6-element array of
- * float8, holding the values N, sum(X), sum(X*X), sum(Y), sum(Y*Y), sum(X*Y)
- * in that order.  Note that Y is the first argument to the aggregates!
+ * float8, holding the values N, Mx=sum(X)/N, Sxx=sum((X-Mx)^2), My=sum(Y)/N,
+ * Syy=sum((Y-My)^2), Sxy=sum((X-Mx)*(Y-My)) in that order.
+ *
+ * Note that Y is the first argument to all these aggregates!
  *
  * It might seem attractive to optimize this by having multiple accumulator
  * functions that only calculate the sums actually needed.  But on most
@@ -2695,31 +2784,43 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	float8		newvalX = PG_GETARG_FLOAT8(2);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY;
+				Mx,
+				Sxx,
+				My,
+				Syy,
+				Sxy,
+				d1X,
+				d2X,
+				d1Y,
+				d2Y;
 
 	transvalues = check_float8_array(transarray, "float8_regr_accum", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Mx = transvalues[1];
+	Sxx = transvalues[2];
+	My = transvalues[3];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
+	/*
+	 * Use Welford's algorithm to incorporate the new values into the
+	 * transition values.
+	 */
 	N += 1.0;
-	sumX += newvalX;
-	check_float8_val(sumX, isinf(transvalues[1]) || isinf(newvalX), true);
-	sumX2 += newvalX * newvalX;
-	check_float8_val(sumX2, isinf(transvalues[2]) || isinf(newvalX), true);
-	sumY += newvalY;
-	check_float8_val(sumY, isinf(transvalues[3]) || isinf(newvalY), true);
-	sumY2 += newvalY * newvalY;
-	check_float8_val(sumY2, isinf(transvalues[4]) || isinf(newvalY), true);
-	sumXY += newvalX * newvalY;
-	check_float8_val(sumXY, isinf(transvalues[5]) || isinf(newvalX) ||
+	d1X = newvalX - Mx;
+	Mx += d1X / N;
+	check_float8_val(Mx, isinf(transvalues[1]) || isinf(newvalX), true);
+	d2X = newvalX - Mx;
+	Sxx += d1X * d2X;
+	check_float8_val(Sxx, isinf(transvalues[3]) || isinf(newvalX), true);
+	d1Y = newvalY - My;
+	My += d1Y / N;
+	check_float8_val(My, isinf(transvalues[2]) || isinf(newvalY), true);
+	d2Y = newvalY - My;
+	Syy += d1Y * d2Y;
+	check_float8_val(Syy, isinf(transvalues[4]) || isinf(newvalY), true);
+	Sxy += d1X * d2Y;
+	check_float8_val(Sxy, isinf(transvalues[5]) || isinf(newvalX) ||
 					 isinf(newvalY), true);
 
 	/*
@@ -2730,11 +2831,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues[0] = N;
-		transvalues[1] = sumX;
-		transvalues[2] = sumX2;
-		transvalues[3] = sumY;
-		transvalues[4] = sumY2;
-		transvalues[5] = sumXY;
+		transvalues[1] = Mx;
+		transvalues[2] = Sxx;
+		transvalues[3] = My;
+		transvalues[4] = Syy;
+		transvalues[5] = Sxy;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2744,11 +2845,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 		ArrayType  *result;
 
 		transdatums[0] = Float8GetDatumFast(N);
-		transdatums[1] = Float8GetDatumFast(sumX);
-		transdatums[2] = Float8GetDatumFast(sumX2);
-		transdatums[3] = Float8GetDatumFast(sumY);
-		transdatums[4] = Float8GetDatumFast(sumY2);
-		transdatums[5] = Float8GetDatumFast(sumXY);
+		transdatums[1] = Float8GetDatumFast(Mx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+		transdatums[3] = Float8GetDatumFast(My);
+		transdatums[4] = Float8GetDatumFast(Syy);
+		transdatums[5] = Float8GetDatumFast(Sxy);
 
 		result = construct_array(transdatums, 6,
 								 FLOAT8OID,
@@ -2773,6 +2874,24 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
+	float8		N1,
+				Mx1,
+				Sxx1,
+				My1,
+				Syy1,
+				Sxy1,
+				N2,
+				Mx2,
+				Sxx2,
+				My2,
+				Syy2,
+				Sxy2,
+				N,
+				Mx,
+				Sxx,
+				My,
+				Syy,
+				Sxy;
 
 	if (!AggCheckCallContext(fcinfo, NULL))
 		elog(ERROR, "aggregate function called in non-aggregate context");
@@ -2780,12 +2899,75 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 	transvalues1 = check_float8_array(transarray1, "float8_regr_combine", 6);
 	transvalues2 = check_float8_array(transarray2, "float8_regr_combine", 6);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
-	transvalues1[3] = float8_pl(transvalues1[3], transvalues2[3]);
-	transvalues1[4] = float8_pl(transvalues1[4], transvalues2[4]);
-	transvalues1[5] = float8_pl(transvalues1[5], transvalues2[5]);
+	N1 = transvalues1[0];
+	Mx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
+	My1 = transvalues1[3];
+	Syy1 = transvalues1[4];
+	Sxy1 = transvalues1[5];
+
+	N2 = transvalues2[0];
+	Mx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+	My2 = transvalues2[3];
+	Syy2 = transvalues2[4];
+	Sxy2 = transvalues2[5];
+
+	/*--------------------
+	 * The transition values combine using a generalization of Welford's
+	 * algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Mx = (N1 * Mx1 + N2 * Mx2) / N
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Mx1 - Mx2)^2 / N
+	 *	My = (N1 * My1 + N2 * My2) / N
+	 *	Syy = Syy1 + Syy2 + N1 * N2 * (My1 - My2)^2 / N
+	 *	Sxy = Sxy1 + Sxy2 + N1 * N2 * (Mx1 - Mx2) * (My1 - My2) / N
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Mx = Mx2;
+		Sxx = Sxx2;
+		My = My2;
+		Syy = Syy2;
+		Sxy = Sxy2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Mx = Mx1;
+		Sxx = Sxx1;
+		My = My1;
+		Syy = Syy1;
+		Sxy = Sxy1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Mx = (N1 * Mx1 + N2 * Mx2) / N;
+		check_float8_val(Mx, isinf(Mx1) || isinf(Mx2), true);
+		Sxx = Sxx1 + Sxx2 + (N1 * N2 * (Mx1 - Mx2) * (Mx1 - Mx2)) / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		My = (N1 * My1 + N2 * My2) / N;
+		check_float8_val(My, isinf(My1) || isinf(My2), true);
+		Syy = Syy1 + Syy2 + (N1 * N2 * (My1 - My2) * (My1 - My2)) / N;
+		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		Sxy = Sxy1 + Sxy2 + (N1 * N2 * (Mx1 - Mx2) * (My1 - My2)) / N;
+		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+	}
+
+	transvalues1[0] = N;
+	transvalues1[1] = Mx;
+	transvalues1[2] = Sxx;
+	transvalues1[3] = My;
+	transvalues1[4] = Syy;
+	transvalues1[5] = Sxy;
 
 	PG_RETURN_ARRAYTYPE_P(transarray1);
 }
@@ -2797,27 +2979,21 @@ float8_regr_sxx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	Sxx = transvalues[2];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative result */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxx);
 }
 
 Datum
@@ -2826,27 +3002,21 @@ float8_regr_syy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY,
-				sumY2,
-				numerator;
+				Syy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_syy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
+	Syy = transvalues[4];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumY2 - sumY * sumY;
-	check_float8_val(numerator, isinf(sumY2) || isinf(sumY), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative result */
+	if (Syy <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Syy);
 }
 
 Datum
@@ -2855,28 +3025,19 @@ float8_regr_sxy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxy", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
 	/* A negative result is valid here */
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxy);
 }
 
 Datum
@@ -2885,17 +3046,17 @@ float8_regr_avgx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Mx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
+	Mx = transvalues[1];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Mx);
 }
 
 Datum
@@ -2904,17 +3065,17 @@ float8_regr_avgy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY;
+				My;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
+	My = transvalues[3];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumY / N);
+	PG_RETURN_FLOAT8(My);
 }
 
 Datum
@@ -2923,26 +3084,17 @@ float8_covar_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_pop", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxy / N);
 }
 
 Datum
@@ -2951,26 +3103,17 @@ float8_covar_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_samp", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is <= 1 we should return NULL */
 	if (N < 2.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxy / (N - 1.0));
 }
 
 Datum
@@ -2979,38 +3122,24 @@ float8_corr(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_corr", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0 || numeratorY <= 0)
+	if (Sxx <= 0 || Syy <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / sqrt(numeratorX * numeratorY));
+	PG_RETURN_FLOAT8(Sxy / sqrt(Sxx * Syy));
 }
 
 Datum
@@ -3019,42 +3148,27 @@ float8_regr_r2(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_r2", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 	/* per spec, horizontal line produces 1.0 */
-	if (numeratorY <= 0)
+	if (Syy <= 0)
 		PG_RETURN_FLOAT8(1.0);
 
-	PG_RETURN_FLOAT8((numeratorXY * numeratorXY) /
-					 (numeratorX * numeratorY));
+	PG_RETURN_FLOAT8((Sxy * Sxy) / (Sxx * Syy));
 }
 
 Datum
@@ -3063,33 +3177,22 @@ float8_regr_slope(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXY;
+				Sxx,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_slope", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / numeratorX);
+	PG_RETURN_FLOAT8(Sxy / Sxx);
 }
 
 Datum
@@ -3098,33 +3201,26 @@ float8_regr_intercept(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXXY;
+				Mx,
+				Sxx,
+				My,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_intercept", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Mx = transvalues[1];
+	Sxx = transvalues[2];
+	My = transvalues[3];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXXY = sumY * sumX2 - sumX * sumXY;
-	check_float8_val(numeratorXXY, isinf(sumY) || isinf(sumX2) ||
-					 isinf(sumX) || isinf(sumXY), true);
-	if (numeratorX <= 0)
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXXY / numeratorX);
+	PG_RETURN_FLOAT8(My - Mx * Sxy / Sxx);
 }
 
 
implement-float-aggs-using-youngs-cramer.patchtext/x-patch; charset=US-ASCII; name=implement-float-aggs-using-youngs-cramer.patchDownload
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
new file mode 100644
index df35557..ebb2d5c
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -2401,13 +2401,29 @@ setseed(PG_FUNCTION_ARGS)
  *		float8_stddev_samp	- produce final result for float STDDEV_SAMP()
  *		float8_stddev_pop	- produce final result for float STDDEV_POP()
  *
+ * The naive schoolbook implementation of these aggregates works by
+ * accumulating sum(X) and sum(X^2).  However, this approach suffers from
+ * large rounding errors in the final computation of quantities like the
+ * population variance (N*sum(X^2) - sum(X)^2) / N^2, since each of the
+ * intermediate terms is potentially very large, while the difference is often
+ * quite small.
+ *
+ * Instead we use the Youngs-Cramer algorithm [1] which works by accumulating
+ * Sx=sum(X) and Sxx=sum((X-Sx/N)^2), using a numerically stable algorithm to
+ * incrementally update those quantities.  The final computations of each of
+ * the aggregate values is then trivial and gives more accurate results (for
+ * example, the population variance is just Sxx/N).
+ *
  * The transition datatype for all these aggregates is a 3-element array
- * of float8, holding the values N, sum(X), sum(X*X) in that order.
+ * of float8, holding the values N, Sx, Sxx in that order.
  *
  * Note that we represent N as a float to avoid having to build a special
  * datatype.  Given a reasonable floating-point implementation, there should
  * be no accuracy loss unless N exceeds 2 ^ 52 or so (by which time the
  * user will have doubtless lost interest anyway...)
+ *
+ * [1] Some Results Relevant to Choice of Sum and Sum-of-Product Algorithms,
+ * E. A. Youngs and E. M. Cramer, Technometrics Vol 13, No 3, August 1971.
  */
 
 static float8 *
@@ -2441,6 +2457,16 @@ float8_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
+	float8		N1,
+				Sx1,
+				Sxx1,
+				N2,
+				Sx2,
+				Sxx2,
+				tmp,
+				N,
+				Sx,
+				Sxx;
 
 	if (!AggCheckCallContext(fcinfo, NULL))
 		elog(ERROR, "aggregate function called in non-aggregate context");
@@ -2448,9 +2474,51 @@ float8_combine(PG_FUNCTION_ARGS)
 	transvalues1 = check_float8_array(transarray1, "float8_combine", 3);
 	transvalues2 = check_float8_array(transarray2, "float8_combine", 3);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
+	N1 = transvalues1[0];
+	Sx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
+
+	N2 = transvalues2[0];
+	Sx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+
+	/*--------------------
+	 * The transition values combine using a generalization of the
+	 * Youngs-Cramer algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Sx = Sx1 + Sx2
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Sx1/N1 - Sx2/N2)^2 / N;
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Sx = Sx2;
+		Sxx = Sxx2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Sx = Sx1;
+		Sxx = Sxx1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Sx = float8_pl(Sx1, Sx2);
+		tmp = Sx1 / N1 - Sx2 / N2;
+		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+	}
+
+	transvalues1[0] = N;
+	transvalues1[1] = Sx;
+	transvalues1[2] = Sxx;
 
 	PG_RETURN_ARRAYTYPE_P(transarray1);
 }
@@ -2461,8 +2529,28 @@ float8_accum(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8		newval = PG_GETARG_FLOAT8(1);
 	float8	   *transvalues;
+	float8		N,
+				Sx,
+				Sxx,
+				tmp;
 
 	transvalues = check_float8_array(transarray, "float8_accum", 3);
+	N = transvalues[0];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	Sx = float8_pl(Sx, newval);
+	if (transvalues[0] > 0.0)
+	{
+		tmp = newval * N - Sx;
+		Sxx += tmp * tmp / (N * transvalues[0]);
+		check_float8_val(Sxx, isinf(transvalues[2]) || isinf(newval), true);
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2471,9 +2559,9 @@ float8_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2482,9 +2570,9 @@ float8_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2502,8 +2590,28 @@ float4_accum(PG_FUNCTION_ARGS)
 	/* do computations as float8 */
 	float8		newval = PG_GETARG_FLOAT4(1);
 	float8	   *transvalues;
+	float8		N,
+				Sx,
+				Sxx,
+				tmp;
 
 	transvalues = check_float8_array(transarray, "float4_accum", 3);
+	N = transvalues[0];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	Sx = float8_pl(Sx, newval);
+	if (transvalues[0] > 0.0)
+	{
+		tmp = newval * N - Sx;
+		Sxx += tmp * tmp / (N * transvalues[0]);
+		check_float8_val(Sxx, isinf(transvalues[2]) || isinf(newval), true);
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2512,9 +2620,9 @@ float4_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2523,9 +2631,9 @@ float4_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2541,18 +2649,18 @@ float8_avg(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Sx;
 
 	transvalues = check_float8_array(transarray, "float8_avg", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	/* ignore sumX2 */
+	Sx = transvalues[1];
+	/* ignore Sxx */
 
 	/* SQL defines AVG of no values to be NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Sx / N);
 }
 
 Datum
@@ -2561,27 +2669,22 @@ float8_var_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Population variance is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxx / N);
 }
 
 Datum
@@ -2590,27 +2693,22 @@ float8_var_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Sample variance is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxx / (N - 1.0));
 }
 
 Datum
@@ -2619,27 +2717,22 @@ float8_stddev_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Population stddev is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * N)));
+	PG_RETURN_FLOAT8(sqrt(Sxx / N));
 }
 
 Datum
@@ -2648,27 +2741,22 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Sample stddev is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * (N - 1.0))));
+	PG_RETURN_FLOAT8(sqrt(Sxx / (N - 1.0)));
 }
 
 /*
@@ -2676,9 +2764,14 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
  *		SQL2003 BINARY AGGREGATES
  *		=========================
  *
+ * As with the preceding aggregates, we use the Youngs-Cramer algorithm to
+ * reduce rounding errors in the aggregate final functions.
+ *
  * The transition datatype for all these aggregates is a 6-element array of
- * float8, holding the values N, sum(X), sum(X*X), sum(Y), sum(Y*Y), sum(X*Y)
- * in that order.  Note that Y is the first argument to the aggregates!
+ * float8, holding the values N, Sx=sum(X), Sxx=sum((X-Sx/N)^2), Sy=sum(Y),
+ * Syy=sum((Y-Sy/N)^2), Sxy=sum((X-Sx/N)*(Y-Sy/N)) in that order.
+ *
+ * Note that Y is the first argument to all these aggregates!
  *
  * It might seem attractive to optimize this by having multiple accumulator
  * functions that only calculate the sums actually needed.  But on most
@@ -2695,32 +2788,43 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	float8		newvalX = PG_GETARG_FLOAT8(2);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY;
+				Sx,
+				Sxx,
+				Sy,
+				Syy,
+				Sxy,
+				factor,
+				tmp1,
+				tmp2;
 
 	transvalues = check_float8_array(transarray, "float8_regr_accum", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+	Sy = transvalues[3];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new values into the
+	 * transition values.
+	 */
 	N += 1.0;
-	sumX += newvalX;
-	check_float8_val(sumX, isinf(transvalues[1]) || isinf(newvalX), true);
-	sumX2 += newvalX * newvalX;
-	check_float8_val(sumX2, isinf(transvalues[2]) || isinf(newvalX), true);
-	sumY += newvalY;
-	check_float8_val(sumY, isinf(transvalues[3]) || isinf(newvalY), true);
-	sumY2 += newvalY * newvalY;
-	check_float8_val(sumY2, isinf(transvalues[4]) || isinf(newvalY), true);
-	sumXY += newvalX * newvalY;
-	check_float8_val(sumXY, isinf(transvalues[5]) || isinf(newvalX) ||
-					 isinf(newvalY), true);
+	Sx = float8_pl(Sx, newvalX);
+	Sy = float8_pl(Sy, newvalY);
+	if (transvalues[0] > 0.0)
+	{
+		factor = 1.0 / (N * transvalues[0]);
+		tmp1 = newvalX * N - Sx;
+		Sxx += tmp1 * tmp1 * factor;
+		check_float8_val(Sxx, isinf(transvalues[2]) || isinf(newvalX), true);
+		tmp2 = newvalY * N - Sy;
+		Syy += tmp2 * tmp2 * factor;
+		check_float8_val(Syy, isinf(transvalues[4]) || isinf(newvalY), true);
+		Sxy += tmp1 * tmp2 * factor;
+		check_float8_val(Sxy, isinf(transvalues[5]) || isinf(newvalX) ||
+						 isinf(newvalY), true);
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2730,11 +2834,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues[0] = N;
-		transvalues[1] = sumX;
-		transvalues[2] = sumX2;
-		transvalues[3] = sumY;
-		transvalues[4] = sumY2;
-		transvalues[5] = sumXY;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
+		transvalues[3] = Sy;
+		transvalues[4] = Syy;
+		transvalues[5] = Sxy;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2744,11 +2848,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 		ArrayType  *result;
 
 		transdatums[0] = Float8GetDatumFast(N);
-		transdatums[1] = Float8GetDatumFast(sumX);
-		transdatums[2] = Float8GetDatumFast(sumX2);
-		transdatums[3] = Float8GetDatumFast(sumY);
-		transdatums[4] = Float8GetDatumFast(sumY2);
-		transdatums[5] = Float8GetDatumFast(sumXY);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+		transdatums[3] = Float8GetDatumFast(Sy);
+		transdatums[4] = Float8GetDatumFast(Syy);
+		transdatums[5] = Float8GetDatumFast(Sxy);
 
 		result = construct_array(transdatums, 6,
 								 FLOAT8OID,
@@ -2773,6 +2877,26 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
+	float8		N1,
+				Sx1,
+				Sxx1,
+				Sy1,
+				Syy1,
+				Sxy1,
+				N2,
+				Sx2,
+				Sxx2,
+				Sy2,
+				Syy2,
+				Sxy2,
+				tmp1,
+				tmp2,
+				N,
+				Sx,
+				Sxx,
+				Sy,
+				Syy,
+				Sxy;
 
 	if (!AggCheckCallContext(fcinfo, NULL))
 		elog(ERROR, "aggregate function called in non-aggregate context");
@@ -2780,12 +2904,75 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 	transvalues1 = check_float8_array(transarray1, "float8_regr_combine", 6);
 	transvalues2 = check_float8_array(transarray2, "float8_regr_combine", 6);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
-	transvalues1[3] = float8_pl(transvalues1[3], transvalues2[3]);
-	transvalues1[4] = float8_pl(transvalues1[4], transvalues2[4]);
-	transvalues1[5] = float8_pl(transvalues1[5], transvalues2[5]);
+	N1 = transvalues1[0];
+	Sx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
+	Sy1 = transvalues1[3];
+	Syy1 = transvalues1[4];
+	Sxy1 = transvalues1[5];
+
+	N2 = transvalues2[0];
+	Sx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+	Sy2 = transvalues2[3];
+	Syy2 = transvalues2[4];
+	Sxy2 = transvalues2[5];
+
+	/*--------------------
+	 * The transition values combine using a generalization of the
+	 * Youngs-Cramer algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Sx = Sx1 + Sx2
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Sx1/N1 - Sx2/N2)^2 / N
+	 *	Sy = Sy1 + Sy2
+	 *	Syy = Syy1 + Syy2 + N1 * N2 * (Sy1/N1 - Sy2/N2)^2 / N
+	 *	Sxy = Sxy1 + Sxy2 + N1 * N2 * (Sx1/N1 - Sx2/N2) * (Sy1/N1 - Sy2/N2) / N
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Sx = Sx2;
+		Sxx = Sxx2;
+		Sy = Sy2;
+		Syy = Syy2;
+		Sxy = Sxy2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Sx = Sx1;
+		Sxx = Sxx1;
+		Sy = Syy1;
+		Syy = Syy1;
+		Sxy = Sxy1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Sx = float8_pl(Sx1, Sx2);
+		tmp1 = Sx1 / N1 - Sx2 / N2;
+		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		Sy = float8_pl(Sy1, Sy2);
+		tmp2 = Sy1 / N1 - Sy2 / N2;
+		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
+		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
+		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+	}
+
+	transvalues1[0] = N;
+	transvalues1[1] = Sx;
+	transvalues1[2] = Sxx;
+	transvalues1[3] = Sy;
+	transvalues1[4] = Syy;
+	transvalues1[5] = Sxy;
 
 	PG_RETURN_ARRAYTYPE_P(transarray1);
 }
@@ -2797,27 +2984,21 @@ float8_regr_sxx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	Sxx = transvalues[2];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative result */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxx);
 }
 
 Datum
@@ -2826,27 +3007,21 @@ float8_regr_syy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY,
-				sumY2,
-				numerator;
+				Syy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_syy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
+	Syy = transvalues[4];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumY2 - sumY * sumY;
-	check_float8_val(numerator, isinf(sumY2) || isinf(sumY), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative result */
+	if (Syy <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Syy);
 }
 
 Datum
@@ -2855,28 +3030,19 @@ float8_regr_sxy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxy", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
 	/* A negative result is valid here */
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxy);
 }
 
 Datum
@@ -2885,17 +3051,17 @@ float8_regr_avgx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Sx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
+	Sx = transvalues[1];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Sx / N);
 }
 
 Datum
@@ -2904,17 +3070,17 @@ float8_regr_avgy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY;
+				Sy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
+	Sy = transvalues[3];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumY / N);
+	PG_RETURN_FLOAT8(Sy / N);
 }
 
 Datum
@@ -2923,26 +3089,17 @@ float8_covar_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_pop", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxy / N);
 }
 
 Datum
@@ -2951,26 +3108,17 @@ float8_covar_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_samp", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is <= 1 we should return NULL */
 	if (N < 2.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxy / (N - 1.0));
 }
 
 Datum
@@ -2979,38 +3127,24 @@ float8_corr(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_corr", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0 || numeratorY <= 0)
+	if (Sxx <= 0 || Syy <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / sqrt(numeratorX * numeratorY));
+	PG_RETURN_FLOAT8(Sxy / sqrt(Sxx * Syy));
 }
 
 Datum
@@ -3019,42 +3153,27 @@ float8_regr_r2(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_r2", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 	/* per spec, horizontal line produces 1.0 */
-	if (numeratorY <= 0)
+	if (Syy <= 0)
 		PG_RETURN_FLOAT8(1.0);
 
-	PG_RETURN_FLOAT8((numeratorXY * numeratorXY) /
-					 (numeratorX * numeratorY));
+	PG_RETURN_FLOAT8((Sxy * Sxy) / (Sxx * Syy));
 }
 
 Datum
@@ -3063,33 +3182,22 @@ float8_regr_slope(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXY;
+				Sxx,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_slope", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / numeratorX);
+	PG_RETURN_FLOAT8(Sxy / Sxx);
 }
 
 Datum
@@ -3098,33 +3206,26 @@ float8_regr_intercept(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXXY;
+				Sx,
+				Sxx,
+				Sy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_intercept", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+	Sy = transvalues[3];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXXY = sumY * sumX2 - sumX * sumXY;
-	check_float8_val(numeratorXXY, isinf(sumY) || isinf(sumX2) ||
-					 isinf(sumX) || isinf(sumXY), true);
-	if (numeratorX <= 0)
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXXY / numeratorX);
+	PG_RETURN_FLOAT8((Sy - Sx * Sxy / Sxx) / N);
 }
 
 
#5Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Dean Rasheed (#4)
1 attachment(s)
Re: BUG #15307: Low numerical precision of (Co-) Variance

On 9 August 2018 at 12:02, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

... the YC algorithm is probably preferable. For the record,
attached are both versions that I tried.

Here is an updated, more complete patch, based on the YC algorithm,
with updated regression tests for the September commitfest.

All the existing tests pass unchanged, although I'm somewhat surprised
that the current tests pass with no platform variations. I've added
new tests to cover infinity/NaN handling, parallel aggregation and
confirm the improved accuracy with large offsets. The latter tests
operate well within in the limits of double precision arithmetic, so I
wouldn't expect any platform variation, but that's difficult
guarantee. If there are problems, it may be necessary to round the
test results.

Notable changes from the previous patch:

I have rewritten the overflow checks in the accum functions to be
clearer and more efficient which, if anything, makes these aggregates
now slightly faster than HEAD. More importantly though, I've added
explicit code to force Sxx to be NaN if any input is infinite, which
the previous coding didn't guarantee. I think NaN is the right result
for quantities like variance, if any input value is infinite, since it
logically involves 'infinity minus infinity'. That's also consistent
with the current behaviour.

I have also made the aggregate combine functions SQL-callable to make
testing easier -- there was a bug in the previous version due to a
typo which meant that float8_regr_combine() was incorrect when N1 was
non-zero and N2 was zero. That situation is unlikely to happen in
practice, and difficult to provoke deliberately without manually
calling the combine function, which is why I didn't spot it before.
The new tests cover all code branches, and make it easier to see that
the combine functions are producing the correct results.

Regards,
Dean

Attachments:

implement-float-aggs-using-youngs-cramer-v2.patchapplication/octet-stream; name=implement-float-aggs-using-youngs-cramer-v2.patchDownload
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
new file mode 100644
index df35557..bd25ab7
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -2401,13 +2401,29 @@ setseed(PG_FUNCTION_ARGS)
  *		float8_stddev_samp	- produce final result for float STDDEV_SAMP()
  *		float8_stddev_pop	- produce final result for float STDDEV_POP()
  *
+ * The naive schoolbook implementation of these aggregates works by
+ * accumulating sum(X) and sum(X^2).  However, this approach suffers from
+ * large rounding errors in the final computation of quantities like the
+ * population variance (N*sum(X^2) - sum(X)^2) / N^2, since each of the
+ * intermediate terms is potentially very large, while the difference is often
+ * quite small.
+ *
+ * Instead we use the Youngs-Cramer algorithm [1] which works by accumulating
+ * Sx=sum(X) and Sxx=sum((X-Sx/N)^2), using a numerically stable algorithm to
+ * incrementally update those quantities.  The final computations of each of
+ * the aggregate values is then trivial and gives more accurate results (for
+ * example, the population variance is just Sxx/N).
+ *
  * The transition datatype for all these aggregates is a 3-element array
- * of float8, holding the values N, sum(X), sum(X*X) in that order.
+ * of float8, holding the values N, Sx, Sxx in that order.
  *
  * Note that we represent N as a float to avoid having to build a special
  * datatype.  Given a reasonable floating-point implementation, there should
  * be no accuracy loss unless N exceeds 2 ^ 52 or so (by which time the
  * user will have doubtless lost interest anyway...)
+ *
+ * [1] Some Results Relevant to Choice of Sum and Sum-of-Product Algorithms,
+ * E. A. Youngs and E. M. Cramer, Technometrics Vol 13, No 3, August 1971.
  */
 
 static float8 *
@@ -2441,18 +2457,90 @@ float8_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
-
-	if (!AggCheckCallContext(fcinfo, NULL))
-		elog(ERROR, "aggregate function called in non-aggregate context");
+	float8		N1,
+				Sx1,
+				Sxx1,
+				N2,
+				Sx2,
+				Sxx2,
+				tmp,
+				N,
+				Sx,
+				Sxx;
 
 	transvalues1 = check_float8_array(transarray1, "float8_combine", 3);
 	transvalues2 = check_float8_array(transarray2, "float8_combine", 3);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
+	N1 = transvalues1[0];
+	Sx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
 
-	PG_RETURN_ARRAYTYPE_P(transarray1);
+	N2 = transvalues2[0];
+	Sx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+
+	/*--------------------
+	 * The transition values combine using a generalization of the
+	 * Youngs-Cramer algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Sx = Sx1 + Sx2
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Sx1/N1 - Sx2/N2)^2 / N;
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Sx = Sx2;
+		Sxx = Sxx2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Sx = Sx1;
+		Sxx = Sxx1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Sx = float8_pl(Sx1, Sx2);
+		tmp = Sx1 / N1 - Sx2 / N2;
+		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+	}
+
+	/*
+	 * If we're invoked as an aggregate, we can cheat and modify our first
+	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
+	 * new array with the updated transition data and return it.
+	 */
+	if (AggCheckCallContext(fcinfo, NULL))
+	{
+		transvalues1[0] = N;
+		transvalues1[1] = Sx;
+		transvalues1[2] = Sxx;
+
+		PG_RETURN_ARRAYTYPE_P(transarray1);
+	}
+	else
+	{
+		Datum		transdatums[3];
+		ArrayType  *result;
+
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+
+		result = construct_array(transdatums, 3,
+								 FLOAT8OID,
+								 sizeof(float8), FLOAT8PASSBYVAL, 'd');
+
+		PG_RETURN_ARRAYTYPE_P(result);
+	}
 }
 
 Datum
@@ -2461,8 +2549,43 @@ float8_accum(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8		newval = PG_GETARG_FLOAT8(1);
 	float8	   *transvalues;
+	float8		N,
+				Sx,
+				Sxx,
+				tmp;
 
 	transvalues = check_float8_array(transarray, "float8_accum", 3);
+	N = transvalues[0];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	Sx += newval;
+	if (transvalues[0] > 0.0)
+	{
+		tmp = newval * N - Sx;
+		Sxx += tmp * tmp / (N * transvalues[0]);
+
+		/*
+		 * Overflow check.  We only report an overflow error when finite
+		 * inputs lead to infinite results.  Note also that Sxx should be NaN
+		 * if any of the inputs are infinite, so we intentionally prevent Sxx
+		 * from becoming infinite.
+		 */
+		if (isinf(Sx) || isinf(Sxx))
+		{
+			if (!isinf(transvalues[1]) && !isinf(newval))
+				ereport(ERROR,
+						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+						 errmsg("value out of range: overflow")));
+
+			Sxx = get_float8_nan();
+		}
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2471,9 +2594,9 @@ float8_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2482,9 +2605,9 @@ float8_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2502,8 +2625,43 @@ float4_accum(PG_FUNCTION_ARGS)
 	/* do computations as float8 */
 	float8		newval = PG_GETARG_FLOAT4(1);
 	float8	   *transvalues;
+	float8		N,
+				Sx,
+				Sxx,
+				tmp;
 
 	transvalues = check_float8_array(transarray, "float4_accum", 3);
+	N = transvalues[0];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	Sx += newval;
+	if (transvalues[0] > 0.0)
+	{
+		tmp = newval * N - Sx;
+		Sxx += tmp * tmp / (N * transvalues[0]);
+
+		/*
+		 * Overflow check.  We only report an overflow error when finite
+		 * inputs lead to infinite results.  Note also that Sxx should be NaN
+		 * if any of the inputs are infinite, so we intentionally prevent Sxx
+		 * from becoming infinite.
+		 */
+		if (isinf(Sx) || isinf(Sxx))
+		{
+			if (!isinf(transvalues[1]) && !isinf(newval))
+				ereport(ERROR,
+						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+						 errmsg("value out of range: overflow")));
+
+			Sxx = get_float8_nan();
+		}
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2512,9 +2670,9 @@ float4_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2523,9 +2681,9 @@ float4_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2541,18 +2699,18 @@ float8_avg(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Sx;
 
 	transvalues = check_float8_array(transarray, "float8_avg", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	/* ignore sumX2 */
+	Sx = transvalues[1];
+	/* ignore Sxx */
 
 	/* SQL defines AVG of no values to be NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Sx / N);
 }
 
 Datum
@@ -2561,27 +2719,22 @@ float8_var_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Population variance is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxx / N);
 }
 
 Datum
@@ -2590,27 +2743,22 @@ float8_var_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Sample variance is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxx / (N - 1.0));
 }
 
 Datum
@@ -2619,27 +2767,22 @@ float8_stddev_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Population stddev is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * N)));
+	PG_RETURN_FLOAT8(sqrt(Sxx / N));
 }
 
 Datum
@@ -2648,27 +2791,22 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Sample stddev is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative variance */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * (N - 1.0))));
+	PG_RETURN_FLOAT8(sqrt(Sxx / (N - 1.0)));
 }
 
 /*
@@ -2676,9 +2814,14 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
  *		SQL2003 BINARY AGGREGATES
  *		=========================
  *
+ * As with the preceding aggregates, we use the Youngs-Cramer algorithm to
+ * reduce rounding errors in the aggregate final functions.
+ *
  * The transition datatype for all these aggregates is a 6-element array of
- * float8, holding the values N, sum(X), sum(X*X), sum(Y), sum(Y*Y), sum(X*Y)
- * in that order.  Note that Y is the first argument to the aggregates!
+ * float8, holding the values N, Sx=sum(X), Sxx=sum((X-Sx/N)^2), Sy=sum(Y),
+ * Syy=sum((Y-Sy/N)^2), Sxy=sum((X-Sx/N)*(Y-Sy/N)) in that order.
+ *
+ * Note that Y is the first argument to all these aggregates!
  *
  * It might seem attractive to optimize this by having multiple accumulator
  * functions that only calculate the sums actually needed.  But on most
@@ -2695,32 +2838,66 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	float8		newvalX = PG_GETARG_FLOAT8(2);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY;
+				Sx,
+				Sxx,
+				Sy,
+				Syy,
+				Sxy,
+				tmpX,
+				tmpY,
+				scale;
 
 	transvalues = check_float8_array(transarray, "float8_regr_accum", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+	Sy = transvalues[3];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new values into the
+	 * transition values.
+	 */
 	N += 1.0;
-	sumX += newvalX;
-	check_float8_val(sumX, isinf(transvalues[1]) || isinf(newvalX), true);
-	sumX2 += newvalX * newvalX;
-	check_float8_val(sumX2, isinf(transvalues[2]) || isinf(newvalX), true);
-	sumY += newvalY;
-	check_float8_val(sumY, isinf(transvalues[3]) || isinf(newvalY), true);
-	sumY2 += newvalY * newvalY;
-	check_float8_val(sumY2, isinf(transvalues[4]) || isinf(newvalY), true);
-	sumXY += newvalX * newvalY;
-	check_float8_val(sumXY, isinf(transvalues[5]) || isinf(newvalX) ||
-					 isinf(newvalY), true);
+	Sx += newvalX;
+	Sy += newvalY;
+	if (transvalues[0] > 0.0)
+	{
+		tmpX = newvalX * N - Sx;
+		tmpY = newvalY * N - Sy;
+		scale = 1.0 / (N * transvalues[0]);
+		Sxx += tmpX * tmpX * scale;
+		Syy += tmpY * tmpY * scale;
+		Sxy += tmpX * tmpY * scale;
+
+		/*
+		 * Overflow check.  We only report an overflow error when finite
+		 * inputs lead to infinite results.  Note also that Sxx, Syy and Sxy
+		 * should be NaN if any of the relevant inputs are infinite, so we
+		 * intentionally prevent them from becoming infinite.
+		 */
+		if (isinf(Sx) || isinf(Sxx) || isinf(Sy) || isinf(Syy) || isinf(Sxy))
+		{
+			if (((isinf(Sx) || isinf(Sxx)) &&
+				 !isinf(transvalues[1]) && !isinf(newvalX)) ||
+				((isinf(Sy) || isinf(Syy)) &&
+				 !isinf(transvalues[3]) && !isinf(newvalY)) ||
+				(isinf(Sxy) &&
+				 !isinf(transvalues[1]) && !isinf(newvalX) &&
+				 !isinf(transvalues[3]) && !isinf(newvalY)))
+				ereport(ERROR,
+						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+						 errmsg("value out of range: overflow")));
+
+			if (isinf(Sxx))
+				Sxx = get_float8_nan();
+			if (isinf(Syy))
+				Syy = get_float8_nan();
+			if (isinf(Sxy))
+				Sxy = get_float8_nan();
+		}
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2730,11 +2907,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues[0] = N;
-		transvalues[1] = sumX;
-		transvalues[2] = sumX2;
-		transvalues[3] = sumY;
-		transvalues[4] = sumY2;
-		transvalues[5] = sumXY;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
+		transvalues[3] = Sy;
+		transvalues[4] = Syy;
+		transvalues[5] = Sxy;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2744,11 +2921,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 		ArrayType  *result;
 
 		transdatums[0] = Float8GetDatumFast(N);
-		transdatums[1] = Float8GetDatumFast(sumX);
-		transdatums[2] = Float8GetDatumFast(sumX2);
-		transdatums[3] = Float8GetDatumFast(sumY);
-		transdatums[4] = Float8GetDatumFast(sumY2);
-		transdatums[5] = Float8GetDatumFast(sumXY);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+		transdatums[3] = Float8GetDatumFast(Sy);
+		transdatums[4] = Float8GetDatumFast(Syy);
+		transdatums[5] = Float8GetDatumFast(Sxy);
 
 		result = construct_array(transdatums, 6,
 								 FLOAT8OID,
@@ -2773,21 +2950,127 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
-
-	if (!AggCheckCallContext(fcinfo, NULL))
-		elog(ERROR, "aggregate function called in non-aggregate context");
+	float8		N1,
+				Sx1,
+				Sxx1,
+				Sy1,
+				Syy1,
+				Sxy1,
+				N2,
+				Sx2,
+				Sxx2,
+				Sy2,
+				Syy2,
+				Sxy2,
+				tmp1,
+				tmp2,
+				N,
+				Sx,
+				Sxx,
+				Sy,
+				Syy,
+				Sxy;
 
 	transvalues1 = check_float8_array(transarray1, "float8_regr_combine", 6);
 	transvalues2 = check_float8_array(transarray2, "float8_regr_combine", 6);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
-	transvalues1[3] = float8_pl(transvalues1[3], transvalues2[3]);
-	transvalues1[4] = float8_pl(transvalues1[4], transvalues2[4]);
-	transvalues1[5] = float8_pl(transvalues1[5], transvalues2[5]);
+	N1 = transvalues1[0];
+	Sx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
+	Sy1 = transvalues1[3];
+	Syy1 = transvalues1[4];
+	Sxy1 = transvalues1[5];
 
-	PG_RETURN_ARRAYTYPE_P(transarray1);
+	N2 = transvalues2[0];
+	Sx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+	Sy2 = transvalues2[3];
+	Syy2 = transvalues2[4];
+	Sxy2 = transvalues2[5];
+
+	/*--------------------
+	 * The transition values combine using a generalization of the
+	 * Youngs-Cramer algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Sx = Sx1 + Sx2
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Sx1/N1 - Sx2/N2)^2 / N
+	 *	Sy = Sy1 + Sy2
+	 *	Syy = Syy1 + Syy2 + N1 * N2 * (Sy1/N1 - Sy2/N2)^2 / N
+	 *	Sxy = Sxy1 + Sxy2 + N1 * N2 * (Sx1/N1 - Sx2/N2) * (Sy1/N1 - Sy2/N2) / N
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Sx = Sx2;
+		Sxx = Sxx2;
+		Sy = Sy2;
+		Syy = Syy2;
+		Sxy = Sxy2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Sx = Sx1;
+		Sxx = Sxx1;
+		Sy = Sy1;
+		Syy = Syy1;
+		Sxy = Sxy1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Sx = float8_pl(Sx1, Sx2);
+		tmp1 = Sx1 / N1 - Sx2 / N2;
+		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		Sy = float8_pl(Sy1, Sy2);
+		tmp2 = Sy1 / N1 - Sy2 / N2;
+		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
+		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
+		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+	}
+
+	/*
+	 * If we're invoked as an aggregate, we can cheat and modify our first
+	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
+	 * new array with the updated transition data and return it.
+	 */
+	if (AggCheckCallContext(fcinfo, NULL))
+	{
+		transvalues1[0] = N;
+		transvalues1[1] = Sx;
+		transvalues1[2] = Sxx;
+		transvalues1[3] = Sy;
+		transvalues1[4] = Syy;
+		transvalues1[5] = Sxy;
+
+		PG_RETURN_ARRAYTYPE_P(transarray1);
+	}
+	else
+	{
+		Datum		transdatums[6];
+		ArrayType  *result;
+
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+		transdatums[3] = Float8GetDatumFast(Sy);
+		transdatums[4] = Float8GetDatumFast(Syy);
+		transdatums[5] = Float8GetDatumFast(Sxy);
+
+		result = construct_array(transdatums, 6,
+								 FLOAT8OID,
+								 sizeof(float8), FLOAT8PASSBYVAL, 'd');
+
+		PG_RETURN_ARRAYTYPE_P(result);
+	}
 }
 
 
@@ -2797,27 +3080,21 @@ float8_regr_sxx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	Sxx = transvalues[2];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative result */
+	if (Sxx <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxx);
 }
 
 Datum
@@ -2826,27 +3103,21 @@ float8_regr_syy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY,
-				sumY2,
-				numerator;
+				Syy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_syy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
+	Syy = transvalues[4];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumY2 - sumY * sumY;
-	check_float8_val(numerator, isinf(sumY2) || isinf(sumY), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
+	/* Watch out for roundoff error producing a negative result */
+	if (Syy <= 0.0)
 		PG_RETURN_FLOAT8(0.0);
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Syy);
 }
 
 Datum
@@ -2855,28 +3126,19 @@ float8_regr_sxy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxy", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
 	/* A negative result is valid here */
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxy);
 }
 
 Datum
@@ -2885,17 +3147,17 @@ float8_regr_avgx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Sx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
+	Sx = transvalues[1];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Sx / N);
 }
 
 Datum
@@ -2904,17 +3166,17 @@ float8_regr_avgy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY;
+				Sy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
+	Sy = transvalues[3];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumY / N);
+	PG_RETURN_FLOAT8(Sy / N);
 }
 
 Datum
@@ -2923,26 +3185,17 @@ float8_covar_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_pop", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxy / N);
 }
 
 Datum
@@ -2951,26 +3204,17 @@ float8_covar_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_samp", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is <= 1 we should return NULL */
 	if (N < 2.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxy / (N - 1.0));
 }
 
 Datum
@@ -2979,38 +3223,25 @@ float8_corr(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_corr", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0 || numeratorY <= 0)
+	/* per spec, return NULL for horizontal and vertical lines */
+	if (Sxx <= 0 || Syy <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / sqrt(numeratorX * numeratorY));
+	PG_RETURN_FLOAT8(Sxy / sqrt(Sxx * Syy));
 }
 
 Datum
@@ -3019,42 +3250,29 @@ float8_regr_r2(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_r2", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	/* per spec, return NULL for a vertical line */
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
-	/* per spec, horizontal line produces 1.0 */
-	if (numeratorY <= 0)
+
+	/* per spec, return 1.0 for a horizontal line */
+	if (Syy <= 0)
 		PG_RETURN_FLOAT8(1.0);
 
-	PG_RETURN_FLOAT8((numeratorXY * numeratorXY) /
-					 (numeratorX * numeratorY));
+	PG_RETURN_FLOAT8((Sxy * Sxy) / (Sxx * Syy));
 }
 
 Datum
@@ -3063,33 +3281,23 @@ float8_regr_slope(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXY;
+				Sxx,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_slope", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	/* per spec, return NULL for a vertical line */
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / numeratorX);
+	PG_RETURN_FLOAT8(Sxy / Sxx);
 }
 
 Datum
@@ -3098,33 +3306,27 @@ float8_regr_intercept(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXXY;
+				Sx,
+				Sxx,
+				Sy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_intercept", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+	Sy = transvalues[3];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXXY = sumY * sumX2 - sumX * sumXY;
-	check_float8_val(numeratorXXY, isinf(sumY) || isinf(sumX2) ||
-					 isinf(sumX) || isinf(sumXY), true);
-	if (numeratorX <= 0)
+	/* per spec, return NULL for a vertical line */
+	if (Sxx <= 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXXY / numeratorX);
+	PG_RETURN_FLOAT8((Sy - Sx * Sxy / Sxx) / N);
 }
 
 
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
new file mode 100644
index 5e21622..717e965
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -198,6 +198,50 @@ select avg('NaN'::numeric) from generate
  NaN
 (1 row)
 
+-- verify correct results for infinite inputs
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('1'), ('infinity')) v(x);
+   avg    | var_pop 
+----------+---------
+ Infinity |     NaN
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('1')) v(x);
+   avg    | var_pop 
+----------+---------
+ Infinity |     NaN
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('infinity')) v(x);
+   avg    | var_pop 
+----------+---------
+ Infinity |     NaN
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('-infinity'), ('infinity')) v(x);
+ avg | var_pop 
+-----+---------
+ NaN |     NaN
+(1 row)
+
+-- test accuracy with a large input offset
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x);
+    avg    | var_pop 
+-----------+---------
+ 100000005 |     2.5
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (7000000000005), (7000000000007)) v(x);
+      avg      | var_pop 
+---------------+---------
+ 7000000000006 |       1
+(1 row)
+
 -- SQL2003 binary aggregates
 SELECT regr_count(b, a) FROM aggtest;
  regr_count 
@@ -253,6 +297,90 @@ SELECT corr(b, a) FROM aggtest;
  0.139634516517873
 (1 row)
 
+-- test accum and combine functions directly
+CREATE TABLE regr_test (x float8, y float8);
+INSERT INTO regr_test VALUES (10,150),(20,250),(30,350),(80,540),(100,200);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30,80);
+ count | sum | regr_sxx | sum  | regr_syy | regr_sxy 
+-------+-----+----------+------+----------+----------
+     4 | 140 |     2900 | 1290 |    83075 |    15050
+(1 row)
+
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test;
+ count | sum | regr_sxx | sum  | regr_syy | regr_sxy 
+-------+-----+----------+------+----------+----------
+     5 | 240 |     6280 | 1490 |    95080 |     8680
+(1 row)
+
+SELECT float8_accum('{4,140,2900}'::float8[], 100);
+ float8_accum 
+--------------
+ {5,240,6280}
+(1 row)
+
+SELECT float8_regr_accum('{4,140,2900,1290,83075,15050}'::float8[], 200, 100);
+      float8_regr_accum       
+------------------------------
+ {5,240,6280,1490,95080,8680}
+(1 row)
+
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30);
+ count | sum | regr_sxx | sum | regr_syy | regr_sxy 
+-------+-----+----------+-----+----------+----------
+     3 |  60 |      200 | 750 |    20000 |     2000
+(1 row)
+
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (80,100);
+ count | sum | regr_sxx | sum | regr_syy | regr_sxy 
+-------+-----+----------+-----+----------+----------
+     2 | 180 |      200 | 740 |    57800 |    -3400
+(1 row)
+
+SELECT float8_combine('{3,60,200}'::float8[], '{0,0,0}'::float8[]);
+ float8_combine 
+----------------
+ {3,60,200}
+(1 row)
+
+SELECT float8_combine('{0,0,0}'::float8[], '{2,180,200}'::float8[]);
+ float8_combine 
+----------------
+ {2,180,200}
+(1 row)
+
+SELECT float8_combine('{3,60,200}'::float8[], '{2,180,200}'::float8[]);
+ float8_combine 
+----------------
+ {5,240,6280}
+(1 row)
+
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{0,0,0,0,0,0}'::float8[]);
+    float8_regr_combine    
+---------------------------
+ {3,60,200,750,20000,2000}
+(1 row)
+
+SELECT float8_regr_combine('{0,0,0,0,0,0}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+     float8_regr_combine     
+-----------------------------
+ {2,180,200,740,57800,-3400}
+(1 row)
+
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+     float8_regr_combine      
+------------------------------
+ {5,240,6280,1490,95080,8680}
+(1 row)
+
+DROP TABLE regr_test;
+-- test count, distinct
 SELECT count(four) AS cnt_1000 FROM onek;
  cnt_1000 
 ----------
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
new file mode 100644
index 7e77467..83a5492
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -51,6 +51,22 @@ select avg(null::float8) from generate_s
 select sum('NaN'::numeric) from generate_series(1,3);
 select avg('NaN'::numeric) from generate_series(1,3);
 
+-- verify correct results for infinite inputs
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('1'), ('infinity')) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('1')) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('infinity')) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('-infinity'), ('infinity')) v(x);
+
+-- test accuracy with a large input offset
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (7000000000005), (7000000000007)) v(x);
+
 -- SQL2003 binary aggregates
 SELECT regr_count(b, a) FROM aggtest;
 SELECT regr_sxx(b, a) FROM aggtest;
@@ -62,6 +78,31 @@ SELECT regr_slope(b, a), regr_intercept(
 SELECT covar_pop(b, a), covar_samp(b, a) FROM aggtest;
 SELECT corr(b, a) FROM aggtest;
 
+-- test accum and combine functions directly
+CREATE TABLE regr_test (x float8, y float8);
+INSERT INTO regr_test VALUES (10,150),(20,250),(30,350),(80,540),(100,200);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30,80);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test;
+SELECT float8_accum('{4,140,2900}'::float8[], 100);
+SELECT float8_regr_accum('{4,140,2900,1290,83075,15050}'::float8[], 200, 100);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (80,100);
+SELECT float8_combine('{3,60,200}'::float8[], '{0,0,0}'::float8[]);
+SELECT float8_combine('{0,0,0}'::float8[], '{2,180,200}'::float8[]);
+SELECT float8_combine('{3,60,200}'::float8[], '{2,180,200}'::float8[]);
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{0,0,0,0,0,0}'::float8[]);
+SELECT float8_regr_combine('{0,0,0,0,0,0}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+DROP TABLE regr_test;
+
+-- test count, distinct
 SELECT count(four) AS cnt_1000 FROM onek;
 SELECT count(DISTINCT four) AS cnt_4 FROM onek;
 
#6Madeleine Thompson
madeleineth@gmail.com
In reply to: Dean Rasheed (#5)
Re: BUG #15307: Low numerical precision of (Co-) Variance

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested

This is my first PostgreSQL review. Hopefully I'm getting it right. I independently ran into the bug in question and found that the author had already written a patch. I'm happy the author wrote this.

(1) I am not experienced with PostgreSQL internals, so I can't comment on the coding style or usage of internal functions.

(2) The patch applies cleanly to ce4887bd025b95c7b455fefd817a418844c6aad3. "make check", "make check-world", and "make install" pass.

(3) The patch addresses the numeric inaccuracy reported in the bug. (Yay!) I believe the tests are sufficient to confirm that it works as intended. I don't think the test coverage *before* this patch was sufficient, and I appreciate the improved test coverage of the combine functions. I verified the new tests manually.

(4) The comments cite Youngs & Cramer (1971). This is a dated citation. It justifies its algorithms based on pre-IEEE754 single-precision arithmetic. Most notably, it assumes that floating-point division is not exact. That said, the algorithm is still a good choice; it is justified now because compared to the standard Welford variance, it is less prone to causing pipeline stalls on a modern CPU. Maybe link to Schubert & Gentz 2018 (https://dl.acm.org/citation.cfm?id=3223036) instead. The new float8_combine combine code is (almost) S&G eqn. 22; the float8_accum code is S&G eqn. 28. float8_regr_accum and float8_regr_combine are S&G eqn. 22.

(5) I think the comment /* Watch out for roundoff error producing a negative variance */ and associated check are obsolete now.

The new status of this patch is: Waiting on Author

#7Madeleine Thompson
madeleineth@gmail.com
In reply to: Madeleine Thompson (#6)
Re: BUG #15307: Low numerical precision of (Co-) Variance

By the way, I did not see the discussion thread or notice that an author of the paper I mentioned and the reporter of the bug were the same person until after I wrote the review. Oops.

#8Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Madeleine Thompson (#6)
Re: BUG #15307: Low numerical precision of (Co-) Variance

On 27 September 2018 at 06:12, Madeleine Thompson <madeleineth@gmail.com> wrote:

This is my first PostgreSQL review. Hopefully I'm getting it right. I independently ran into the bug in question and found that the author had already written a patch. I'm happy the author wrote this.

Thanks for the review. And yes, this sort of review is exactly what we need.

(1) I am not experienced with PostgreSQL internals, so I can't comment on the coding style or usage of internal functions.

(2) The patch applies cleanly to ce4887bd025b95c7b455fefd817a418844c6aad3. "make check", "make check-world", and "make install" pass.

(3) The patch addresses the numeric inaccuracy reported in the bug. (Yay!) I believe the tests are sufficient to confirm that it works as intended. I don't think the test coverage *before* this patch was sufficient, and I appreciate the improved test coverage of the combine functions. I verified the new tests manually.

Excellent. Thanks for testing.

(4) The comments cite Youngs & Cramer (1971). This is a dated citation. It justifies its algorithms based on pre-IEEE754 single-precision arithmetic. Most notably, it assumes that floating-point division is not exact. That said, the algorithm is still a good choice; it is justified now because compared to the standard Welford variance, it is less prone to causing pipeline stalls on a modern CPU. Maybe link to Schubert & Gentz 2018 (https://dl.acm.org/citation.cfm?id=3223036) instead. The new float8_combine combine code is (almost) S&G eqn. 22; the float8_accum code is S&G eqn. 28. float8_regr_accum and float8_regr_combine are S&G eqn. 22.

Hmm, I think that Youngs & Cramer should be cited as the original
inventors of the algorithm (which pre-dates the first version of
IEEE754 by a few years), although I'm happy to also cite Schubert &
Gentz as a more modern justification for the choice of algorithm.

Regarding the combine functions, I think perhaps Chan, Golub & LeVeque
[1]: Updating Formulae and a Pairwise Algorithm for Computing Sample Variances, T. F. Chan, G. H. Golub & R. J. LeVeque, COMPSTAT 1982.
variance, not covariance, but that's a straightforward generalisation
of their work.

[1]: Updating Formulae and a Pairwise Algorithm for Computing Sample Variances, T. F. Chan, G. H. Golub & R. J. LeVeque, COMPSTAT 1982.
Variances, T. F. Chan, G. H. Golub & R. J. LeVeque, COMPSTAT 1982.

(5) I think the comment /* Watch out for roundoff error producing a negative variance */ and associated check are obsolete now.

Ah, good point. We clearly only ever add positive quantities to Sxx
and Syy, and likewise when they are combined, so there is no way that
they can ever become negative now. Another neat consequence of this
algorithm.

I'll post an updated patch in a while.

Regards,
Dean

#9Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Dean Rasheed (#8)
1 attachment(s)
Re: BUG #15307: Low numerical precision of (Co-) Variance

On Thu, 27 Sep 2018 at 14:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I'll post an updated patch in a while.

I finally got round to looking at this again. Here is an update, based
on the review comments.

The next question is whether or not to back-patch this. Although this
was reported as a bug, my inclination is to apply this to HEAD only,
based on the lack of prior complaints. That also matches our decision
for other similar patches, e.g., 7d9a4737c2.

Regards,
Dean

Attachments:

implement-float-aggs-using-youngs-cramer-v3.patchtext/x-patch; charset=US-ASCII; name=implement-float-aggs-using-youngs-cramer-v3.patchDownload
diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c
new file mode 100644
index df35557..d1e12c1
--- a/src/backend/utils/adt/float.c
+++ b/src/backend/utils/adt/float.c
@@ -2401,13 +2401,39 @@ setseed(PG_FUNCTION_ARGS)
  *		float8_stddev_samp	- produce final result for float STDDEV_SAMP()
  *		float8_stddev_pop	- produce final result for float STDDEV_POP()
  *
+ * The naive schoolbook implementation of these aggregates works by
+ * accumulating sum(X) and sum(X^2).  However, this approach suffers from
+ * large rounding errors in the final computation of quantities like the
+ * population variance (N*sum(X^2) - sum(X)^2) / N^2, since each of the
+ * intermediate terms is potentially very large, while the difference is often
+ * quite small.
+ *
+ * Instead we use the Youngs-Cramer algorithm [1] which works by accumulating
+ * Sx=sum(X) and Sxx=sum((X-Sx/N)^2), using a numerically stable algorithm to
+ * incrementally update those quantities.  The final computations of each of
+ * the aggregate values is then trivial and gives more accurate results (for
+ * example, the population variance is just Sxx/N).  This algorithm is also
+ * fairly easy to generalize to allow parallel execution without loss of
+ * precision (see, for example, [2]).  For more details, and a comparison of
+ * this with other algorithms, see [3].
+ *
  * The transition datatype for all these aggregates is a 3-element array
- * of float8, holding the values N, sum(X), sum(X*X) in that order.
+ * of float8, holding the values N, Sx, Sxx in that order.
  *
  * Note that we represent N as a float to avoid having to build a special
  * datatype.  Given a reasonable floating-point implementation, there should
  * be no accuracy loss unless N exceeds 2 ^ 52 or so (by which time the
  * user will have doubtless lost interest anyway...)
+ *
+ * [1] Some Results Relevant to Choice of Sum and Sum-of-Product Algorithms,
+ * E. A. Youngs and E. M. Cramer, Technometrics Vol 13, No 3, August 1971.
+ *
+ * [2] Updating Formulae and a Pairwise Algorithm for Computing Sample
+ * Variances, T. F. Chan, G. H. Golub & R. J. LeVeque, COMPSTAT 1982.
+ *
+ * [3] Numerically Stable Parallel Computation of (Co-)Variance, Erich
+ * Schubert and Michael Gertz, Proceedings of the 30th International
+ * Conference on Scientific and Statistical Database Management, 2018.
  */
 
 static float8 *
@@ -2441,18 +2467,90 @@ float8_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
-
-	if (!AggCheckCallContext(fcinfo, NULL))
-		elog(ERROR, "aggregate function called in non-aggregate context");
+	float8		N1,
+				Sx1,
+				Sxx1,
+				N2,
+				Sx2,
+				Sxx2,
+				tmp,
+				N,
+				Sx,
+				Sxx;
 
 	transvalues1 = check_float8_array(transarray1, "float8_combine", 3);
 	transvalues2 = check_float8_array(transarray2, "float8_combine", 3);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
+	N1 = transvalues1[0];
+	Sx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
 
-	PG_RETURN_ARRAYTYPE_P(transarray1);
+	N2 = transvalues2[0];
+	Sx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+
+	/*--------------------
+	 * The transition values combine using a generalization of the
+	 * Youngs-Cramer algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Sx = Sx1 + Sx2
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Sx1/N1 - Sx2/N2)^2 / N;
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Sx = Sx2;
+		Sxx = Sxx2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Sx = Sx1;
+		Sxx = Sxx1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Sx = float8_pl(Sx1, Sx2);
+		tmp = Sx1 / N1 - Sx2 / N2;
+		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp * tmp / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+	}
+
+	/*
+	 * If we're invoked as an aggregate, we can cheat and modify our first
+	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
+	 * new array with the updated transition data and return it.
+	 */
+	if (AggCheckCallContext(fcinfo, NULL))
+	{
+		transvalues1[0] = N;
+		transvalues1[1] = Sx;
+		transvalues1[2] = Sxx;
+
+		PG_RETURN_ARRAYTYPE_P(transarray1);
+	}
+	else
+	{
+		Datum		transdatums[3];
+		ArrayType  *result;
+
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+
+		result = construct_array(transdatums, 3,
+								 FLOAT8OID,
+								 sizeof(float8), FLOAT8PASSBYVAL, 'd');
+
+		PG_RETURN_ARRAYTYPE_P(result);
+	}
 }
 
 Datum
@@ -2461,8 +2559,43 @@ float8_accum(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8		newval = PG_GETARG_FLOAT8(1);
 	float8	   *transvalues;
+	float8		N,
+				Sx,
+				Sxx,
+				tmp;
 
 	transvalues = check_float8_array(transarray, "float8_accum", 3);
+	N = transvalues[0];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	Sx += newval;
+	if (transvalues[0] > 0.0)
+	{
+		tmp = newval * N - Sx;
+		Sxx += tmp * tmp / (N * transvalues[0]);
+
+		/*
+		 * Overflow check.  We only report an overflow error when finite
+		 * inputs lead to infinite results.  Note also that Sxx should be NaN
+		 * if any of the inputs are infinite, so we intentionally prevent Sxx
+		 * from becoming infinite.
+		 */
+		if (isinf(Sx) || isinf(Sxx))
+		{
+			if (!isinf(transvalues[1]) && !isinf(newval))
+				ereport(ERROR,
+						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+						 errmsg("value out of range: overflow")));
+
+			Sxx = get_float8_nan();
+		}
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2471,9 +2604,9 @@ float8_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2482,9 +2615,9 @@ float8_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2502,8 +2635,43 @@ float4_accum(PG_FUNCTION_ARGS)
 	/* do computations as float8 */
 	float8		newval = PG_GETARG_FLOAT4(1);
 	float8	   *transvalues;
+	float8		N,
+				Sx,
+				Sxx,
+				tmp;
 
 	transvalues = check_float8_array(transarray, "float4_accum", 3);
+	N = transvalues[0];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new value into the
+	 * transition values.
+	 */
+	N += 1.0;
+	Sx += newval;
+	if (transvalues[0] > 0.0)
+	{
+		tmp = newval * N - Sx;
+		Sxx += tmp * tmp / (N * transvalues[0]);
+
+		/*
+		 * Overflow check.  We only report an overflow error when finite
+		 * inputs lead to infinite results.  Note also that Sxx should be NaN
+		 * if any of the inputs are infinite, so we intentionally prevent Sxx
+		 * from becoming infinite.
+		 */
+		if (isinf(Sx) || isinf(Sxx))
+		{
+			if (!isinf(transvalues[1]) && !isinf(newval))
+				ereport(ERROR,
+						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+						 errmsg("value out of range: overflow")));
+
+			Sxx = get_float8_nan();
+		}
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2512,9 +2680,9 @@ float4_accum(PG_FUNCTION_ARGS)
 	 */
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transvalues[0] = N;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2523,9 +2691,9 @@ float4_accum(PG_FUNCTION_ARGS)
 		Datum		transdatums[3];
 		ArrayType  *result;
 
-		transvalues[0] = transvalues[0] + 1.0;
-		transvalues[1] = float8_pl(transvalues[1], newval);
-		transvalues[2] = float8_pl(transvalues[2], float8_mul(newval, newval));
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
 
 		result = construct_array(transdatums, 3,
 								 FLOAT8OID,
@@ -2541,18 +2709,18 @@ float8_avg(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Sx;
 
 	transvalues = check_float8_array(transarray, "float8_avg", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	/* ignore sumX2 */
+	Sx = transvalues[1];
+	/* ignore Sxx */
 
 	/* SQL defines AVG of no values to be NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Sx / N);
 }
 
 Datum
@@ -2561,27 +2729,20 @@ float8_var_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Population variance is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
-		PG_RETURN_FLOAT8(0.0);
+	/* Note that Sxx is guaranteed to be non-negative */
 
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxx / N);
 }
 
 Datum
@@ -2590,27 +2751,20 @@ float8_var_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_var_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Sample variance is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
-		PG_RETURN_FLOAT8(0.0);
+	/* Note that Sxx is guaranteed to be non-negative */
 
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxx / (N - 1.0));
 }
 
 Datum
@@ -2619,27 +2773,20 @@ float8_stddev_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_pop", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Population stddev is undefined when N is 0, so return NULL */
 	if (N == 0.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
-		PG_RETURN_FLOAT8(0.0);
+	/* Note that Sxx is guaranteed to be non-negative */
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * N)));
+	PG_RETURN_FLOAT8(sqrt(Sxx / N));
 }
 
 Datum
@@ -2648,27 +2795,20 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_stddev_samp", 3);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	/* ignore Sx */
+	Sxx = transvalues[2];
 
 	/* Sample stddev is undefined when N is 0 or 1, so return NULL */
 	if (N <= 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
-		PG_RETURN_FLOAT8(0.0);
+	/* Note that Sxx is guaranteed to be non-negative */
 
-	PG_RETURN_FLOAT8(sqrt(numerator / (N * (N - 1.0))));
+	PG_RETURN_FLOAT8(sqrt(Sxx / (N - 1.0)));
 }
 
 /*
@@ -2676,9 +2816,14 @@ float8_stddev_samp(PG_FUNCTION_ARGS)
  *		SQL2003 BINARY AGGREGATES
  *		=========================
  *
+ * As with the preceding aggregates, we use the Youngs-Cramer algorithm to
+ * reduce rounding errors in the aggregate final functions.
+ *
  * The transition datatype for all these aggregates is a 6-element array of
- * float8, holding the values N, sum(X), sum(X*X), sum(Y), sum(Y*Y), sum(X*Y)
- * in that order.  Note that Y is the first argument to the aggregates!
+ * float8, holding the values N, Sx=sum(X), Sxx=sum((X-Sx/N)^2), Sy=sum(Y),
+ * Syy=sum((Y-Sy/N)^2), Sxy=sum((X-Sx/N)*(Y-Sy/N)) in that order.
+ *
+ * Note that Y is the first argument to all these aggregates!
  *
  * It might seem attractive to optimize this by having multiple accumulator
  * functions that only calculate the sums actually needed.  But on most
@@ -2695,32 +2840,66 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	float8		newvalX = PG_GETARG_FLOAT8(2);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY;
+				Sx,
+				Sxx,
+				Sy,
+				Syy,
+				Sxy,
+				tmpX,
+				tmpY,
+				scale;
 
 	transvalues = check_float8_array(transarray, "float8_regr_accum", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+	Sy = transvalues[3];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
+	/*
+	 * Use the Youngs-Cramer algorithm to incorporate the new values into the
+	 * transition values.
+	 */
 	N += 1.0;
-	sumX += newvalX;
-	check_float8_val(sumX, isinf(transvalues[1]) || isinf(newvalX), true);
-	sumX2 += newvalX * newvalX;
-	check_float8_val(sumX2, isinf(transvalues[2]) || isinf(newvalX), true);
-	sumY += newvalY;
-	check_float8_val(sumY, isinf(transvalues[3]) || isinf(newvalY), true);
-	sumY2 += newvalY * newvalY;
-	check_float8_val(sumY2, isinf(transvalues[4]) || isinf(newvalY), true);
-	sumXY += newvalX * newvalY;
-	check_float8_val(sumXY, isinf(transvalues[5]) || isinf(newvalX) ||
-					 isinf(newvalY), true);
+	Sx += newvalX;
+	Sy += newvalY;
+	if (transvalues[0] > 0.0)
+	{
+		tmpX = newvalX * N - Sx;
+		tmpY = newvalY * N - Sy;
+		scale = 1.0 / (N * transvalues[0]);
+		Sxx += tmpX * tmpX * scale;
+		Syy += tmpY * tmpY * scale;
+		Sxy += tmpX * tmpY * scale;
+
+		/*
+		 * Overflow check.  We only report an overflow error when finite
+		 * inputs lead to infinite results.  Note also that Sxx, Syy and Sxy
+		 * should be NaN if any of the relevant inputs are infinite, so we
+		 * intentionally prevent them from becoming infinite.
+		 */
+		if (isinf(Sx) || isinf(Sxx) || isinf(Sy) || isinf(Syy) || isinf(Sxy))
+		{
+			if (((isinf(Sx) || isinf(Sxx)) &&
+				 !isinf(transvalues[1]) && !isinf(newvalX)) ||
+				((isinf(Sy) || isinf(Syy)) &&
+				 !isinf(transvalues[3]) && !isinf(newvalY)) ||
+				(isinf(Sxy) &&
+				 !isinf(transvalues[1]) && !isinf(newvalX) &&
+				 !isinf(transvalues[3]) && !isinf(newvalY)))
+				ereport(ERROR,
+						(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+						 errmsg("value out of range: overflow")));
+
+			if (isinf(Sxx))
+				Sxx = get_float8_nan();
+			if (isinf(Syy))
+				Syy = get_float8_nan();
+			if (isinf(Sxy))
+				Sxy = get_float8_nan();
+		}
+	}
 
 	/*
 	 * If we're invoked as an aggregate, we can cheat and modify our first
@@ -2730,11 +2909,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 	if (AggCheckCallContext(fcinfo, NULL))
 	{
 		transvalues[0] = N;
-		transvalues[1] = sumX;
-		transvalues[2] = sumX2;
-		transvalues[3] = sumY;
-		transvalues[4] = sumY2;
-		transvalues[5] = sumXY;
+		transvalues[1] = Sx;
+		transvalues[2] = Sxx;
+		transvalues[3] = Sy;
+		transvalues[4] = Syy;
+		transvalues[5] = Sxy;
 
 		PG_RETURN_ARRAYTYPE_P(transarray);
 	}
@@ -2744,11 +2923,11 @@ float8_regr_accum(PG_FUNCTION_ARGS)
 		ArrayType  *result;
 
 		transdatums[0] = Float8GetDatumFast(N);
-		transdatums[1] = Float8GetDatumFast(sumX);
-		transdatums[2] = Float8GetDatumFast(sumX2);
-		transdatums[3] = Float8GetDatumFast(sumY);
-		transdatums[4] = Float8GetDatumFast(sumY2);
-		transdatums[5] = Float8GetDatumFast(sumXY);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+		transdatums[3] = Float8GetDatumFast(Sy);
+		transdatums[4] = Float8GetDatumFast(Syy);
+		transdatums[5] = Float8GetDatumFast(Sxy);
 
 		result = construct_array(transdatums, 6,
 								 FLOAT8OID,
@@ -2773,21 +2952,127 @@ float8_regr_combine(PG_FUNCTION_ARGS)
 	ArrayType  *transarray2 = PG_GETARG_ARRAYTYPE_P(1);
 	float8	   *transvalues1;
 	float8	   *transvalues2;
-
-	if (!AggCheckCallContext(fcinfo, NULL))
-		elog(ERROR, "aggregate function called in non-aggregate context");
+	float8		N1,
+				Sx1,
+				Sxx1,
+				Sy1,
+				Syy1,
+				Sxy1,
+				N2,
+				Sx2,
+				Sxx2,
+				Sy2,
+				Syy2,
+				Sxy2,
+				tmp1,
+				tmp2,
+				N,
+				Sx,
+				Sxx,
+				Sy,
+				Syy,
+				Sxy;
 
 	transvalues1 = check_float8_array(transarray1, "float8_regr_combine", 6);
 	transvalues2 = check_float8_array(transarray2, "float8_regr_combine", 6);
 
-	transvalues1[0] = transvalues1[0] + transvalues2[0];
-	transvalues1[1] = float8_pl(transvalues1[1], transvalues2[1]);
-	transvalues1[2] = float8_pl(transvalues1[2], transvalues2[2]);
-	transvalues1[3] = float8_pl(transvalues1[3], transvalues2[3]);
-	transvalues1[4] = float8_pl(transvalues1[4], transvalues2[4]);
-	transvalues1[5] = float8_pl(transvalues1[5], transvalues2[5]);
+	N1 = transvalues1[0];
+	Sx1 = transvalues1[1];
+	Sxx1 = transvalues1[2];
+	Sy1 = transvalues1[3];
+	Syy1 = transvalues1[4];
+	Sxy1 = transvalues1[5];
 
-	PG_RETURN_ARRAYTYPE_P(transarray1);
+	N2 = transvalues2[0];
+	Sx2 = transvalues2[1];
+	Sxx2 = transvalues2[2];
+	Sy2 = transvalues2[3];
+	Syy2 = transvalues2[4];
+	Sxy2 = transvalues2[5];
+
+	/*--------------------
+	 * The transition values combine using a generalization of the
+	 * Youngs-Cramer algorithm as follows:
+	 *
+	 *	N = N1 + N2
+	 *	Sx = Sx1 + Sx2
+	 *	Sxx = Sxx1 + Sxx2 + N1 * N2 * (Sx1/N1 - Sx2/N2)^2 / N
+	 *	Sy = Sy1 + Sy2
+	 *	Syy = Syy1 + Syy2 + N1 * N2 * (Sy1/N1 - Sy2/N2)^2 / N
+	 *	Sxy = Sxy1 + Sxy2 + N1 * N2 * (Sx1/N1 - Sx2/N2) * (Sy1/N1 - Sy2/N2) / N
+	 *
+	 * It's worth handling the special cases N1 = 0 and N2 = 0 separately
+	 * since those cases are trivial, and we then don't need to worry about
+	 * division-by-zero errors in the general case.
+	 *--------------------
+	 */
+	if (N1 == 0.0)
+	{
+		N = N2;
+		Sx = Sx2;
+		Sxx = Sxx2;
+		Sy = Sy2;
+		Syy = Syy2;
+		Sxy = Sxy2;
+	}
+	else if (N2 == 0.0)
+	{
+		N = N1;
+		Sx = Sx1;
+		Sxx = Sxx1;
+		Sy = Sy1;
+		Syy = Syy1;
+		Sxy = Sxy1;
+	}
+	else
+	{
+		N = N1 + N2;
+		Sx = float8_pl(Sx1, Sx2);
+		tmp1 = Sx1 / N1 - Sx2 / N2;
+		Sxx = Sxx1 + Sxx2 + N1 * N2 * tmp1 * tmp1 / N;
+		check_float8_val(Sxx, isinf(Sxx1) || isinf(Sxx2), true);
+		Sy = float8_pl(Sy1, Sy2);
+		tmp2 = Sy1 / N1 - Sy2 / N2;
+		Syy = Syy1 + Syy2 + N1 * N2 * tmp2 * tmp2 / N;
+		check_float8_val(Syy, isinf(Syy1) || isinf(Syy2), true);
+		Sxy = Sxy1 + Sxy2 + N1 * N2 * tmp1 * tmp2 / N;
+		check_float8_val(Sxy, isinf(Sxy1) || isinf(Sxy2), true);
+	}
+
+	/*
+	 * If we're invoked as an aggregate, we can cheat and modify our first
+	 * parameter in-place to reduce palloc overhead. Otherwise we construct a
+	 * new array with the updated transition data and return it.
+	 */
+	if (AggCheckCallContext(fcinfo, NULL))
+	{
+		transvalues1[0] = N;
+		transvalues1[1] = Sx;
+		transvalues1[2] = Sxx;
+		transvalues1[3] = Sy;
+		transvalues1[4] = Syy;
+		transvalues1[5] = Sxy;
+
+		PG_RETURN_ARRAYTYPE_P(transarray1);
+	}
+	else
+	{
+		Datum		transdatums[6];
+		ArrayType  *result;
+
+		transdatums[0] = Float8GetDatumFast(N);
+		transdatums[1] = Float8GetDatumFast(Sx);
+		transdatums[2] = Float8GetDatumFast(Sxx);
+		transdatums[3] = Float8GetDatumFast(Sy);
+		transdatums[4] = Float8GetDatumFast(Syy);
+		transdatums[5] = Float8GetDatumFast(Sxy);
+
+		result = construct_array(transdatums, 6,
+								 FLOAT8OID,
+								 sizeof(float8), FLOAT8PASSBYVAL, 'd');
+
+		PG_RETURN_ARRAYTYPE_P(result);
+	}
 }
 
 
@@ -2797,27 +3082,19 @@ float8_regr_sxx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				numerator;
+				Sxx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
+	Sxx = transvalues[2];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumX2 - sumX * sumX;
-	check_float8_val(numerator, isinf(sumX2) || isinf(sumX), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
-		PG_RETURN_FLOAT8(0.0);
+	/* Note that Sxx is guaranteed to be non-negative */
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxx);
 }
 
 Datum
@@ -2826,27 +3103,19 @@ float8_regr_syy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY,
-				sumY2,
-				numerator;
+				Syy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_syy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
+	Syy = transvalues[4];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumY2 - sumY * sumY;
-	check_float8_val(numerator, isinf(sumY2) || isinf(sumY), true);
-
-	/* Watch out for roundoff error producing a negative numerator */
-	if (numerator <= 0.0)
-		PG_RETURN_FLOAT8(0.0);
+	/* Note that Syy is guaranteed to be non-negative */
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Syy);
 }
 
 Datum
@@ -2855,28 +3124,19 @@ float8_regr_sxy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_sxy", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
 	/* A negative result is valid here */
 
-	PG_RETURN_FLOAT8(numerator / N);
+	PG_RETURN_FLOAT8(Sxy);
 }
 
 Datum
@@ -2885,17 +3145,17 @@ float8_regr_avgx(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX;
+				Sx;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgx", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
+	Sx = transvalues[1];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumX / N);
+	PG_RETURN_FLOAT8(Sx / N);
 }
 
 Datum
@@ -2904,17 +3164,17 @@ float8_regr_avgy(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumY;
+				Sy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_avgy", 6);
 	N = transvalues[0];
-	sumY = transvalues[3];
+	Sy = transvalues[3];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(sumY / N);
+	PG_RETURN_FLOAT8(Sy / N);
 }
 
 Datum
@@ -2923,26 +3183,17 @@ float8_covar_pop(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_pop", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * N));
+	PG_RETURN_FLOAT8(Sxy / N);
 }
 
 Datum
@@ -2951,26 +3202,17 @@ float8_covar_samp(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumY,
-				sumXY,
-				numerator;
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_covar_samp", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxy = transvalues[5];
 
 	/* if N is <= 1 we should return NULL */
 	if (N < 2.0)
 		PG_RETURN_NULL();
 
-	numerator = N * sumXY - sumX * sumY;
-	check_float8_val(numerator, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-
-	PG_RETURN_FLOAT8(numerator / (N * (N - 1.0)));
+	PG_RETURN_FLOAT8(Sxy / (N - 1.0));
 }
 
 Datum
@@ -2979,38 +3221,27 @@ float8_corr(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_corr", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0 || numeratorY <= 0)
+	/* Note that Sxx and Syy are guaranteed to be non-negative */
+
+	/* per spec, return NULL for horizontal and vertical lines */
+	if (Sxx == 0 || Syy == 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / sqrt(numeratorX * numeratorY));
+	PG_RETURN_FLOAT8(Sxy / sqrt(Sxx * Syy));
 }
 
 Datum
@@ -3019,42 +3250,31 @@ float8_regr_r2(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumY2,
-				sumXY,
-				numeratorX,
-				numeratorY,
-				numeratorXY;
+				Sxx,
+				Syy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_r2", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumY2 = transvalues[4];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Syy = transvalues[4];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorY = N * sumY2 - sumY * sumY;
-	check_float8_val(numeratorY, isinf(sumY2) || isinf(sumY), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	/* Note that Sxx and Syy are guaranteed to be non-negative */
+
+	/* per spec, return NULL for a vertical line */
+	if (Sxx == 0)
 		PG_RETURN_NULL();
-	/* per spec, horizontal line produces 1.0 */
-	if (numeratorY <= 0)
+
+	/* per spec, return 1.0 for a horizontal line */
+	if (Syy == 0)
 		PG_RETURN_FLOAT8(1.0);
 
-	PG_RETURN_FLOAT8((numeratorXY * numeratorXY) /
-					 (numeratorX * numeratorY));
+	PG_RETURN_FLOAT8((Sxy * Sxy) / (Sxx * Syy));
 }
 
 Datum
@@ -3063,33 +3283,25 @@ float8_regr_slope(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXY;
+				Sxx,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_slope", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sxx = transvalues[2];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXY = N * sumXY - sumX * sumY;
-	check_float8_val(numeratorXY, isinf(sumXY) || isinf(sumX) ||
-					 isinf(sumY), true);
-	if (numeratorX <= 0)
+	/* Note that Sxx is guaranteed to be non-negative */
+
+	/* per spec, return NULL for a vertical line */
+	if (Sxx == 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXY / numeratorX);
+	PG_RETURN_FLOAT8(Sxy / Sxx);
 }
 
 Datum
@@ -3098,33 +3310,29 @@ float8_regr_intercept(PG_FUNCTION_ARGS)
 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
 	float8	   *transvalues;
 	float8		N,
-				sumX,
-				sumX2,
-				sumY,
-				sumXY,
-				numeratorX,
-				numeratorXXY;
+				Sx,
+				Sxx,
+				Sy,
+				Sxy;
 
 	transvalues = check_float8_array(transarray, "float8_regr_intercept", 6);
 	N = transvalues[0];
-	sumX = transvalues[1];
-	sumX2 = transvalues[2];
-	sumY = transvalues[3];
-	sumXY = transvalues[5];
+	Sx = transvalues[1];
+	Sxx = transvalues[2];
+	Sy = transvalues[3];
+	Sxy = transvalues[5];
 
 	/* if N is 0 we should return NULL */
 	if (N < 1.0)
 		PG_RETURN_NULL();
 
-	numeratorX = N * sumX2 - sumX * sumX;
-	check_float8_val(numeratorX, isinf(sumX2) || isinf(sumX), true);
-	numeratorXXY = sumY * sumX2 - sumX * sumXY;
-	check_float8_val(numeratorXXY, isinf(sumY) || isinf(sumX2) ||
-					 isinf(sumX) || isinf(sumXY), true);
-	if (numeratorX <= 0)
+	/* Note that Sxx is guaranteed to be non-negative */
+
+	/* per spec, return NULL for a vertical line */
+	if (Sxx == 0)
 		PG_RETURN_NULL();
 
-	PG_RETURN_FLOAT8(numeratorXXY / numeratorX);
+	PG_RETURN_FLOAT8((Sy - Sx * Sxy / Sxx) / N);
 }
 
 
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
new file mode 100644
index 5e21622..717e965
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -198,6 +198,50 @@ select avg('NaN'::numeric) from generate
  NaN
 (1 row)
 
+-- verify correct results for infinite inputs
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('1'), ('infinity')) v(x);
+   avg    | var_pop 
+----------+---------
+ Infinity |     NaN
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('1')) v(x);
+   avg    | var_pop 
+----------+---------
+ Infinity |     NaN
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('infinity')) v(x);
+   avg    | var_pop 
+----------+---------
+ Infinity |     NaN
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('-infinity'), ('infinity')) v(x);
+ avg | var_pop 
+-----+---------
+ NaN |     NaN
+(1 row)
+
+-- test accuracy with a large input offset
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x);
+    avg    | var_pop 
+-----------+---------
+ 100000005 |     2.5
+(1 row)
+
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (7000000000005), (7000000000007)) v(x);
+      avg      | var_pop 
+---------------+---------
+ 7000000000006 |       1
+(1 row)
+
 -- SQL2003 binary aggregates
 SELECT regr_count(b, a) FROM aggtest;
  regr_count 
@@ -253,6 +297,90 @@ SELECT corr(b, a) FROM aggtest;
  0.139634516517873
 (1 row)
 
+-- test accum and combine functions directly
+CREATE TABLE regr_test (x float8, y float8);
+INSERT INTO regr_test VALUES (10,150),(20,250),(30,350),(80,540),(100,200);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30,80);
+ count | sum | regr_sxx | sum  | regr_syy | regr_sxy 
+-------+-----+----------+------+----------+----------
+     4 | 140 |     2900 | 1290 |    83075 |    15050
+(1 row)
+
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test;
+ count | sum | regr_sxx | sum  | regr_syy | regr_sxy 
+-------+-----+----------+------+----------+----------
+     5 | 240 |     6280 | 1490 |    95080 |     8680
+(1 row)
+
+SELECT float8_accum('{4,140,2900}'::float8[], 100);
+ float8_accum 
+--------------
+ {5,240,6280}
+(1 row)
+
+SELECT float8_regr_accum('{4,140,2900,1290,83075,15050}'::float8[], 200, 100);
+      float8_regr_accum       
+------------------------------
+ {5,240,6280,1490,95080,8680}
+(1 row)
+
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30);
+ count | sum | regr_sxx | sum | regr_syy | regr_sxy 
+-------+-----+----------+-----+----------+----------
+     3 |  60 |      200 | 750 |    20000 |     2000
+(1 row)
+
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (80,100);
+ count | sum | regr_sxx | sum | regr_syy | regr_sxy 
+-------+-----+----------+-----+----------+----------
+     2 | 180 |      200 | 740 |    57800 |    -3400
+(1 row)
+
+SELECT float8_combine('{3,60,200}'::float8[], '{0,0,0}'::float8[]);
+ float8_combine 
+----------------
+ {3,60,200}
+(1 row)
+
+SELECT float8_combine('{0,0,0}'::float8[], '{2,180,200}'::float8[]);
+ float8_combine 
+----------------
+ {2,180,200}
+(1 row)
+
+SELECT float8_combine('{3,60,200}'::float8[], '{2,180,200}'::float8[]);
+ float8_combine 
+----------------
+ {5,240,6280}
+(1 row)
+
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{0,0,0,0,0,0}'::float8[]);
+    float8_regr_combine    
+---------------------------
+ {3,60,200,750,20000,2000}
+(1 row)
+
+SELECT float8_regr_combine('{0,0,0,0,0,0}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+     float8_regr_combine     
+-----------------------------
+ {2,180,200,740,57800,-3400}
+(1 row)
+
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+     float8_regr_combine      
+------------------------------
+ {5,240,6280,1490,95080,8680}
+(1 row)
+
+DROP TABLE regr_test;
+-- test count, distinct
 SELECT count(four) AS cnt_1000 FROM onek;
  cnt_1000 
 ----------
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
new file mode 100644
index 7e77467..83a5492
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -51,6 +51,22 @@ select avg(null::float8) from generate_s
 select sum('NaN'::numeric) from generate_series(1,3);
 select avg('NaN'::numeric) from generate_series(1,3);
 
+-- verify correct results for infinite inputs
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('1'), ('infinity')) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('1')) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('infinity'), ('infinity')) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES ('-infinity'), ('infinity')) v(x);
+
+-- test accuracy with a large input offset
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (100000003), (100000004), (100000006), (100000007)) v(x);
+SELECT avg(x::float8), var_pop(x::float8)
+FROM (VALUES (7000000000005), (7000000000007)) v(x);
+
 -- SQL2003 binary aggregates
 SELECT regr_count(b, a) FROM aggtest;
 SELECT regr_sxx(b, a) FROM aggtest;
@@ -62,6 +78,31 @@ SELECT regr_slope(b, a), regr_intercept(
 SELECT covar_pop(b, a), covar_samp(b, a) FROM aggtest;
 SELECT corr(b, a) FROM aggtest;
 
+-- test accum and combine functions directly
+CREATE TABLE regr_test (x float8, y float8);
+INSERT INTO regr_test VALUES (10,150),(20,250),(30,350),(80,540),(100,200);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30,80);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test;
+SELECT float8_accum('{4,140,2900}'::float8[], 100);
+SELECT float8_regr_accum('{4,140,2900,1290,83075,15050}'::float8[], 200, 100);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (10,20,30);
+SELECT count(*), sum(x), regr_sxx(y,x), sum(y),regr_syy(y,x), regr_sxy(y,x)
+FROM regr_test WHERE x IN (80,100);
+SELECT float8_combine('{3,60,200}'::float8[], '{0,0,0}'::float8[]);
+SELECT float8_combine('{0,0,0}'::float8[], '{2,180,200}'::float8[]);
+SELECT float8_combine('{3,60,200}'::float8[], '{2,180,200}'::float8[]);
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{0,0,0,0,0,0}'::float8[]);
+SELECT float8_regr_combine('{0,0,0,0,0,0}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+SELECT float8_regr_combine('{3,60,200,750,20000,2000}'::float8[],
+                           '{2,180,200,740,57800,-3400}'::float8[]);
+DROP TABLE regr_test;
+
+-- test count, distinct
 SELECT count(four) AS cnt_1000 FROM onek;
 SELECT count(DISTINCT four) AS cnt_4 FROM onek;
 
#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#9)
Re: BUG #15307: Low numerical precision of (Co-) Variance

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

The next question is whether or not to back-patch this. Although this
was reported as a bug, my inclination is to apply this to HEAD only,
based on the lack of prior complaints. That also matches our decision
for other similar patches, e.g., 7d9a4737c2.

+1 for no back-patch. Even though the new results are hopefully more
accurate, they're still different from before, and that might cause
issues for somebody if it happens in a stable branch.

regards, tom lane

#11Madeleine Thompson
madeleineth@gmail.com
In reply to: Dean Rasheed (#9)
Re: BUG #15307: Low numerical precision of (Co-) Variance

On Wed, Oct 3, 2018 at 9:04 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

On Thu, 27 Sep 2018 at 14:22, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

I'll post an updated patch in a while.

I finally got round to looking at this again. Here is an update, based
on the review comments.

The next question is whether or not to back-patch this. Although this
was reported as a bug, my inclination is to apply this to HEAD only,
based on the lack of prior complaints. That also matches our decision
for other similar patches, e.g., 7d9a4737c2.

This diff looks good to me. Also, it applies cleanly against
abd9ca377d669a6e0560e854d7e987438d0e612e and passes `make
check-world`.

I agree that this is not suitable for a patch release.

Madeleine

#12Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Madeleine Thompson (#11)
Re: BUG #15307: Low numerical precision of (Co-) Variance

On Wed, 3 Oct 2018 at 15:58, Madeleine Thompson <madeleineth@gmail.com> wrote:

This diff looks good to me. Also, it applies cleanly against
abd9ca377d669a6e0560e854d7e987438d0e612e and passes `make
check-world`.

I agree that this is not suitable for a patch release.

Pushed to master. Thanks for the review.

Regards,
Dean