Add PRODUCT() aggregate function

Started by Jeevan Chalke1 day ago6 messageshackers

jeevan.chalke@enterprisedb.com

1 day ago

Hello Hackers,

PostgreSQL has aggregates for summing a set of values (sum()) but not for
multiplying them. Computing the product of a column is a fairly common
request -- e.g. compounding growth/return factors, combining independent
probabilities, computing factorial-like or geometric quantities -- and today
it requires either a custom aggregate or the exp(sum(ln(...))) trick, which
does not work for zero or negative inputs and loses precision.

This idea was proposed by Peter Eisentraut, and the attached patch
implements
a built-in PRODUCT() aggregate.

*What it does*
-----

PRODUCT() returns the product of all non-null input values. It is defined
for
int2, int4, int8, float4, float8 and numeric input, and always returns
numeric.

A few examples:

CREATE TABLE t (g int, v int);
INSERT INTO t VALUES (1,2),(1,3),(1,4),(2,5),(2,-6),(2,0);

SELECT product(v) FROM t;
product
---------
0

SELECT g, product(v) FROM t WHERE v <> 0 GROUP BY g ORDER BY g;
g | product
---+---------
1 | 24
2 | -30

Like sum(), PRODUCT() ignores NULL inputs and returns NULL for an empty
input
set (or a group consisting only of NULLs).

*Design / implementation notes*
-----

* The result type and the internal transition state are both numeric,
regardless of the input type. Using numeric for the running product
avoids
overflow in the intermediate state for the integer and floating-point
variants, where a product grows much faster than a sum. (A sufficiently
large product can of course still overflow numeric and raise an error.)

* For numeric input, the transition and combine functions are simply the
existing numeric_mul(). For the other input types, small non-strict
transition functions (int2_product_accum, int4_product_accum,
int8_product_accum, float4_product_accum, float8_product_accum) promote
the
input to numeric and then call numeric_mul().

* PRODUCT() supports Partial Mode (parallel aggregation), using
numeric_mul()
as the combine function. Because the transition type is numeric rather
than
internal, no serialization/deserialization functions are needed.

* No inverse transition (moving-aggregate) function is provided. An inverse
for a product would require division, which is unreliable or undefined
when
any input is zero (and lossy in general), so as a window aggregate over a
moving frame PRODUCT() falls back to recomputing the frame.

*Open questions*
-----

* Naming. I went with PRODUCT(); other systems and discussions have used
names like PROD or MUL. Happy to change it if there is a consensus.

* Return type. Always returning numeric is the safe choice for overflow,
but
it does mean product(double precision) returns numeric rather than a float
.
An alternative would be to return float8 for the floating-point inputs. I
leaned towards numeric for consistency and to avoid overflow surprises;
feedback welcome.

* Type coverage. The patch covers the standard numeric input types. money
and interval were intentionally left out, since a product of those types
has
no clear meaning.

*Testing / docs*
-----

The patch adds regression tests for all input types and the relevant edge
cases
(NULLs, DISTINCT, FILTER, zero/negative inputs, Infinity/NaN, overflow,
parallel
aggregation, and window usage), along with documentation updates.

Thoughts and review feedback are very welcome.

Thanks

--
*Jeevan Chalke*
*Senior Principal Engineer, Engineering Manager*
*Product Development*

enterprisedb.com <https://www.enterprisedb.com>

Dean Rasheed

dean.a.rasheed@gmail.com

1 day ago

In reply to: Jeevan Chalke (#1)

Re: Add PRODUCT() aggregate function

On Tue, 23 Jun 2026 at 08:49, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:

PRODUCT() returns the product of all non-null input values. It is defined for
int2, int4, int8, float4, float8 and numeric input, and always returns numeric.

I don't think that you need to define it for all those types. I
suspect that you could just define it for numeric and float8, and let
implicit casting do the rest.

Regards,
Dean

Dean Rasheed

dean.a.rasheed@gmail.com

1 day ago

In reply to: Dean Rasheed (#2)

Re: Add PRODUCT() aggregate function

On Tue, 23 Jun 2026 at 09:37, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

On Tue, 23 Jun 2026 at 08:49, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:

PRODUCT() returns the product of all non-null input values. It is defined for
int2, int4, int8, float4, float8 and numeric input, and always returns numeric.

I don't think that you need to define it for all those types. I
suspect that you could just define it for numeric and float8, and let
implicit casting do the rest.

... and perhaps make the float8 version return float8.

Regards,
Dean

Jim Jones

jim.jones@uni-muenster.de

1 day ago

In reply to: Dean Rasheed (#2)

Re: Add PRODUCT() aggregate function

Hi Jeevan

On 23/06/2026 10:37, Dean Rasheed wrote:

On Tue, 23 Jun 2026 at 08:49, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:

PRODUCT() returns the product of all non-null input values. It is defined for
int2, int4, int8, float4, float8 and numeric input, and always returns numeric.

I don't think that you need to define it for all those types. I
suspect that you could just define it for numeric and float8, and let
implicit casting do the rest.

I've tested the patch in many different scenarios and all results look
fine -- valgrind also didn't report anything :)

The test coverage is comprehensive! For the sake of completeness I'd add
numeric tests for NaN and Infitinty with positive numeric values in the
set, e.g:

postgres=# WITH j (v) AS (VALUES
('NaN'::numeric),('Infinity'::numeric),(3.14))
SELECT product(v) FROM j;
product
---------
NaN
(1 row)

Other than that and the point mentioned by Dean I have nothing to add at
this point.

Thanks for the patch.

Best, Jim

Jeevan Chalke

jeevan.chalke@enterprisedb.com

1 day ago

In reply to: Dean Rasheed (#3)

Re: Add PRODUCT() aggregate function

On Tue, Jun 23, 2026 at 2:13 PM Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:

On Tue, 23 Jun 2026 at 09:37, Dean Rasheed <dean.a.rasheed@gmail.com>
wrote:

On Tue, 23 Jun 2026 at 08:49, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:

PRODUCT() returns the product of all non-null input values. It is

defined for

int2, int4, int8, float4, float8 and numeric input, and always returns

numeric.

I don't think that you need to define it for all those types. I
suspect that you could just define it for numeric and float8, and let
implicit casting do the rest.

Thank you, Dean, for looking at this.

I appreciate the suggestion, but I don't think the implicit-casting
approach
works well here, for the following reasons:

1. The integer types (int2/int4/int8) have implicit casts to both float8
and
numeric. Since float8 is the preferred type in the numeric type category,
the function resolution machinery would select the product(float8) variant.
That has two consequences: an extra cast function has to be executed per
row,
and, more importantly, integer inputs would be accumulated as float8.
For a large product that yields a lossy result in exponent form, whereas
accumulating in numeric gives an exact answer.

2. This approach is also consistent with the existing aggregates —
sum(), avg(), min()/max(), etc. all define per-type variants rather than
relying on implicit input casting. The patch follows that established
pattern rather than introducing a new one.

3. There is also a small performance penalty to the casting approach:
the per-row execution of the cast function itself, in addition to the
resolution issue noted in (1).

I ran a quick test to demonstrate point (1). For large products, the float8
implementation returns a lossy, exponential result like 9.9999970000003e+20,

whereas numeric returns the exact value of 999999700000029999999.

I would be happy to share the test script if it's helpful.

... and perhaps make the float8 version return float8.

Yes, I already noted this under "Open questions" in my first email. We can
certainly make the float4 and float8 variants return float8 instead of
numeric. Let's see what others think before making that adjustment.

Thanks

Regards,
Dean

--
*Jeevan Chalke*
*Senior Principal Engineer, Engineering Manager*
*Product Development*

enterprisedb.com <https://www.enterprisedb.com>

Jeevan Chalke

jeevan.chalke@enterprisedb.com

1 day ago

In reply to: Jim Jones (#4)

Re: Add PRODUCT() aggregate function

On Tue, Jun 23, 2026 at 4:32 PM Jim Jones <jim.jones@uni-muenster.de> wrote:

Hi Jeevan

On 23/06/2026 10:37, Dean Rasheed wrote:

On Tue, 23 Jun 2026 at 08:49, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:

PRODUCT() returns the product of all non-null input values. It is

defined for

int2, int4, int8, float4, float8 and numeric input, and always returns

numeric.

I don't think that you need to define it for all those types. I
suspect that you could just define it for numeric and float8, and let
implicit casting do the rest.

+1

I've tested the patch in many different scenarios and all results look
fine -- valgrind also didn't report anything :)

The test coverage is comprehensive! For the sake of completeness I'd add
numeric tests for NaN and Infitinty with positive numeric values in the
set, e.g:

postgres=# WITH j (v) AS (VALUES
('NaN'::numeric),('Infinity'::numeric),(3.14))
SELECT product(v) FROM j;
product
---------
NaN
(1 row)

Other than that and the point mentioned by Dean I have nothing to add at
this point.

Thanks, Jim, for the thorough testing.

I'll include that test case in the next version of the patch.

Thanks for the patch.

Best, Jim

--
*Jeevan Chalke*
*Senior Principal Engineer, Engineering Manager*
*Product Development*

enterprisedb.com <https://www.enterprisedb.com>

Add PRODUCT() aggregate function

Attachments: