Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Started by David Rowley3 months ago12 messages

dgrowleyml@gmail.com

3 months ago

1 attachment(s)

Since e2debb643, we've had the ability to determine if a column is
NULLable as early as during constant folding. This seems like a good
time to consider converting COUNT(not_null_col) into COUNT(*), which
is faster and may result in far fewer columns being deformed from the
tuple.

To make this work, I invented "SupportRequestSimplifyAggref", which is
similar to the existing SupportRequestSimplify, which is for
FuncExprs. Aggregates use Aggrefs, so we need something else.

It's easy to see that count(*) is faster. Here's a quick test in an
unpatched master:

create table t (a int, b int, c int, d int, e int, f int, g int, h int
not null);
insert into t (h) select 1 from generate_Series(1,1000000);
vacuum freeze t;

master:
select count(h) from t;
Time: 16.442 ms
Time: 16.255 ms
Time: 16.322 ms

master:
select count(*) from t;
Time: 12.203 ms
Time: 11.402 ms
Time: 12.054 ms (+37%)

With the patch applied, both queries will perform the same.

It may be possible to apply transformations to other aggregate
functions too, but I don't want to discuss that here. I mostly want to
determine if the infrastructure is ok and do the count(*) one because
it seems like the most likely one to be useful.

One thing I wasn't too sure about was if we should make it possible
for the support function to return something that's not an Aggref. In
theory, something like COUNT(NULL) could just return '0'::bigint.
While that does seem an optimisation that wouldn't be applied very
often, I have opted to leave it so that such an optimisation *could*
be done by the support function. I also happen to test that that
doesn't entirely break the query, as ordinarily it would if we didn't
have Query.hasAggs (It's not too dissimilar to removing unused columns
from a subquery)

Should we do this?

David

Attachments:

v1-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchapplication/octet-stream; name=v1-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchDownload

From 9f3d7cdd1ab367ad0c7f0c24caea9fb463d1fc71 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Sat, 25 Oct 2025 09:12:53 +1300
Subject: [PATCH v1] Have the planner replace COUNT(ANY) with COUNT(*), when
 possible

This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport
functions to receive an Aggref and allow them to determine if there is a
way that the Aggref call can be optimized.

Also added is a support function to allow transformation of COUNT(ANY)
into COUNT(*).  This is possible to do when the given "ANY" cannot be
NULL and also that there are no ORDER BY / DISTINCT clauses within the
Aggref.  This is a useful transformation to do as it is common that
people write COUNT(1), which until now has added unneeded overhead.
When counting a NOT NULL column, the overheads can be worse as that
might mean deforming more of the tuple, which for large fact tables may
be many columns in.

It may be possible to add prosupport functions for other aggregates.  We
could consider if ORDER BY could be dropped for some calls, e.g. the
ORDER BY is quite useless in MAX(c ORDER BY c).

There is a little bit of passing fallout from adjusting
expr_is_nonnullable() to handle Const which results in a plan change in
the aggregates.out regression test.  Previously, nothing was able to
determine that "One-Time Filter: (100 IS NOT NULL)" was always true,
therefore useless to include in the plan.
---
 .../postgres_fdw/expected/postgres_fdw.out    |  20 +--
 src/backend/optimizer/plan/initsplan.c        |  20 +--
 src/backend/optimizer/util/clauses.c          |  76 +++++++++++
 src/backend/utils/adt/int8.c                  |  68 +++++++++-
 src/include/nodes/supportnodes.h              |  25 ++++
 src/include/optimizer/optimizer.h             |   3 +
 src/test/regress/expected/aggregates.out      | 125 +++++++++++++++++-
 src/test/regress/sql/aggregates.sql           |  44 +++++-
 src/tools/pgindent/typedefs.list              |   1 +
 9 files changed, 346 insertions(+), 36 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index cd28126049d..48e3185b227 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2975,9 +2975,9 @@ select sum(t1.c1), count(t2.c1) from ft1 t1 inner join ft2 t2 on (t1.c1 = t2.c1)
                                                          QUERY PLAN                                                         
 ----------------------------------------------------------------------------------------------------------------------------
  Aggregate
-   Output: sum(t1.c1), count(t2.c1)
+   Output: sum(t1.c1), count(*)
    ->  Foreign Scan
-         Output: t1.c1, t2.c1
+         Output: t1.c1
          Filter: (((((t1.c1 * t2.c1) / (t1.c1 * t2.c1)))::double precision * random()) <= '1'::double precision)
          Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
          Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1"))))
@@ -3073,12 +3073,12 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-                                                 QUERY PLAN                                                 
-------------------------------------------------------------------------------------------------------------
+                                                QUERY PLAN                                                 
+-----------------------------------------------------------------------------------------------------------
  Foreign Scan
-   Output: (count(c2)), c2, 5, 7.0, 9
+   Output: (count(*)), c2, 5, 7.0, 9
    Relations: Aggregate on (public.ft1)
-   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
+   Remote SQL: SELECT count(*), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
 (4 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
@@ -3379,8 +3379,8 @@ select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft
 -- Inner query is aggregation query
 explain (verbose, costs off)
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
-                                                                      QUERY PLAN                                                                      
-------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                    QUERY PLAN                                                                    
+--------------------------------------------------------------------------------------------------------------------------------------------------
  Unique
    Output: ((SubPlan expr_1))
    ->  Sort
@@ -3391,9 +3391,9 @@ select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) fro
                Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
                SubPlan expr_1
                  ->  Foreign Scan
-                       Output: (count(t1.c1) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
+                       Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
                        Relations: Aggregate on (public.ft1 t1)
-                       Remote SQL: SELECT count("C 1") FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
+                       Remote SQL: SELECT count(*) FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
 (13 rows)
 
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 65d473d95b6..671c5cde8fc 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -3413,22 +3413,6 @@ add_base_clause_to_rel(PlannerInfo *root, Index relid,
 										 restrictinfo->security_level);
 }
 
-/*
- * expr_is_nonnullable
- *	  Check to see if the Expr cannot be NULL
- *
- * Currently we only support simple Vars.
- */
-static bool
-expr_is_nonnullable(PlannerInfo *root, Expr *expr)
-{
-	/* For now only check simple Vars */
-	if (!IsA(expr, Var))
-		return false;
-
-	return var_is_nonnullable(root, (Var *) expr, true);
-}
-
 /*
  * restriction_is_always_true
  *	  Check to see if the RestrictInfo is always true.
@@ -3465,7 +3449,7 @@ restriction_is_always_true(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
@@ -3530,7 +3514,7 @@ restriction_is_always_false(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 81d768ff2a2..ea8c6bc05ca 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -131,6 +131,8 @@ static Expr *simplify_function(Oid funcid,
 							   Oid result_collid, Oid input_collid, List **args_p,
 							   bool funcvariadic, bool process_args, bool allow_non_const,
 							   eval_const_expressions_context *context);
+static Node *simplify_aggref(Aggref *aggref,
+							 eval_const_expressions_context *context);
 static List *reorder_function_arguments(List *args, int pronargs,
 										HeapTuple func_tuple);
 static List *add_function_defaults(List *args, int pronargs,
@@ -2628,6 +2630,9 @@ eval_const_expressions_mutator(Node *node,
 				newexpr->location = expr->location;
 				return (Node *) newexpr;
 			}
+		case T_Aggref:
+			node = ece_generic_processing(node);
+			return simplify_aggref((Aggref *) node, context);
 		case T_OpExpr:
 			{
 				OpExpr	   *expr = (OpExpr *) node;
@@ -4194,6 +4199,59 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
 	return newexpr;
 }
 
+/*
+ * simplify_aggref
+ *		Call the Aggref.aggfnoid's prosupport function to allow it to
+ *		determine if simplification of the Aggref is possible.  Returns the
+ *		newly simplified node if conversion took place; otherwise, returns the
+ *		original Aggref.
+ *
+ * See SupportRequestSimplifyAggref comments in supportnodes.h for further
+ * details.
+ */
+static Node *
+simplify_aggref(Aggref *aggref, eval_const_expressions_context *context)
+{
+	HeapTuple	agg_tuple;
+	Form_pg_proc aggform;
+	Oid			prosupport;
+
+	agg_tuple = SearchSysCache1(PROCOID, ObjectIdGetDatum(aggref->aggfnoid));
+	if (!HeapTupleIsValid(agg_tuple))
+		elog(ERROR, "cache lookup failed for function %u", aggref->aggfnoid);
+	aggform = (Form_pg_proc) GETSTRUCT(agg_tuple);
+	prosupport = aggform->prosupport;
+	ReleaseSysCache(agg_tuple);
+
+	if (OidIsValid(prosupport))
+	{
+		SupportRequestSimplifyAggref req;
+		Aggref	   *newaggref;
+
+		/*
+		 * Build a SupportRequestSimplifyAggref node to pass to the support
+		 * function.
+		 */
+		req.type = T_SupportRequestSimplifyAggref;
+		req.root = context->root;
+		req.aggref = aggref;
+
+		newaggref = (Aggref *) DatumGetPointer(OidFunctionCall1(prosupport,
+																PointerGetDatum(&req)));
+
+		/*
+		 * We expect the support function to return either a new Aggref or
+		 * NULL (when simplification isn't possible).
+		 */
+		Assert(newaggref != aggref || newaggref == NULL);
+
+		if (newaggref != NULL)
+			return (Node *) newaggref;
+	}
+
+	return (Node *) aggref;
+}
+
 /*
  * var_is_nonnullable: check to see if the Var cannot be NULL
  *
@@ -4255,6 +4313,24 @@ var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info)
 	return false;
 }
 
+/*
+ * expr_is_nonnullable: check to see if the Expr cannot be NULL
+ *
+ * This is mostly a wrapper around var_is_nonnullable() but also made to
+ * handle Const.  We don't currently have the ability to handle Exprs any
+ * more complex than that.
+ */
+bool
+expr_is_nonnullable(PlannerInfo *root, Expr *expr, bool use_rel_info)
+{
+	if (IsA(expr, Var))
+		return var_is_nonnullable(root, (Var *) expr, use_rel_info);
+	if (IsA(expr, Const))
+		return !castNode(Const, expr)->constisnull;
+
+	return false;
+}
+
 /*
  * expand_function_arguments: convert named-notation args to positional args
  * and/or insert default args, as needed
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index bdea490202a..66daf4e1f3a 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -17,14 +17,18 @@
 #include <limits.h>
 #include <math.h>
 
+#include "access/htup_details.h"
+#include "catalog/pg_aggregate.h"
 #include "common/int.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/makefuncs.h" // makeConst
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
 #include "utils/builtins.h"
-
+#include "utils/fmgroids.h"
+#include "utils/syscache.h"
 
 typedef struct
 {
@@ -811,6 +815,68 @@ int8inc_support(PG_FUNCTION_ARGS)
 		PG_RETURN_POINTER(req);
 	}
 
+	if (IsA(rawreq, SupportRequestSimplifyAggref))
+	{
+		SupportRequestSimplifyAggref *req = (SupportRequestSimplifyAggref *) rawreq;
+		Aggref	   *agg = req->aggref;
+
+		/*
+		 * Check for COUNT(ANY) and try to convert to COUNT(*). The input
+		 * argument cannot be NULL, we can't have an ORDER BY / DISTINCT in
+		 * the aggregate, and agglevelsup must be 0.
+		 *
+		 * Technically COUNT(ANY) must have 1 arg, but be paranoid and check.
+		 */
+		if (agg->aggfnoid == F_COUNT_ANY && list_length(agg->args) == 1)
+		{
+			TargetEntry *tle = (TargetEntry *) linitial(agg->args);
+			Expr	   *arg = tle->expr;
+
+			/* Check for unsupported cases */
+			if (agg->aggdistinct != NIL || agg->aggorder != NIL ||
+				agg->agglevelsup != 0)
+				PG_RETURN_POINTER(NULL);
+
+#ifdef NOT_USED
+
+			/*
+			 * XXX this was just an experiment to test the query still worked
+			 * as expected.
+			 */
+			if (IsA(arg, Const) && castNode(Const, arg)->constisnull)
+				PG_RETURN_POINTER(makeConst(INT8OID,
+											-1,
+											InvalidOid,
+											sizeof(int64),
+											Int64GetDatum(0),
+											false,
+											true));
+#endif
+
+			if (expr_is_nonnullable(req->root, arg, false))
+			{
+				Aggref	   *newagg;
+
+				/* We don't expect these to have been set yet */
+				Assert(agg->aggtransno == -1);
+				Assert(agg->aggtranstype == InvalidOid);
+
+				/* Convert COUNT(ANY) to COUNT(*) by making a new Aggref */
+				newagg = makeNode(Aggref);
+				memcpy(newagg, agg, sizeof(Aggref));
+				newagg->aggfnoid = F_COUNT_;
+
+				/* count(*) has no args */
+				newagg->aggargtypes = NULL;
+				newagg->args = NULL;
+				newagg->aggstar = true;
+				newagg->location = -1;
+
+				PG_RETURN_POINTER(newagg);
+			}
+		}
+	}
+
 	PG_RETURN_POINTER(NULL);
 }
 
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
index 7b623d54058..01f6358cbaa 100644
--- a/src/include/nodes/supportnodes.h
+++ b/src/include/nodes/supportnodes.h
@@ -69,6 +69,31 @@ typedef struct SupportRequestSimplify
 	FuncExpr   *fcall;			/* Function call to be simplified */
 } SupportRequestSimplify;
 
+/*
+ * Similar to SupportRequestSimplify but for Aggref node types.
+ *
+ * The supports conversions such as swapping a COUNT(1) or COUNT(notnullcol)
+ * to COUNT(*).
+ *
+ * Supporting functions can consult 'root' and the input 'aggref'.  When the
+ * implementing support function deems the simplification is possible, it must
+ * create a new Node (probably another Aggref) and not modify the original.
+ * The newly created Node should then be returned to indicate that the
+ * conversion is to take place.  When no conversion is possible, a NULL
+ * pointer should be returned.
+ *
+ * It is important to consider that implementing support functions can receive
+ * Aggrefs with agglevelsup > 0.  Careful consideration should be given to
+ * whether the simplification is still possible at levels above 0.
+ */
+typedef struct SupportRequestSimplifyAggref
+{
+	NodeTag		type;
+
+	PlannerInfo *root;			/* Planner's infrastructure */
+	Aggref	   *aggref;			/* Aggref to be simplified */
+} SupportRequestSimplifyAggref;
+
 /*
  * The Selectivity request allows the support function to provide a
  * selectivity estimate for a function appearing at top level of a WHERE
diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h
index a34113903c0..bb3d9e277c8 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -148,6 +148,9 @@ extern Expr *evaluate_expr(Expr *expr, Oid result_type, int32 result_typmod,
 
 extern bool var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info);
 
+extern bool expr_is_nonnullable(PlannerInfo *root, Expr *expr,
+								bool use_rel_info);
+
 extern List *expand_function_arguments(List *args, bool include_out_arguments,
 									   Oid result_type,
 									   HeapTuple func_tuple);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index bc83a6e188e..be0e1573183 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1219,24 +1219,42 @@ select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
  9999 | 1
 (3 rows)
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
-                     QUERY PLAN                     
-----------------------------------------------------
+           QUERY PLAN            
+---------------------------------
+ Result
+   Replaces: MinMaxAggregate
+   InitPlan minmax_1
+     ->  Limit
+           ->  Seq Scan on tenk1
+(5 rows)
+
+select max(100) from tenk1;
+ max 
+-----
+ 100
+(1 row)
+
+explain (costs off)
+  select max(null) from tenk1;
+                        QUERY PLAN                         
+-----------------------------------------------------------
  Result
    Replaces: MinMaxAggregate
    InitPlan minmax_1
      ->  Limit
            ->  Result
-                 One-Time Filter: (100 IS NOT NULL)
+                 One-Time Filter: (NULL::text IS NOT NULL)
                  ->  Seq Scan on tenk1
 (7 rows)
 
-select max(100) from tenk1;
+select max(null) from tenk1;
  max 
 -----
- 100
+ 
 (1 row)
 
 -- try it on an inheritance tree
@@ -2821,6 +2839,101 @@ select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
  numeric
 (1 row)
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(NULL::unknown)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(nullable_col)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   Output: count(DISTINCT not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Aggregate
+   Output: count(not_null_col ORDER BY not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ HashAggregate
+   Output: a.a
+   Group Key: a.a
+   Filter: EXISTS(SubPlan exists_1)
+   ->  Seq Scan on public.agg_simplify a
+         Output: a.a, a.not_null_col, a.nullable_col
+   SubPlan exists_1
+     ->  Seq Scan on public.onek b
+           Filter: (count(a.not_null_col) = b.four)
+(9 rows)
+
+rollback;
 -- test aggregates with common transition functions share the same states
 begin work;
 create type avg_state as (total bigint, count bigint);
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 908af50def3..77ca6ffa3a9 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -416,11 +416,16 @@ explain (costs off)
   select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
 select max(100) from tenk1;
 
+explain (costs off)
+  select max(null) from tenk1;
+select max(null) from tenk1;
+
 -- try it on an inheritance tree
 create table minmaxtest(f1 int);
 create table minmaxtest1() inherits (minmaxtest);
@@ -1108,6 +1113,43 @@ select cleast_agg(4.5,f1) from int4_tbl;
 select cleast_agg(variadic array[4.5,f1]) from int4_tbl;
 select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+
+rollback;
+
 -- test aggregates with common transition functions share the same states
 begin work;
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 43fe3bcd593..281bc29ca76 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2916,6 +2916,7 @@ SupportRequestOptimizeWindowClause
 SupportRequestRows
 SupportRequestSelectivity
 SupportRequestSimplify
+SupportRequestSimplifyAggref
 SupportRequestWFuncMonotonic
 Syn
 SyncOps
-- 
2.43.0

Corey Huinker

corey.huinker@gmail.com

2 months ago

In reply to: David Rowley (#1)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

It may be possible to apply transformations to other aggregate
functions too, but I don't want to discuss that here. I mostly want to
determine if the infrastructure is ok and do the count(*) one because
it seems like the most likely one to be useful.

One thing I wasn't too sure about was if we should make it possible
for the support function to return something that's not an Aggref. In
theory, something like COUNT(NULL) could just return '0'::bigint.
While that does seem an optimisation that wouldn't be applied very
often, I have opted to leave it so that such an optimisation *could*
be done by the support function. I also happen to test that that
doesn't entirely break the query, as ordinarily it would if we didn't
have Query.hasAggs (It's not too dissimilar to removing unused columns
from a subquery)

Should we do this?

David

Automatic query improvements are a Good Thing. We're never going to educate
everyone that COUNT(1) is an anti-pattern, so it's easier to make it not an
anti-pattern.

I'm in favor of the COUNT(NULL) optimization as well, as one of my favorite
programming tropes is "There is nothing faster than nothing".

The check seems lightweight enough to me. Applies clean and tests pass.
Test coverage seems to cover all the cases.

David Rowley

dgrowleyml@gmail.com

2 months ago

In reply to: Corey Huinker (#2)

1 attachment(s)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Thu, 30 Oct 2025 at 16:40, Corey Huinker <corey.huinker@gmail.com> wrote:

I'm in favor of the COUNT(NULL) optimization as well, as one of my favorite programming tropes is "There is nothing faster than nothing".

I think it would be much more interesting to do that if it could be
detected in cases like:

SELECT COUNT(col) FROM table WHERE col IS NULL;

which might be a more realistic thing if the query without the WHERE
clause was part of a VIEW. However, we don't currently have any
infrastructure to detect when a column *is* NULL. There's only the
opposite with expr_is_nonnullable() or var_is_nonnullable().

This does make me wonder if constant-folding is too early to do this
transformation, as it currently happens before
add_base_clause_to_rel() therefore we can't really transform cases
such as:

SELECT count(nullable_col) FROM t WHERE nullable_col IS NOT NULL;

There might be a better spot in planning to do this at a point after
add_base_clause_to_rel() is called. It just has to happen before the
search for Aggrefs with the same aggtransfn in preprocess_aggref() as
it's too late to swap aggregate functions around when they've already
been grouped together with other aggs that can share the same
transition state. I'm just subtly aware about Tom's complaints with
the restriction_is_always_true() code as he thought it should go into
the constant folding code, where it mostly now is, per Richard's work
to put it there.

The check seems lightweight enough to me. Applies clean and tests pass. Test coverage seems to cover all the cases.

Thanks for having a look and testing.

I've attached a very slightly revised version of the patch. I became
aware of a function named get_func_support(), which can be used rather
than fetching the pg_proc tuple from SysCache, which I was doing in
v1. No other changes.

David

Attachments:

v2-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchapplication/octet-stream; name=v2-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchDownload

From 8087c2b73aa6fc5deba1000c22eccc9e12d3e2a6 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Sat, 25 Oct 2025 09:12:53 +1300
Subject: [PATCH v2] Have the planner replace COUNT(ANY) with COUNT(*), when
 possible

This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport
functions to receive an Aggref and allow them to determine if there is a
way that the Aggref call can be optimized.

Also added is a support function to allow transformation of COUNT(ANY)
into COUNT(*).  This is possible to do when the given "ANY" cannot be
NULL and also that there are no ORDER BY / DISTINCT clauses within the
Aggref.  This is a useful transformation to do as it is common that
people write COUNT(1), which until now has added unneeded overhead.
When counting a NOT NULL column, the overheads can be worse as that
might mean deforming more of the tuple, which for large fact tables may
be many columns in.

It may be possible to add prosupport functions for other aggregates.  We
could consider if ORDER BY could be dropped for some calls, e.g. the
ORDER BY is quite useless in MAX(c ORDER BY c).

There is a little bit of passing fallout from adjusting
expr_is_nonnullable() to handle Const which results in a plan change in
the aggregates.out regression test.  Previously, nothing was able to
determine that "One-Time Filter: (100 IS NOT NULL)" was always true,
therefore useless to include in the plan.

Discussion: https://postgr.es/m/CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |  20 +--
 src/backend/optimizer/plan/initsplan.c        |  20 +--
 src/backend/optimizer/util/clauses.c          |  67 ++++++++++
 src/backend/utils/adt/int8.c                  |  68 +++++++++-
 src/include/nodes/supportnodes.h              |  25 ++++
 src/include/optimizer/optimizer.h             |   3 +
 src/test/regress/expected/aggregates.out      | 125 +++++++++++++++++-
 src/test/regress/sql/aggregates.sql           |  44 +++++-
 src/tools/pgindent/typedefs.list              |   1 +
 9 files changed, 337 insertions(+), 36 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index cd28126049d..48e3185b227 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2975,9 +2975,9 @@ select sum(t1.c1), count(t2.c1) from ft1 t1 inner join ft2 t2 on (t1.c1 = t2.c1)
                                                          QUERY PLAN                                                         
 ----------------------------------------------------------------------------------------------------------------------------
  Aggregate
-   Output: sum(t1.c1), count(t2.c1)
+   Output: sum(t1.c1), count(*)
    ->  Foreign Scan
-         Output: t1.c1, t2.c1
+         Output: t1.c1
          Filter: (((((t1.c1 * t2.c1) / (t1.c1 * t2.c1)))::double precision * random()) <= '1'::double precision)
          Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
          Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1"))))
@@ -3073,12 +3073,12 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-                                                 QUERY PLAN                                                 
-------------------------------------------------------------------------------------------------------------
+                                                QUERY PLAN                                                 
+-----------------------------------------------------------------------------------------------------------
  Foreign Scan
-   Output: (count(c2)), c2, 5, 7.0, 9
+   Output: (count(*)), c2, 5, 7.0, 9
    Relations: Aggregate on (public.ft1)
-   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
+   Remote SQL: SELECT count(*), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
 (4 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
@@ -3379,8 +3379,8 @@ select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft
 -- Inner query is aggregation query
 explain (verbose, costs off)
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
-                                                                      QUERY PLAN                                                                      
-------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                    QUERY PLAN                                                                    
+--------------------------------------------------------------------------------------------------------------------------------------------------
  Unique
    Output: ((SubPlan expr_1))
    ->  Sort
@@ -3391,9 +3391,9 @@ select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) fro
                Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
                SubPlan expr_1
                  ->  Foreign Scan
-                       Output: (count(t1.c1) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
+                       Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
                        Relations: Aggregate on (public.ft1 t1)
-                       Remote SQL: SELECT count("C 1") FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
+                       Remote SQL: SELECT count(*) FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
 (13 rows)
 
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 65d473d95b6..671c5cde8fc 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -3413,22 +3413,6 @@ add_base_clause_to_rel(PlannerInfo *root, Index relid,
 										 restrictinfo->security_level);
 }
 
-/*
- * expr_is_nonnullable
- *	  Check to see if the Expr cannot be NULL
- *
- * Currently we only support simple Vars.
- */
-static bool
-expr_is_nonnullable(PlannerInfo *root, Expr *expr)
-{
-	/* For now only check simple Vars */
-	if (!IsA(expr, Var))
-		return false;
-
-	return var_is_nonnullable(root, (Var *) expr, true);
-}
-
 /*
  * restriction_is_always_true
  *	  Check to see if the RestrictInfo is always true.
@@ -3465,7 +3449,7 @@ restriction_is_always_true(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
@@ -3530,7 +3514,7 @@ restriction_is_always_false(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 81d768ff2a2..ba440d2f5eb 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -131,6 +131,8 @@ static Expr *simplify_function(Oid funcid,
 							   Oid result_collid, Oid input_collid, List **args_p,
 							   bool funcvariadic, bool process_args, bool allow_non_const,
 							   eval_const_expressions_context *context);
+static Node *simplify_aggref(Aggref *aggref,
+							 eval_const_expressions_context *context);
 static List *reorder_function_arguments(List *args, int pronargs,
 										HeapTuple func_tuple);
 static List *add_function_defaults(List *args, int pronargs,
@@ -2628,6 +2630,9 @@ eval_const_expressions_mutator(Node *node,
 				newexpr->location = expr->location;
 				return (Node *) newexpr;
 			}
+		case T_Aggref:
+			node = ece_generic_processing(node);
+			return simplify_aggref((Aggref *) node, context);
 		case T_OpExpr:
 			{
 				OpExpr	   *expr = (OpExpr *) node;
@@ -4194,6 +4199,50 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
 	return newexpr;
 }
 
+/*
+ * simplify_aggref
+ *		Call the Aggref.aggfnoid's prosupport function to allow it to
+ *		determine if simplification of the Aggref is possible.  Returns the
+ *		newly simplified node if conversion took place; otherwise, returns the
+ *		original Aggref.
+ *
+ * See SupportRequestSimplifyAggref comments in supportnodes.h for further
+ * details.
+ */
+static Node *
+simplify_aggref(Aggref *aggref, eval_const_expressions_context *context)
+{
+	Oid			prosupport = get_func_support(aggref->aggfnoid);
+
+	if (OidIsValid(prosupport))
+	{
+		SupportRequestSimplifyAggref req;
+		Aggref	   *newaggref;
+
+		/*
+		 * Build a SupportRequestSimplifyAggref node to pass to the support
+		 * function.
+		 */
+		req.type = T_SupportRequestSimplifyAggref;
+		req.root = context->root;
+		req.aggref = aggref;
+
+		newaggref = (Aggref *) DatumGetPointer(OidFunctionCall1(prosupport,
+																PointerGetDatum(&req)));
+
+		/*
+		 * We expect the support function to return either a new Aggref or
+		 * NULL (when simplification isn't possible).
+		 */
+		Assert(newaggref != aggref || newaggref == NULL);
+
+		if (newaggref != NULL)
+			return (Node *) newaggref;
+	}
+
+	return (Node *) aggref;
+}
+
 /*
  * var_is_nonnullable: check to see if the Var cannot be NULL
  *
@@ -4255,6 +4304,24 @@ var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info)
 	return false;
 }
 
+/*
+ * expr_is_nonnullable: check to see if the Expr cannot be NULL
+ *
+ * This is mostly a wrapper around var_is_nonnullable() but also made to
+ * handle Const.  We don't currently have the ability to handle Exprs any
+ * more complex than that.
+ */
+bool
+expr_is_nonnullable(PlannerInfo *root, Expr *expr, bool use_rel_info)
+{
+	if (IsA(expr, Var))
+		return var_is_nonnullable(root, (Var *) expr, use_rel_info);
+	if (IsA(expr, Const))
+		return !castNode(Const, expr)->constisnull;
+
+	return false;
+}
+
 /*
  * expand_function_arguments: convert named-notation args to positional args
  * and/or insert default args, as needed
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index bdea490202a..66daf4e1f3a 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -17,14 +17,18 @@
 #include <limits.h>
 #include <math.h>
 
+#include "access/htup_details.h"
+#include "catalog/pg_aggregate.h"
 #include "common/int.h"
 #include "funcapi.h"
 #include "libpq/pqformat.h"
+#include "nodes/makefuncs.h" // makeConst
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
 #include "utils/builtins.h"
-
+#include "utils/fmgroids.h"
+#include "utils/syscache.h"
 
 typedef struct
 {
@@ -811,6 +815,68 @@ int8inc_support(PG_FUNCTION_ARGS)
 		PG_RETURN_POINTER(req);
 	}
 
+	if (IsA(rawreq, SupportRequestSimplifyAggref))
+	{
+		SupportRequestSimplifyAggref *req = (SupportRequestSimplifyAggref *) rawreq;
+		Aggref	   *agg = req->aggref;
+
+		/*
+		 * Check for COUNT(ANY) and try to convert to COUNT(*). The input
+		 * argument cannot be NULL, we can't have an ORDER BY / DISTINCT in
+		 * the aggregate, and agglevelsup must be 0.
+		 *
+		 * Technically COUNT(ANY) must have 1 arg, but be paranoid and check.
+		 */
+		if (agg->aggfnoid == F_COUNT_ANY && list_length(agg->args) == 1)
+		{
+			TargetEntry *tle = (TargetEntry *) linitial(agg->args);
+			Expr	   *arg = tle->expr;
+
+			/* Check for unsupported cases */
+			if (agg->aggdistinct != NIL || agg->aggorder != NIL ||
+				agg->agglevelsup != 0)
+				PG_RETURN_POINTER(NULL);
+
+#ifdef NOT_USED
+
+			/*
+			 * XXX this was just an experiment to test the query still worked
+			 * as expected.
+			 */
+			if (IsA(arg, Const) && castNode(Const, arg)->constisnull)
+				PG_RETURN_POINTER(makeConst(INT8OID,
+											-1,
+											InvalidOid,
+											sizeof(int64),
+											Int64GetDatum(0),
+											false,
+											true));
+#endif
+
+			if (expr_is_nonnullable(req->root, arg, false))
+			{
+				Aggref	   *newagg;
+
+				/* We don't expect these to have been set yet */
+				Assert(agg->aggtransno == -1);
+				Assert(agg->aggtranstype == InvalidOid);
+
+				/* Convert COUNT(ANY) to COUNT(*) by making a new Aggref */
+				newagg = makeNode(Aggref);
+				memcpy(newagg, agg, sizeof(Aggref));
+				newagg->aggfnoid = F_COUNT_;
+
+				/* count(*) has no args */
+				newagg->aggargtypes = NULL;
+				newagg->args = NULL;
+				newagg->aggstar = true;
+				newagg->location = -1;
+
+				PG_RETURN_POINTER(newagg);
+			}
+		}
+	}
+
 	PG_RETURN_POINTER(NULL);
 }
 
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
index 7b623d54058..01f6358cbaa 100644
--- a/src/include/nodes/supportnodes.h
+++ b/src/include/nodes/supportnodes.h
@@ -69,6 +69,31 @@ typedef struct SupportRequestSimplify
 	FuncExpr   *fcall;			/* Function call to be simplified */
 } SupportRequestSimplify;
 
+/*
+ * Similar to SupportRequestSimplify but for Aggref node types.
+ *
+ * The supports conversions such as swapping a COUNT(1) or COUNT(notnullcol)
+ * to COUNT(*).
+ *
+ * Supporting functions can consult 'root' and the input 'aggref'.  When the
+ * implementing support function deems the simplification is possible, it must
+ * create a new Node (probably another Aggref) and not modify the original.
+ * The newly created Node should then be returned to indicate that the
+ * conversion is to take place.  When no conversion is possible, a NULL
+ * pointer should be returned.
+ *
+ * It is important to consider that implementing support functions can receive
+ * Aggrefs with agglevelsup > 0.  Careful consideration should be given to
+ * whether the simplification is still possible at levels above 0.
+ */
+typedef struct SupportRequestSimplifyAggref
+{
+	NodeTag		type;
+
+	PlannerInfo *root;			/* Planner's infrastructure */
+	Aggref	   *aggref;			/* Aggref to be simplified */
+} SupportRequestSimplifyAggref;
+
 /*
  * The Selectivity request allows the support function to provide a
  * selectivity estimate for a function appearing at top level of a WHERE
diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h
index a34113903c0..bb3d9e277c8 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -148,6 +148,9 @@ extern Expr *evaluate_expr(Expr *expr, Oid result_type, int32 result_typmod,
 
 extern bool var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info);
 
+extern bool expr_is_nonnullable(PlannerInfo *root, Expr *expr,
+								bool use_rel_info);
+
 extern List *expand_function_arguments(List *args, bool include_out_arguments,
 									   Oid result_type,
 									   HeapTuple func_tuple);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index bc83a6e188e..be0e1573183 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1219,24 +1219,42 @@ select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
  9999 | 1
 (3 rows)
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
-                     QUERY PLAN                     
-----------------------------------------------------
+           QUERY PLAN            
+---------------------------------
+ Result
+   Replaces: MinMaxAggregate
+   InitPlan minmax_1
+     ->  Limit
+           ->  Seq Scan on tenk1
+(5 rows)
+
+select max(100) from tenk1;
+ max 
+-----
+ 100
+(1 row)
+
+explain (costs off)
+  select max(null) from tenk1;
+                        QUERY PLAN                         
+-----------------------------------------------------------
  Result
    Replaces: MinMaxAggregate
    InitPlan minmax_1
      ->  Limit
            ->  Result
-                 One-Time Filter: (100 IS NOT NULL)
+                 One-Time Filter: (NULL::text IS NOT NULL)
                  ->  Seq Scan on tenk1
 (7 rows)
 
-select max(100) from tenk1;
+select max(null) from tenk1;
  max 
 -----
- 100
+ 
 (1 row)
 
 -- try it on an inheritance tree
@@ -2821,6 +2839,101 @@ select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
  numeric
 (1 row)
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(NULL::unknown)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(nullable_col)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   Output: count(DISTINCT not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Aggregate
+   Output: count(not_null_col ORDER BY not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ HashAggregate
+   Output: a.a
+   Group Key: a.a
+   Filter: EXISTS(SubPlan exists_1)
+   ->  Seq Scan on public.agg_simplify a
+         Output: a.a, a.not_null_col, a.nullable_col
+   SubPlan exists_1
+     ->  Seq Scan on public.onek b
+           Filter: (count(a.not_null_col) = b.four)
+(9 rows)
+
+rollback;
 -- test aggregates with common transition functions share the same states
 begin work;
 create type avg_state as (total bigint, count bigint);
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 908af50def3..77ca6ffa3a9 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -416,11 +416,16 @@ explain (costs off)
   select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
 select max(100) from tenk1;
 
+explain (costs off)
+  select max(null) from tenk1;
+select max(null) from tenk1;
+
 -- try it on an inheritance tree
 create table minmaxtest(f1 int);
 create table minmaxtest1() inherits (minmaxtest);
@@ -1108,6 +1113,43 @@ select cleast_agg(4.5,f1) from int4_tbl;
 select cleast_agg(variadic array[4.5,f1]) from int4_tbl;
 select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+
+rollback;
+
 -- test aggregates with common transition functions share the same states
 begin work;
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index df88c78fe3a..3fe031aee01 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2915,6 +2915,7 @@ SupportRequestOptimizeWindowClause
 SupportRequestRows
 SupportRequestSelectivity
 SupportRequestSimplify
+SupportRequestSimplifyAggref
 SupportRequestWFuncMonotonic
 Syn
 SyncOps
-- 
2.43.0

Corey Huinker

corey.huinker@gmail.com

2 months ago

In reply to: David Rowley (#3)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

which might be a more realistic thing if the query without the WHERE
clause was part of a VIEW. However, we don't currently have any
infrastructure to detect when a column *is* NULL. There's only the
opposite with expr_is_nonnullable() or var_is_nonnullable().

But we'd still catch NULL constants, yes?

I've attached a very slightly revised version of the patch. I became
aware of a function named get_func_support(), which can be used rather
than fetching the pg_proc tuple from SysCache, which I was doing in
v1. No other changes.

David

The change is a big win for clarity. Applies clean. Passes.

Matheus Alcantara

matheusssilv97@gmail.com

2 months ago

In reply to: David Rowley (#3)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Hi, thanks for working on this!

On Thu Oct 30, 2025 at 1:20 AM -03, David Rowley wrote:

I've attached a very slightly revised version of the patch. I became
aware of a function named get_func_support(), which can be used rather
than fetching the pg_proc tuple from SysCache, which I was doing in
v1. No other changes.

I looked the code and it seems to be in a good shape, but I tried to
apply the v2 on top of e7ccb247b38 in master to run some tests and a
rebase is necessary.

git am v2-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patch
Applying: Have the planner replace COUNT(ANY) with COUNT(*), when possible
error: patch failed: contrib/postgres_fdw/expected/postgres_fdw.out:2975
error: contrib/postgres_fdw/expected/postgres_fdw.out: patch does not apply
error: patch failed: src/test/regress/expected/aggregates.out:1219
error: src/test/regress/expected/aggregates.out: patch does not apply
Patch failed at 0001 Have the planner replace COUNT(ANY) with COUNT(*), when possible

--
Matheus Alcantara

David Rowley

dgrowleyml@gmail.com

2 months ago

In reply to: Matheus Alcantara (#5)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Tue, 4 Nov 2025 at 09:38, Matheus Alcantara <matheusssilv97@gmail.com> wrote:

I looked the code and it seems to be in a good shape, but I tried to
apply the v2 on top of e7ccb247b38 in master to run some tests and a
rebase is necessary.

Are you sure you've not got something else in your branch? It applies
ok here, and the CFbot isn't complaining either. CFBot's is based on
cf8be0225, which is 2 commits before the one you're trying, but
src/test/regress/expected/aggregates.out hasn't been changed since
2025-10-07.

David

David Rowley

dgrowleyml@gmail.com

2 months ago

In reply to: Corey Huinker (#4)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Sat, 1 Nov 2025 at 14:19, Corey Huinker <corey.huinker@gmail.com> wrote:

which might be a more realistic thing if the query without the WHERE
clause was part of a VIEW. However, we don't currently have any
infrastructure to detect when a column *is* NULL. There's only the
opposite with expr_is_nonnullable() or var_is_nonnullable().

But we'd still catch NULL constants, yes?

Yes. It could. I've left that part of the patch #ifdef'd out. I wasn't
planning on using it. I just left it there as an example for if
someone wanted to test it.

The change is a big win for clarity. Applies clean. Passes.

Thanks for checking.

David

Matheus Alcantara

matheusssilv97@gmail.com

2 months ago

In reply to: David Rowley (#6)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Mon Nov 3, 2025 at 7:47 PM -03, David Rowley wrote:

On Tue, 4 Nov 2025 at 09:38, Matheus Alcantara <matheusssilv97@gmail.com> wrote:

I looked the code and it seems to be in a good shape, but I tried to
apply the v2 on top of e7ccb247b38 in master to run some tests and a
rebase is necessary.

Are you sure you've not got something else in your branch? It applies
ok here, and the CFbot isn't complaining either. CFBot's is based on
cf8be0225, which is 2 commits before the one you're trying, but
src/test/regress/expected/aggregates.out hasn't been changed since
2025-10-07.

Yes, my branch is clean, I even tried to apply on a cleaned git clone
but it is still failling to apply, very strange. I've added the cfbot
remote and cherry picked your commit and this works. I'll investigate
later why I'm not able to apply your patch directly.

I've tested and benchmarked the patch using count(1) and
count(not_null_col) and I've got similar results, ~30% of improvements
compared with master.

The code seems good to me, I don't have too many comments, I'm just not
sure if we should keep the #ifdef NOT_USED block but I'm not totally
against it. I'm +1 for the idea.

Thanks

--
Matheus Alcantara
EDB: http://www.enterprisedb.com

David Rowley

dgrowleyml@gmail.com

2 months ago

In reply to: Matheus Alcantara (#8)

1 attachment(s)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Wed, 5 Nov 2025 at 08:51, Matheus Alcantara <matheusssilv97@gmail.com> wrote:

On Mon Nov 3, 2025 at 7:47 PM -03, David Rowley wrote:

Are you sure you've not got something else in your branch? It applies
ok here, and the CFbot isn't complaining either. CFBot's is based on
cf8be0225, which is 2 commits before the one you're trying, but
src/test/regress/expected/aggregates.out hasn't been changed since
2025-10-07.

Yes, my branch is clean, I even tried to apply on a cleaned git clone
but it is still failling to apply, very strange. I've added the cfbot
remote and cherry picked your commit and this works. I'll investigate
later why I'm not able to apply your patch directly.

Did you look at: git diff origin/master..master ?
I've certainly accidentally periodically committed to my local master
which I ended up doing: git reset --hard origin/master to fix

The code seems good to me, I don't have too many comments, I'm just not
sure if we should keep the #ifdef NOT_USED block but I'm not totally
against it. I'm +1 for the idea.

Thanks for the review. I might not have been clear that I had only
intended the NOT_USED part as an example for during the review period.
I'd never intended it going any further.

I've attached a version with the NOT_USED part removed (and a bunch of
#includes I forgot to remove). The only other change was a minor
revision to some comments.

The primary concern I have now is when in planning that we do this
Aggref simplification. Maybe I shouldn't be too concerned about that
as there doesn't seem to be a current reason not to put it where it
is. If someone comes up with a reason to do it later in planning at
some point in the future, we can consider moving it then. That sort of
excludes extensions with aggregates that want to have a
SupportRequestSimplifyAggref support function that might need the
processing done later in planning, but that just feels like a
situation that's unlikely to arise.

David

Attachments:

v3-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchapplication/octet-stream; name=v3-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchDownload

From 59d913bcf0e11d04c5f5a4c39fd9d6c95f4dff2c Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Sat, 25 Oct 2025 09:12:53 +1300
Subject: [PATCH v3] Have the planner replace COUNT(ANY) with COUNT(*), when
 possible

This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport
functions to receive an Aggref and allow them to determine if there is a
way that the Aggref call can be optimized.

Also added is a support function to allow transformation of COUNT(ANY)
into COUNT(*).  This is possible to do when the given "ANY" cannot be
NULL and also that there are no ORDER BY / DISTINCT clauses within the
Aggref.  This is a useful transformation to do as it is common that
people write COUNT(1), which until now has added unneeded overhead.
When counting a NOT NULL column, the overheads can be worse as that
might mean deforming more of the tuple, which for large fact tables may
be many columns in.

It may be possible to add prosupport functions for other aggregates.  We
could consider if ORDER BY could be dropped for some calls, e.g. the
ORDER BY is quite useless in MAX(c ORDER BY c).

There is a little bit of passing fallout from adjusting
expr_is_nonnullable() to handle Const which results in a plan change in
the aggregates.out regression test.  Previously, nothing was able to
determine that "One-Time Filter: (100 IS NOT NULL)" was always true,
therefore useless to include in the plan.

Discussion: https://postgr.es/m/CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |  20 +--
 src/backend/optimizer/plan/initsplan.c        |  20 +--
 src/backend/optimizer/util/clauses.c          |  67 ++++++++++
 src/backend/utils/adt/int8.c                  |  49 ++++++-
 src/include/nodes/supportnodes.h              |  25 ++++
 src/include/optimizer/optimizer.h             |   3 +
 src/test/regress/expected/aggregates.out      | 125 +++++++++++++++++-
 src/test/regress/sql/aggregates.sql           |  44 +++++-
 src/tools/pgindent/typedefs.list              |   1 +
 9 files changed, 318 insertions(+), 36 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index cd28126049d..48e3185b227 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2975,9 +2975,9 @@ select sum(t1.c1), count(t2.c1) from ft1 t1 inner join ft2 t2 on (t1.c1 = t2.c1)
                                                          QUERY PLAN                                                         
 ----------------------------------------------------------------------------------------------------------------------------
  Aggregate
-   Output: sum(t1.c1), count(t2.c1)
+   Output: sum(t1.c1), count(*)
    ->  Foreign Scan
-         Output: t1.c1, t2.c1
+         Output: t1.c1
          Filter: (((((t1.c1 * t2.c1) / (t1.c1 * t2.c1)))::double precision * random()) <= '1'::double precision)
          Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
          Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1"))))
@@ -3073,12 +3073,12 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-                                                 QUERY PLAN                                                 
-------------------------------------------------------------------------------------------------------------
+                                                QUERY PLAN                                                 
+-----------------------------------------------------------------------------------------------------------
  Foreign Scan
-   Output: (count(c2)), c2, 5, 7.0, 9
+   Output: (count(*)), c2, 5, 7.0, 9
    Relations: Aggregate on (public.ft1)
-   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
+   Remote SQL: SELECT count(*), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
 (4 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
@@ -3379,8 +3379,8 @@ select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft
 -- Inner query is aggregation query
 explain (verbose, costs off)
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
-                                                                      QUERY PLAN                                                                      
-------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                    QUERY PLAN                                                                    
+--------------------------------------------------------------------------------------------------------------------------------------------------
  Unique
    Output: ((SubPlan expr_1))
    ->  Sort
@@ -3391,9 +3391,9 @@ select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) fro
                Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
                SubPlan expr_1
                  ->  Foreign Scan
-                       Output: (count(t1.c1) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
+                       Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
                        Relations: Aggregate on (public.ft1 t1)
-                       Remote SQL: SELECT count("C 1") FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
+                       Remote SQL: SELECT count(*) FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
 (13 rows)
 
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 65d473d95b6..671c5cde8fc 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -3413,22 +3413,6 @@ add_base_clause_to_rel(PlannerInfo *root, Index relid,
 										 restrictinfo->security_level);
 }
 
-/*
- * expr_is_nonnullable
- *	  Check to see if the Expr cannot be NULL
- *
- * Currently we only support simple Vars.
- */
-static bool
-expr_is_nonnullable(PlannerInfo *root, Expr *expr)
-{
-	/* For now only check simple Vars */
-	if (!IsA(expr, Var))
-		return false;
-
-	return var_is_nonnullable(root, (Var *) expr, true);
-}
-
 /*
  * restriction_is_always_true
  *	  Check to see if the RestrictInfo is always true.
@@ -3465,7 +3449,7 @@ restriction_is_always_true(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
@@ -3530,7 +3514,7 @@ restriction_is_always_false(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 81d768ff2a2..ba440d2f5eb 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -131,6 +131,8 @@ static Expr *simplify_function(Oid funcid,
 							   Oid result_collid, Oid input_collid, List **args_p,
 							   bool funcvariadic, bool process_args, bool allow_non_const,
 							   eval_const_expressions_context *context);
+static Node *simplify_aggref(Aggref *aggref,
+							 eval_const_expressions_context *context);
 static List *reorder_function_arguments(List *args, int pronargs,
 										HeapTuple func_tuple);
 static List *add_function_defaults(List *args, int pronargs,
@@ -2628,6 +2630,9 @@ eval_const_expressions_mutator(Node *node,
 				newexpr->location = expr->location;
 				return (Node *) newexpr;
 			}
+		case T_Aggref:
+			node = ece_generic_processing(node);
+			return simplify_aggref((Aggref *) node, context);
 		case T_OpExpr:
 			{
 				OpExpr	   *expr = (OpExpr *) node;
@@ -4194,6 +4199,50 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
 	return newexpr;
 }
 
+/*
+ * simplify_aggref
+ *		Call the Aggref.aggfnoid's prosupport function to allow it to
+ *		determine if simplification of the Aggref is possible.  Returns the
+ *		newly simplified node if conversion took place; otherwise, returns the
+ *		original Aggref.
+ *
+ * See SupportRequestSimplifyAggref comments in supportnodes.h for further
+ * details.
+ */
+static Node *
+simplify_aggref(Aggref *aggref, eval_const_expressions_context *context)
+{
+	Oid			prosupport = get_func_support(aggref->aggfnoid);
+
+	if (OidIsValid(prosupport))
+	{
+		SupportRequestSimplifyAggref req;
+		Aggref	   *newaggref;
+
+		/*
+		 * Build a SupportRequestSimplifyAggref node to pass to the support
+		 * function.
+		 */
+		req.type = T_SupportRequestSimplifyAggref;
+		req.root = context->root;
+		req.aggref = aggref;
+
+		newaggref = (Aggref *) DatumGetPointer(OidFunctionCall1(prosupport,
+																PointerGetDatum(&req)));
+
+		/*
+		 * We expect the support function to return either a new Aggref or
+		 * NULL (when simplification isn't possible).
+		 */
+		Assert(newaggref != aggref || newaggref == NULL);
+
+		if (newaggref != NULL)
+			return (Node *) newaggref;
+	}
+
+	return (Node *) aggref;
+}
+
 /*
  * var_is_nonnullable: check to see if the Var cannot be NULL
  *
@@ -4255,6 +4304,24 @@ var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info)
 	return false;
 }
 
+/*
+ * expr_is_nonnullable: check to see if the Expr cannot be NULL
+ *
+ * This is mostly a wrapper around var_is_nonnullable() but also made to
+ * handle Const.  We don't currently have the ability to handle Exprs any
+ * more complex than that.
+ */
+bool
+expr_is_nonnullable(PlannerInfo *root, Expr *expr, bool use_rel_info)
+{
+	if (IsA(expr, Var))
+		return var_is_nonnullable(root, (Var *) expr, use_rel_info);
+	if (IsA(expr, Const))
+		return !castNode(Const, expr)->constisnull;
+
+	return false;
+}
+
 /*
  * expand_function_arguments: convert named-notation args to positional args
  * and/or insert default args, as needed
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index bdea490202a..9cd420b4b9d 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -24,7 +24,7 @@
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
 #include "utils/builtins.h"
-
+#include "utils/fmgroids.h"
 
 typedef struct
 {
@@ -811,6 +811,53 @@ int8inc_support(PG_FUNCTION_ARGS)
 		PG_RETURN_POINTER(req);
 	}
 
+	if (IsA(rawreq, SupportRequestSimplifyAggref))
+	{
+		SupportRequestSimplifyAggref *req = (SupportRequestSimplifyAggref *) rawreq;
+		Aggref	   *agg = req->aggref;
+
+		/*
+		 * Check for COUNT(ANY) and try to convert to COUNT(*). The input
+		 * argument cannot be NULL, we can't have an ORDER BY / DISTINCT in
+		 * the aggregate, and agglevelsup must be 0.
+		 *
+		 * Technically COUNT(ANY) must have 1 arg, but be paranoid and check.
+		 */
+		if (agg->aggfnoid == F_COUNT_ANY && list_length(agg->args) == 1)
+		{
+			TargetEntry *tle = (TargetEntry *) linitial(agg->args);
+			Expr	   *arg = tle->expr;
+
+			/* Check for unsupported cases */
+			if (agg->aggdistinct != NIL || agg->aggorder != NIL ||
+				agg->agglevelsup != 0)
+				PG_RETURN_POINTER(NULL);
+
+			/* If the arg isn't NULLable, do the conversion */
+			if (expr_is_nonnullable(req->root, arg, false))
+			{
+				Aggref	   *newagg;
+
+				/* We don't expect these to have been set yet */
+				Assert(agg->aggtransno == -1);
+				Assert(agg->aggtranstype == InvalidOid);
+
+				/* Convert COUNT(ANY) to COUNT(*) by making a new Aggref */
+				newagg = makeNode(Aggref);
+				memcpy(newagg, agg, sizeof(Aggref));
+				newagg->aggfnoid = F_COUNT_;
+
+				/* count(*) has no args */
+				newagg->aggargtypes = NULL;
+				newagg->args = NULL;
+				newagg->aggstar = true;
+				newagg->location = -1;
+
+				PG_RETURN_POINTER(newagg);
+			}
+		}
+	}
+
 	PG_RETURN_POINTER(NULL);
 }
 
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
index 7b623d54058..1ca7a66a15b 100644
--- a/src/include/nodes/supportnodes.h
+++ b/src/include/nodes/supportnodes.h
@@ -69,6 +69,31 @@ typedef struct SupportRequestSimplify
 	FuncExpr   *fcall;			/* Function call to be simplified */
 } SupportRequestSimplify;
 
+/*
+ * Similar to SupportRequestSimplify but for Aggref node types.
+ *
+ * This supports conversions such as swapping COUNT(1) or COUNT(notnullcol)
+ * for COUNT(*).
+ *
+ * Supporting functions can consult 'root' and the input 'aggref'.  When the
+ * implementing support function deems the simplification is possible, it must
+ * create a new Node (probably another Aggref) and not modify the original.
+ * The newly created Node should then be returned to indicate that the
+ * conversion is to take place.  When no conversion is possible, a NULL
+ * pointer should be returned.
+ *
+ * It is important to consider that implementing support functions can receive
+ * Aggrefs with agglevelsup > 0.  Careful consideration should be given to
+ * whether the simplification is still possible at levels above 0.
+ */
+typedef struct SupportRequestSimplifyAggref
+{
+	NodeTag		type;
+
+	PlannerInfo *root;			/* Planner's infrastructure */
+	Aggref	   *aggref;			/* Aggref to be simplified */
+} SupportRequestSimplifyAggref;
+
 /*
  * The Selectivity request allows the support function to provide a
  * selectivity estimate for a function appearing at top level of a WHERE
diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h
index d0aa8ab0c1c..44ec5296a18 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -147,6 +147,9 @@ extern Expr *evaluate_expr(Expr *expr, Oid result_type, int32 result_typmod,
 
 extern bool var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info);
 
+extern bool expr_is_nonnullable(PlannerInfo *root, Expr *expr,
+								bool use_rel_info);
+
 extern List *expand_function_arguments(List *args, bool include_out_arguments,
 									   Oid result_type,
 									   HeapTuple func_tuple);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index bc83a6e188e..be0e1573183 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1219,24 +1219,42 @@ select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
  9999 | 1
 (3 rows)
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
-                     QUERY PLAN                     
-----------------------------------------------------
+           QUERY PLAN            
+---------------------------------
+ Result
+   Replaces: MinMaxAggregate
+   InitPlan minmax_1
+     ->  Limit
+           ->  Seq Scan on tenk1
+(5 rows)
+
+select max(100) from tenk1;
+ max 
+-----
+ 100
+(1 row)
+
+explain (costs off)
+  select max(null) from tenk1;
+                        QUERY PLAN                         
+-----------------------------------------------------------
  Result
    Replaces: MinMaxAggregate
    InitPlan minmax_1
      ->  Limit
            ->  Result
-                 One-Time Filter: (100 IS NOT NULL)
+                 One-Time Filter: (NULL::text IS NOT NULL)
                  ->  Seq Scan on tenk1
 (7 rows)
 
-select max(100) from tenk1;
+select max(null) from tenk1;
  max 
 -----
- 100
+ 
 (1 row)
 
 -- try it on an inheritance tree
@@ -2821,6 +2839,101 @@ select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
  numeric
 (1 row)
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(NULL::unknown)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(nullable_col)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   Output: count(DISTINCT not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Aggregate
+   Output: count(not_null_col ORDER BY not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ HashAggregate
+   Output: a.a
+   Group Key: a.a
+   Filter: EXISTS(SubPlan exists_1)
+   ->  Seq Scan on public.agg_simplify a
+         Output: a.a, a.not_null_col, a.nullable_col
+   SubPlan exists_1
+     ->  Seq Scan on public.onek b
+           Filter: (count(a.not_null_col) = b.four)
+(9 rows)
+
+rollback;
 -- test aggregates with common transition functions share the same states
 begin work;
 create type avg_state as (total bigint, count bigint);
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 908af50def3..77ca6ffa3a9 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -416,11 +416,16 @@ explain (costs off)
   select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
 select max(100) from tenk1;
 
+explain (costs off)
+  select max(null) from tenk1;
+select max(null) from tenk1;
+
 -- try it on an inheritance tree
 create table minmaxtest(f1 int);
 create table minmaxtest1() inherits (minmaxtest);
@@ -1108,6 +1113,43 @@ select cleast_agg(4.5,f1) from int4_tbl;
 select cleast_agg(variadic array[4.5,f1]) from int4_tbl;
 select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+
+rollback;
+
 -- test aggregates with common transition functions share the same states
 begin work;
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 018b5919cf6..ade44fc912a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2916,6 +2916,7 @@ SupportRequestOptimizeWindowClause
 SupportRequestRows
 SupportRequestSelectivity
 SupportRequestSimplify
+SupportRequestSimplifyAggref
 SupportRequestWFuncMonotonic
 Syn
 SyncOps
-- 
2.43.0

#10

Matheus Alcantara

matheusssilv97@gmail.com

2 months ago

In reply to: David Rowley (#9)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Tue Nov 4, 2025 at 9:16 PM -03, David Rowley wrote:

On Wed, 5 Nov 2025 at 08:51, Matheus Alcantara <matheusssilv97@gmail.com> wrote:

On Mon Nov 3, 2025 at 7:47 PM -03, David Rowley wrote:

Are you sure you've not got something else in your branch? It applies
ok here, and the CFbot isn't complaining either. CFBot's is based on
cf8be0225, which is 2 commits before the one you're trying, but
src/test/regress/expected/aggregates.out hasn't been changed since
2025-10-07.

Yes, my branch is clean, I even tried to apply on a cleaned git clone
but it is still failling to apply, very strange. I've added the cfbot
remote and cherry picked your commit and this works. I'll investigate
later why I'm not able to apply your patch directly.

Did you look at: git diff origin/master..master ?
I've certainly accidentally periodically committed to my local master
which I ended up doing: git reset --hard origin/master to fix

Yes, I ran git reset before trying to apply the v2 and it still had
conflicts, very strange. Anyway the v3 applied clean on my environment
now.

The code seems good to me, I don't have too many comments, I'm just not
sure if we should keep the #ifdef NOT_USED block but I'm not totally
against it. I'm +1 for the idea.

Thanks for the review. I might not have been clear that I had only
intended the NOT_USED part as an example for during the review period.
I'd never intended it going any further.

Ok, it make sense now, thanks for making it clear.

I've attached a version with the NOT_USED part removed (and a bunch of
#includes I forgot to remove). The only other change was a minor
revision to some comments.

Thanks, it looks cleaner.

The primary concern I have now is when in planning that we do this
Aggref simplification. Maybe I shouldn't be too concerned about that
as there doesn't seem to be a current reason not to put it where it
is. If someone comes up with a reason to do it later in planning at
some point in the future, we can consider moving it then. That sort of
excludes extensions with aggregates that want to have a
SupportRequestSimplifyAggref support function that might need the
processing done later in planning, but that just feels like a
situation that's unlikely to arise.

I think it's ok to leave where it is implemented now and it make sense
to me. The SupportRequestSimplifyAggref is similar with
SupportRequestSimplify which is used by simplify_function() that is
called at eval_const_expressions_mutator(), the simplify_aggref() is
also called at the same function, so it seems to be consistent.

--
Matheus Alcantara
EDB: http://www.enterprisedb.com

#11

David Rowley

dgrowleyml@gmail.com

about 2 months ago

In reply to: David Rowley (#9)

1 attachment(s)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Wed, 5 Nov 2025 at 13:16, David Rowley <dgrowleyml@gmail.com> wrote:

I've attached a version with the NOT_USED part removed (and a bunch of
#includes I forgot to remove). The only other change was a minor
revision to some comments.

This patch needed to be rebased due to the changes made in b140c8d7a.
I also adjusted a few comments and adjusted some code in
simplify_aggref() which mistakenly assumed the support function would
always return an Aggref. That conflicted with what I'd written in the
header comment for the SupportRequestSimplifyAggref struct; "(probably
another Aggref)". Which is leaving the door open for more aggressive
optimisations that someone might want to do, e.g. the mentioned
COUNT(NULL) replaced with '0'::bigint.

I'm not seeing any reason now not to go ahead with this now. Does
anyone else want to take a look at it before I start wielding the
sword of commitment on it?

David

Attachments:

v4-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchapplication/octet-stream; name=v4-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patchDownload

From 3dfda9873367b002beff1eb436d3e82dbfeb93e9 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Sat, 25 Oct 2025 09:12:53 +1300
Subject: [PATCH v4] Have the planner replace COUNT(ANY) with COUNT(*), when
 possible

This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport
functions to receive an Aggref and allow them to determine if there is a
way that the Aggref call can be optimized.

Also added is a support function to allow transformation of COUNT(ANY)
into COUNT(*).  This is possible to do when the given "ANY" cannot be
NULL and also that there are no ORDER BY / DISTINCT clauses within the
Aggref.  This is a useful transformation to do as it is common that
people write COUNT(1), which until now has added unneeded overhead.
When counting a NOT NULL column.  The overheads can be worse as that
might mean deforming more of the tuple, which for large fact tables may
be many columns in.

It may be possible to add prosupport functions for other aggregates.  We
could consider if ORDER BY could be dropped for some calls, e.g. the
ORDER BY is quite useless in MAX(c ORDER BY c).

There is a little bit of passing fallout from adjusting
expr_is_nonnullable() to handle Const which results in a plan change in
the aggregates.out regression test.  Previously, nothing was able to
determine that "One-Time Filter: (100 IS NOT NULL)" was always true,
therefore useless to include in the plan.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |  20 +--
 src/backend/optimizer/plan/initsplan.c        |  20 +--
 src/backend/optimizer/util/clauses.c          |  66 +++++++++
 src/backend/utils/adt/int8.c                  |  49 ++++++-
 src/include/nodes/supportnodes.h              |  25 ++++
 src/include/optimizer/optimizer.h             |   3 +
 src/test/regress/expected/aggregates.out      | 125 +++++++++++++++++-
 src/test/regress/sql/aggregates.sql           |  44 +++++-
 src/tools/pgindent/typedefs.list              |   1 +
 9 files changed, 317 insertions(+), 36 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index cd28126049d..48e3185b227 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2975,9 +2975,9 @@ select sum(t1.c1), count(t2.c1) from ft1 t1 inner join ft2 t2 on (t1.c1 = t2.c1)
                                                          QUERY PLAN                                                         
 ----------------------------------------------------------------------------------------------------------------------------
  Aggregate
-   Output: sum(t1.c1), count(t2.c1)
+   Output: sum(t1.c1), count(*)
    ->  Foreign Scan
-         Output: t1.c1, t2.c1
+         Output: t1.c1
          Filter: (((((t1.c1 * t2.c1) / (t1.c1 * t2.c1)))::double precision * random()) <= '1'::double precision)
          Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
          Remote SQL: SELECT r1."C 1", r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r2."C 1" = r1."C 1"))))
@@ -3073,12 +3073,12 @@ select c2 * (random() <= 1)::int as c2 from ft2 group by c2 * (random() <= 1)::i
 -- GROUP BY clause in various forms, cardinal, alias and constant expression
 explain (verbose, costs off)
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
-                                                 QUERY PLAN                                                 
-------------------------------------------------------------------------------------------------------------
+                                                QUERY PLAN                                                 
+-----------------------------------------------------------------------------------------------------------
  Foreign Scan
-   Output: (count(c2)), c2, 5, 7.0, 9
+   Output: (count(*)), c2, 5, 7.0, 9
    Relations: Aggregate on (public.ft1)
-   Remote SQL: SELECT count(c2), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
+   Remote SQL: SELECT count(*), c2, 5, 7.0, 9 FROM "S 1"."T 1" GROUP BY 2, 3, 5 ORDER BY c2 ASC NULLS LAST
 (4 rows)
 
 select count(c2) w, c2 x, 5 y, 7.0 z from ft1 group by 2, y, 9.0::int order by 2;
@@ -3379,8 +3379,8 @@ select distinct (select count(*) filter (where t2.c2 = 6 and t2.c1 < 10) from ft
 -- Inner query is aggregation query
 explain (verbose, costs off)
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
-                                                                      QUERY PLAN                                                                      
-------------------------------------------------------------------------------------------------------------------------------------------------------
+                                                                    QUERY PLAN                                                                    
+--------------------------------------------------------------------------------------------------------------------------------------------------
  Unique
    Output: ((SubPlan expr_1))
    ->  Sort
@@ -3391,9 +3391,9 @@ select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) fro
                Remote SQL: SELECT "C 1", c2 FROM "S 1"."T 1" WHERE (((c2 % 6) = 0))
                SubPlan expr_1
                  ->  Foreign Scan
-                       Output: (count(t1.c1) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
+                       Output: (count(*) FILTER (WHERE ((t2.c2 = 6) AND (t2.c1 < 10))))
                        Relations: Aggregate on (public.ft1 t1)
-                       Remote SQL: SELECT count("C 1") FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
+                       Remote SQL: SELECT count(*) FILTER (WHERE (($1::integer = 6) AND ($2::integer < 10))) FROM "S 1"."T 1" WHERE (("C 1" = 6))
 (13 rows)
 
 select distinct (select count(t1.c1) filter (where t2.c2 = 6 and t2.c1 < 10) from ft1 t1 where t1.c1 = 6) from ft2 t2 where t2.c2 % 6 = 0 order by 1;
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 65d473d95b6..671c5cde8fc 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -3413,22 +3413,6 @@ add_base_clause_to_rel(PlannerInfo *root, Index relid,
 										 restrictinfo->security_level);
 }
 
-/*
- * expr_is_nonnullable
- *	  Check to see if the Expr cannot be NULL
- *
- * Currently we only support simple Vars.
- */
-static bool
-expr_is_nonnullable(PlannerInfo *root, Expr *expr)
-{
-	/* For now only check simple Vars */
-	if (!IsA(expr, Var))
-		return false;
-
-	return var_is_nonnullable(root, (Var *) expr, true);
-}
-
 /*
  * restriction_is_always_true
  *	  Check to see if the RestrictInfo is always true.
@@ -3465,7 +3449,7 @@ restriction_is_always_true(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
@@ -3530,7 +3514,7 @@ restriction_is_always_false(PlannerInfo *root,
 		if (nulltest->argisrow)
 			return false;
 
-		return expr_is_nonnullable(root, nulltest->arg);
+		return expr_is_nonnullable(root, nulltest->arg, true);
 	}
 
 	/* If it's an OR, check its sub-clauses */
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 202ba8ed4bb..22da7802717 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -131,6 +131,8 @@ static Expr *simplify_function(Oid funcid,
 							   Oid result_collid, Oid input_collid, List **args_p,
 							   bool funcvariadic, bool process_args, bool allow_non_const,
 							   eval_const_expressions_context *context);
+static Node *simplify_aggref(Aggref *aggref,
+							 eval_const_expressions_context *context);
 static List *reorder_function_arguments(List *args, int pronargs,
 										HeapTuple func_tuple);
 static List *add_function_defaults(List *args, int pronargs,
@@ -2634,6 +2636,9 @@ eval_const_expressions_mutator(Node *node,
 				newexpr->location = expr->location;
 				return (Node *) newexpr;
 			}
+		case T_Aggref:
+			node = ece_generic_processing(node);
+			return simplify_aggref((Aggref *) node, context);
 		case T_OpExpr:
 			{
 				OpExpr	   *expr = (OpExpr *) node;
@@ -4200,6 +4205,50 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
 	return newexpr;
 }
 
+/*
+ * simplify_aggref
+ *		Call the Aggref.aggfnoid's prosupport function to allow it to
+ *		determine if simplification of the Aggref is possible.  Returns the
+ *		newly simplified node if conversion took place; otherwise, returns the
+ *		original Aggref.
+ *
+ * See SupportRequestSimplifyAggref comments in supportnodes.h for further
+ * details.
+ */
+static Node *
+simplify_aggref(Aggref *aggref, eval_const_expressions_context *context)
+{
+	Oid			prosupport = get_func_support(aggref->aggfnoid);
+
+	if (OidIsValid(prosupport))
+	{
+		SupportRequestSimplifyAggref req;
+		Node	   *newnode;
+
+		/*
+		 * Build a SupportRequestSimplifyAggref node to pass to the support
+		 * function.
+		 */
+		req.type = T_SupportRequestSimplifyAggref;
+		req.root = context->root;
+		req.aggref = aggref;
+
+		newnode = (Node *) DatumGetPointer(OidFunctionCall1(prosupport,
+															PointerGetDatum(&req)));
+
+		/*
+		 * We expect the support function to return either a new Node or NULL
+		 * (when simplification isn't possible).
+		 */
+		Assert(newnode != (Node *) aggref || newnode == NULL);
+
+		if (newnode != NULL)
+			return newnode;
+	}
+
+	return (Node *) aggref;
+}
+
 /*
  * var_is_nonnullable: check to see if the Var cannot be NULL
  *
@@ -4261,6 +4310,23 @@ var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info)
 	return false;
 }
 
+/*
+ * expr_is_nonnullable: check to see if the Expr cannot be NULL
+ *
+ * This is mostly a wrapper around var_is_nonnullable() but also made to
+ * handle Const.  Support for other node types may be possible.
+ */
+bool
+expr_is_nonnullable(PlannerInfo *root, Expr *expr, bool use_rel_info)
+{
+	if (IsA(expr, Var))
+		return var_is_nonnullable(root, (Var *) expr, use_rel_info);
+	if (IsA(expr, Const))
+		return !castNode(Const, expr)->constisnull;
+
+	return false;
+}
+
 /*
  * expand_function_arguments: convert named-notation args to positional args
  * and/or insert default args, as needed
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index bdea490202a..9cd420b4b9d 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -24,7 +24,7 @@
 #include "nodes/supportnodes.h"
 #include "optimizer/optimizer.h"
 #include "utils/builtins.h"
-
+#include "utils/fmgroids.h"
 
 typedef struct
 {
@@ -811,6 +811,53 @@ int8inc_support(PG_FUNCTION_ARGS)
 		PG_RETURN_POINTER(req);
 	}
 
+	if (IsA(rawreq, SupportRequestSimplifyAggref))
+	{
+		SupportRequestSimplifyAggref *req = (SupportRequestSimplifyAggref *) rawreq;
+		Aggref	   *agg = req->aggref;
+
+		/*
+		 * Check for COUNT(ANY) and try to convert to COUNT(*). The input
+		 * argument cannot be NULL, we can't have an ORDER BY / DISTINCT in
+		 * the aggregate, and agglevelsup must be 0.
+		 *
+		 * Technically COUNT(ANY) must have 1 arg, but be paranoid and check.
+		 */
+		if (agg->aggfnoid == F_COUNT_ANY && list_length(agg->args) == 1)
+		{
+			TargetEntry *tle = (TargetEntry *) linitial(agg->args);
+			Expr	   *arg = tle->expr;
+
+			/* Check for unsupported cases */
+			if (agg->aggdistinct != NIL || agg->aggorder != NIL ||
+				agg->agglevelsup != 0)
+				PG_RETURN_POINTER(NULL);
+
+			/* If the arg isn't NULLable, do the conversion */
+			if (expr_is_nonnullable(req->root, arg, false))
+			{
+				Aggref	   *newagg;
+
+				/* We don't expect these to have been set yet */
+				Assert(agg->aggtransno == -1);
+				Assert(agg->aggtranstype == InvalidOid);
+
+				/* Convert COUNT(ANY) to COUNT(*) by making a new Aggref */
+				newagg = makeNode(Aggref);
+				memcpy(newagg, agg, sizeof(Aggref));
+				newagg->aggfnoid = F_COUNT_;
+
+				/* count(*) has no args */
+				newagg->aggargtypes = NULL;
+				newagg->args = NULL;
+				newagg->aggstar = true;
+				newagg->location = -1;
+
+				PG_RETURN_POINTER(newagg);
+			}
+		}
+	}
+
 	PG_RETURN_POINTER(NULL);
 }
 
diff --git a/src/include/nodes/supportnodes.h b/src/include/nodes/supportnodes.h
index ea774c7ef6a..5e89605b76e 100644
--- a/src/include/nodes/supportnodes.h
+++ b/src/include/nodes/supportnodes.h
@@ -71,6 +71,31 @@ typedef struct SupportRequestSimplify
 	FuncExpr   *fcall;			/* Function call to be simplified */
 } SupportRequestSimplify;
 
+/*
+ * Similar to SupportRequestSimplify but for Aggref node types.
+ *
+ * This supports conversions such as swapping COUNT(1) or COUNT(notnullcol)
+ * for COUNT(*).
+ *
+ * Supporting functions can consult 'root' and the input 'aggref'.  When the
+ * implementing support function deems the simplification is possible, it must
+ * create a new Node (probably another Aggref) and not modify the original.
+ * The newly created Node should then be returned to indicate that the
+ * conversion is to take place.  When no conversion is possible, a NULL
+ * pointer should be returned.
+ *
+ * It is important to consider that implementing support functions can receive
+ * Aggrefs with agglevelsup > 0.  Careful consideration should be given to
+ * whether the simplification is still possible at levels above 0.
+ */
+typedef struct SupportRequestSimplifyAggref
+{
+	NodeTag		type;
+
+	PlannerInfo *root;			/* Planner's infrastructure */
+	Aggref	   *aggref;			/* Aggref to be simplified */
+} SupportRequestSimplifyAggref;
+
 /*
  * The InlineInFrom request allows the support function to perform plan-time
  * simplification of a call to its target function that appears in FROM.
diff --git a/src/include/optimizer/optimizer.h b/src/include/optimizer/optimizer.h
index d0aa8ab0c1c..44ec5296a18 100644
--- a/src/include/optimizer/optimizer.h
+++ b/src/include/optimizer/optimizer.h
@@ -147,6 +147,9 @@ extern Expr *evaluate_expr(Expr *expr, Oid result_type, int32 result_typmod,
 
 extern bool var_is_nonnullable(PlannerInfo *root, Var *var, bool use_rel_info);
 
+extern bool expr_is_nonnullable(PlannerInfo *root, Expr *expr,
+								bool use_rel_info);
+
 extern List *expand_function_arguments(List *args, bool include_out_arguments,
 									   Oid result_type,
 									   HeapTuple func_tuple);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index bc83a6e188e..be0e1573183 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1219,24 +1219,42 @@ select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
  9999 | 1
 (3 rows)
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
-                     QUERY PLAN                     
-----------------------------------------------------
+           QUERY PLAN            
+---------------------------------
+ Result
+   Replaces: MinMaxAggregate
+   InitPlan minmax_1
+     ->  Limit
+           ->  Seq Scan on tenk1
+(5 rows)
+
+select max(100) from tenk1;
+ max 
+-----
+ 100
+(1 row)
+
+explain (costs off)
+  select max(null) from tenk1;
+                        QUERY PLAN                         
+-----------------------------------------------------------
  Result
    Replaces: MinMaxAggregate
    InitPlan minmax_1
      ->  Limit
            ->  Result
-                 One-Time Filter: (100 IS NOT NULL)
+                 One-Time Filter: (NULL::text IS NOT NULL)
                  ->  Seq Scan on tenk1
 (7 rows)
 
-select max(100) from tenk1;
+select max(null) from tenk1;
  max 
 -----
- 100
+ 
 (1 row)
 
 -- try it on an inheritance tree
@@ -2821,6 +2839,101 @@ select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
  numeric
 (1 row)
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(*)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(NULL::unknown)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+                  QUERY PLAN                   
+-----------------------------------------------
+ Aggregate
+   Output: count(nullable_col)
+   ->  Seq Scan on public.agg_simplify
+         Output: a, not_null_col, nullable_col
+(4 rows)
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+                 QUERY PLAN                  
+---------------------------------------------
+ Aggregate
+   Output: count(DISTINCT not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ Aggregate
+   Output: count(not_null_col ORDER BY not_null_col)
+   ->  Sort
+         Output: not_null_col
+         Sort Key: agg_simplify.not_null_col
+         ->  Seq Scan on public.agg_simplify
+               Output: not_null_col
+(7 rows)
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+                     QUERY PLAN                      
+-----------------------------------------------------
+ HashAggregate
+   Output: a.a
+   Group Key: a.a
+   Filter: EXISTS(SubPlan exists_1)
+   ->  Seq Scan on public.agg_simplify a
+         Output: a.a, a.not_null_col, a.nullable_col
+   SubPlan exists_1
+     ->  Seq Scan on public.onek b
+           Filter: (count(a.not_null_col) = b.four)
+(9 rows)
+
+rollback;
 -- test aggregates with common transition functions share the same states
 begin work;
 create type avg_state as (total bigint, count bigint);
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 908af50def3..77ca6ffa3a9 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -416,11 +416,16 @@ explain (costs off)
   select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 select max(unique2), generate_series(1,3) as g from tenk1 order by g desc;
 
--- interesting corner case: constant gets optimized into a seqscan
+-- two interesting corner cases: both non-null and null constant gets
+-- optimized into a seqscan
 explain (costs off)
   select max(100) from tenk1;
 select max(100) from tenk1;
 
+explain (costs off)
+  select max(null) from tenk1;
+select max(null) from tenk1;
+
 -- try it on an inheritance tree
 create table minmaxtest(f1 int);
 create table minmaxtest1() inherits (minmaxtest);
@@ -1108,6 +1113,43 @@ select cleast_agg(4.5,f1) from int4_tbl;
 select cleast_agg(variadic array[4.5,f1]) from int4_tbl;
 select pg_typeof(cleast_agg(variadic array[4.5,f1])) from int4_tbl;
 
+--
+-- Test SupportRequestSimplifyAggref code
+--
+begin;
+create table agg_simplify (a int, not_null_col int not null, nullable_col int);
+
+-- Ensure count(not_null_col) uses count(*)
+explain (costs off, verbose)
+select count(not_null_col) from agg_simplify;
+
+-- Ensure count(<not null const>) uses count(*)
+explain (costs off, verbose)
+select count('bananas') from agg_simplify;
+
+-- Ensure count(null) isn't optimized
+explain (costs off, verbose)
+select count(null) from agg_simplify;
+
+-- Ensure count(nullable_col) does not use count(*)
+explain (costs off, verbose)
+select count(nullable_col) from agg_simplify;
+
+-- Ensure there's no optimization with DISTINCT aggs
+explain (costs off, verbose)
+select count(distinct not_null_col) from agg_simplify;
+
+-- Ensure there's no optimization with ORDER BY aggs
+explain (costs off, verbose)
+select count(not_null_col order by not_null_col) from agg_simplify;
+
+-- Ensure we don't optimize to count(*) with agglevelsup > 0
+explain (costs off, verbose)
+select a from agg_simplify a group by a
+having exists (select 1 from onek b where count(a.not_null_col) = b.four);
+
+rollback;
+
 -- test aggregates with common transition functions share the same states
 begin work;
 
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57a8f0366a5..3706ec42b32 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2923,6 +2923,7 @@ SupportRequestOptimizeWindowClause
 SupportRequestRows
 SupportRequestSelectivity
 SupportRequestSimplify
+SupportRequestSimplifyAggref
 SupportRequestWFuncMonotonic
 Syn
 SyncOps
-- 
2.43.0

#12

David Rowley

dgrowleyml@gmail.com

about 2 months ago

In reply to: David Rowley (#11)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

On Wed, 26 Nov 2025 at 12:37, David Rowley <dgrowleyml@gmail.com> wrote:

I'm not seeing any reason now not to go ahead with this now. Does
anyone else want to take a look at it before I start wielding the
sword of commitment on it?

And pushed. Many thanks to Corey and Matheus for having a look at this.

David