[PATCH] distinct aggregates within a window function WIP
Hi hackers,
I want to propose to you an old patch for Postgres 11, off-site developed
by Oliver Ford,
but I have permission from him to publish it and to continue it's
development,
that allow distinct aggregates, like select sum(distinct nums) within a
window function.
I have rebased it for current git master branch and have made necessary
changes to it to work with Postgres 13devel.
It's a WIP, because it doesn't have tests yet (I will add them later) and
also, it works for a int, float, and numeric types,
but probably distinct check can be rewritten for possible performance
improvement,
with storing the distinct elements in a hash table which should give a
performance improvement.
If you find the implementation of patch acceptable from committers
perspective,
I will answer to all yours design and review notes and will try to go ahead
with it,
also, I will add this patch to the March commit fest.
For example usage of a patch, if you have time series data, with current
Postgres you will get an error:
postgres=# CREATE TABLE t_demo AS
postgres-# SELECT ordinality, day, date_part('week', day) AS week
postgres-# FROM generate_series('2020-01-02', '2020-01-15', '1
day'::interval)
postgres-# WITH ORDINALITY AS day;
SELECT 14
postgres=# SELECT * FROM t_demo;
ordinality | day | week
------------+------------------------+------
1 | 2020-01-02 00:00:00+02 | 1
2 | 2020-01-03 00:00:00+02 | 1
3 | 2020-01-04 00:00:00+02 | 1
4 | 2020-01-05 00:00:00+02 | 1
5 | 2020-01-06 00:00:00+02 | 2
6 | 2020-01-07 00:00:00+02 | 2
7 | 2020-01-08 00:00:00+02 | 2
8 | 2020-01-09 00:00:00+02 | 2
9 | 2020-01-10 00:00:00+02 | 2
10 | 2020-01-11 00:00:00+02 | 2
11 | 2020-01-12 00:00:00+02 | 2
12 | 2020-01-13 00:00:00+02 | 3
13 | 2020-01-14 00:00:00+02 | 3
14 | 2020-01-15 00:00:00+02 | 3
(14 rows)
postgres=# SELECT *,
postgres-# array_agg(DISTINCT week) OVER (ORDER BY day ROWS
postgres(# BETWEEN 2 PRECEDING AND 2
FOLLOWING)
postgres-# FROM t_demo;
ERROR: DISTINCT is not implemented for window functions
LINE 2: array_agg(DISTINCT week) OVER (ORDER BY day ROWS
^
So you will need to write something like this:
postgres=# SELECT *, (SELECT array_agg(DISTINCT unnest) FROM unnest(x)) AS
b
postgres-# FROM
postgres-# (
postgres(# SELECT *,
postgres(# array_agg(week) OVER (ORDER BY day ROWS
postgres(# BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS x
postgres(# FROM t_demo
postgres(# ) AS a;
ordinality | day | week | x | b
------------+------------------------+------+-------------+-------
1 | 2020-01-02 00:00:00+02 | 1 | {1,1,1} | {1}
2 | 2020-01-03 00:00:00+02 | 1 | {1,1,1,1} | {1}
3 | 2020-01-04 00:00:00+02 | 1 | {1,1,1,1,2} | {1,2}
4 | 2020-01-05 00:00:00+02 | 1 | {1,1,1,2,2} | {1,2}
5 | 2020-01-06 00:00:00+02 | 2 | {1,1,2,2,2} | {1,2}
6 | 2020-01-07 00:00:00+02 | 2 | {1,2,2,2,2} | {1,2}
7 | 2020-01-08 00:00:00+02 | 2 | {2,2,2,2,2} | {2}
8 | 2020-01-09 00:00:00+02 | 2 | {2,2,2,2,2} | {2}
9 | 2020-01-10 00:00:00+02 | 2 | {2,2,2,2,2} | {2}
10 | 2020-01-11 00:00:00+02 | 2 | {2,2,2,2,3} | {2,3}
11 | 2020-01-12 00:00:00+02 | 2 | {2,2,2,3,3} | {2,3}
12 | 2020-01-13 00:00:00+02 | 3 | {2,2,3,3,3} | {2,3}
13 | 2020-01-14 00:00:00+02 | 3 | {2,3,3,3} | {2,3}
14 | 2020-01-15 00:00:00+02 | 3 | {3,3,3} | {3}
(14 rows)
With attached version, you will get the desired results:
postgres=# SELECT *,
postgres-# array_agg(DISTINCT week) OVER (ORDER BY day ROWS
postgres(# BETWEEN 2 PRECEDING AND 2
FOLLOWING)
postgres-# FROM t_demo;
ordinality | day | week | array_agg
------------+------------------------+------+-----------
1 | 2020-01-02 00:00:00+02 | 1 | {1}
2 | 2020-01-03 00:00:00+02 | 1 | {1}
3 | 2020-01-04 00:00:00+02 | 1 | {1,2}
4 | 2020-01-05 00:00:00+02 | 1 | {1,2}
5 | 2020-01-06 00:00:00+02 | 2 | {1,2}
6 | 2020-01-07 00:00:00+02 | 2 | {1,2}
7 | 2020-01-08 00:00:00+02 | 2 | {2}
8 | 2020-01-09 00:00:00+02 | 2 | {2}
9 | 2020-01-10 00:00:00+02 | 2 | {2}
10 | 2020-01-11 00:00:00+02 | 2 | {2,3}
11 | 2020-01-12 00:00:00+02 | 2 | {2,3}
12 | 2020-01-13 00:00:00+02 | 3 | {2,3}
13 | 2020-01-14 00:00:00+02 | 3 | {2,3}
14 | 2020-01-15 00:00:00+02 | 3 | {3}
(14 rows)
Attachments:
pg13-distinct-window.patchtext/x-patch; charset=US-ASCII; name=pg13-distinct-window.patchDownload
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 4cc7da268d..66ab18bab6 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -34,9 +34,14 @@
#include "postgres.h"
#include "access/htup_details.h"
+#include "access/nbtree.h"
#include "catalog/objectaccess.h"
#include "catalog/pg_aggregate.h"
+#include "catalog/pg_am.h"
#include "catalog/pg_proc.h"
+#include "catalog/pg_type.h"
+#include "commands/defrem.h"
+#include "common/int.h"
#include "executor/executor.h"
#include "executor/nodeWindowAgg.h"
#include "miscadmin.h"
@@ -156,6 +161,12 @@ typedef struct WindowStatePerAggData
/* Data local to eval_windowaggregates() */
bool restart; /* need to restart this agg in this cycle? */
+
+ FmgrInfo distinct_cmpfn;
+ int16 distinct_typlen;
+ void *distinctArr; /* Array of DISTINCT values */
+ int64 distinctSize; /* Current size of the working distinctArr */
+ int64 distinctCount; /* Current number of values in the working distinctArr */
} WindowStatePerAggData;
static void initialize_windowaggregate(WindowAggState *winstate,
@@ -209,6 +220,24 @@ initialize_windowaggregate(WindowAggState *winstate,
{
MemoryContext oldContext;
+ if (perfuncstate->wfunc->windistinct)
+ {
+ if (peraggstate->distinctSize > 0)
+ {
+ pfree(peraggstate->distinctArr);
+ peraggstate->distinctCount = 0;
+ peraggstate->distinctSize = 0;
+ }
+
+ /* If the type is a fixed length, allocate an initial array of size 16 */
+ Assert(peraggstate->distinct_typlen > 0 || peraggstate->distinct_typlen == -1);
+ if (peraggstate->distinct_typlen > 0)
+ {
+ peraggstate->distinctArr = palloc0(sizeof(int64) * 16);
+ peraggstate->distinctSize = 16;
+ }
+ }
+
/*
* If we're using a private aggcontext, we may reset it here. But if the
* context is shared, we don't know which other aggregates may still need
@@ -278,6 +307,68 @@ advance_windowaggregate(WindowAggState *winstate,
i++;
}
+ if (wfuncstate->wfunc->windistinct && !fcinfo->args[1].isnull)
+ {
+ MemoryContext tupleContext = MemoryContextSwitchTo(oldContext);
+
+ if (peraggstate->distinct_typlen > 0)
+ {
+ for (i = 0; i < peraggstate->distinctCount; i++)
+ {
+ if (DatumGetInt32(FunctionCall2(&peraggstate->distinct_cmpfn,
+ fcinfo->args[1].value,
+ *((int64 *)peraggstate->distinctArr + i))) == 0)
+ return;
+ }
+
+ if (peraggstate->distinctCount == peraggstate->distinctSize)
+ {
+ peraggstate->distinctSize *= 2;
+ peraggstate->distinctArr = (int64 *) repalloc(peraggstate->distinctArr,
+ sizeof(int64) * peraggstate->distinctSize);
+ }
+ *((int64 *)peraggstate->distinctArr + peraggstate->distinctCount) = fcinfo->args[1].value;
+ }
+ else if (peraggstate->distinct_typlen == -1)
+ {
+ uint64 len, tmp_len = 0;
+ int8 *arr = (int8 *) peraggstate->distinctArr;
+ int8 *arg = (int8 *) DatumGetPointer(fcinfo->args[1].value);
+ int64 varlen = VARSIZE_ANY(fcinfo->args[1].value);
+
+ for (i = 0; i < peraggstate->distinctCount; i++)
+ {
+ uint64 curr_len;
+ if (DatumGetInt32(FunctionCall2(&peraggstate->distinct_cmpfn,
+ fcinfo->args[1].value,
+ PointerGetDatum(arr)) == 0))
+ return;
+ curr_len = VARSIZE_ANY(arr);
+ tmp_len += curr_len;
+ arr += curr_len;
+ }
+
+ if (peraggstate->distinctCount == 0)
+ {
+ len = sh_pow2(varlen);
+ peraggstate->distinctArr = (int8 *) palloc0(len);
+ }
+ else
+ {
+ len = sh_pow2(tmp_len + varlen);
+ if (len > peraggstate->distinctSize)
+ peraggstate->distinctArr = (int8 *) repalloc(peraggstate->distinctArr,
+ len);
+ }
+ arr = (int8 *) peraggstate->distinctArr + tmp_len;
+ peraggstate->distinctSize = len;
+ for (i = 0; i < varlen; i++) *arr++ = *arg++;
+ }
+
+ peraggstate->distinctCount++;
+ MemoryContextSwitchTo(tupleContext);
+ }
+
if (peraggstate->transfn.fn_strict)
{
/*
@@ -428,6 +519,9 @@ advance_windowaggregate_base(WindowAggState *winstate,
ExprContext *econtext = winstate->tmpcontext;
ExprState *filter = wfuncstate->aggfilter;
+ if (wfuncstate->wfunc->windistinct)
+ return false;
+
oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
/* Skip anything FILTERed out */
@@ -2610,8 +2704,6 @@ ExecReScanWindowAgg(WindowAggState *node)
/*
* initialize_peragg
- *
- * Almost same as in nodeAgg.c, except we don't support DISTINCT currently.
*/
static WindowStatePerAggData *
initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
@@ -2651,6 +2743,27 @@ initialize_peragg(WindowAggState *winstate, WindowFunc *wfunc,
wfunc->winfnoid);
aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple);
+ if (wfunc->windistinct)
+ {
+ Oid opclass,
+ opfamily,
+ cmp_func_oid;
+ HeapTuple typeTuple;
+ Form_pg_type typeform;
+
+ opclass = GetDefaultOpClass(wfunc->winaggargtype, BTREE_AM_OID);
+ opfamily = get_opclass_family(opclass);
+ cmp_func_oid = get_opfamily_proc(opfamily, wfunc->winaggargtype, wfunc->winaggargtype, BTORDER_PROC);
+ fmgr_info(cmp_func_oid, &peraggstate->distinct_cmpfn);
+
+ typeTuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(wfunc->winaggargtype));
+ if (!HeapTupleIsValid(typeTuple))
+ elog(ERROR, "cache lookup failed for type %u", wfunc->winaggargtype);
+ typeform = (Form_pg_type) GETSTRUCT(typeTuple);
+ peraggstate->distinct_typlen = typeform->typlen;
+ ReleaseSysCache(typeTuple);
+ }
+
/*
* Figure out whether we want to use the moving-aggregate implementation,
* and collect the right set of fields from the pg_attribute entry.
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 54ad62bb7f..fb8058af24 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1493,6 +1493,8 @@ _copyWindowFunc(const WindowFunc *from)
COPY_SCALAR_FIELD(winref);
COPY_SCALAR_FIELD(winstar);
COPY_SCALAR_FIELD(winagg);
+ COPY_SCALAR_FIELD(windistinct);
+ COPY_SCALAR_FIELD(winaggargtype);
COPY_LOCATION_FIELD(location);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 5b1ba143b1..24a03097a0 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -264,6 +264,8 @@ _equalWindowFunc(const WindowFunc *a, const WindowFunc *b)
COMPARE_SCALAR_FIELD(winref);
COMPARE_SCALAR_FIELD(winstar);
COMPARE_SCALAR_FIELD(winagg);
+ COMPARE_SCALAR_FIELD(windistinct);
+ COMPARE_SCALAR_FIELD(winaggargtype);
COMPARE_LOCATION_FIELD(location);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index d76fae44b8..c40dbd0cb4 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1162,6 +1162,8 @@ _outWindowFunc(StringInfo str, const WindowFunc *node)
WRITE_UINT_FIELD(winref);
WRITE_BOOL_FIELD(winstar);
WRITE_BOOL_FIELD(winagg);
+ WRITE_BOOL_FIELD(windistinct);
+ WRITE_OID_FIELD(winaggargtype);
WRITE_LOCATION_FIELD(location);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 551ce6c41c..aaa327ce87 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -653,6 +653,8 @@ _readWindowFunc(void)
READ_UINT_FIELD(winref);
READ_BOOL_FIELD(winstar);
READ_BOOL_FIELD(winagg);
+ READ_BOOL_FIELD(windistinct);
+ READ_OID_FIELD(winaggargtype);
READ_LOCATION_FIELD(location);
READ_DONE();
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 2d3ec22407..5cb87c7073 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2451,6 +2451,8 @@ eval_const_expressions_mutator(Node *node,
newexpr->winref = expr->winref;
newexpr->winstar = expr->winstar;
newexpr->winagg = expr->winagg;
+ newexpr->windistinct = expr->windistinct;
+ newexpr->winaggargtype = expr->winaggargtype;
newexpr->location = expr->location;
return (Node *) newexpr;
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 9c3b6ad916..c663403bfb 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -825,13 +825,21 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
wfunc->location = location;
/*
- * agg_star is allowed for aggregate functions but distinct isn't
+ * In a window function, DISTINCT is allowed only for aggregate
+ * functions and is handled in nodeWindowAgg.c separately from
+ * when it occurs outside of a window function.
*/
if (agg_distinct)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("DISTINCT is not implemented for window functions"),
- parser_errposition(pstate, location)));
+ {
+ if (!wfunc->winagg)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("DISTINCT is only allowed in aggregate window functions"),
+ parser_errposition(pstate, location)));
+ wfunc->windistinct = true;
+ agg_distinct = false;
+ wfunc->winaggargtype = actual_arg_types[0];
+ }
/*
* Reject attempt to call a parameterless aggregate without (*)
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 116e00bce4..bf4a86b0a9 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -9464,7 +9464,11 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wfunc->winstar)
appendStringInfoChar(buf, '*');
else
+ {
+ if (wfunc->windistinct)
+ appendStringInfoString(buf, "DISTINCT ");
get_rule_expr((Node *) wfunc->args, context, true);
+ }
if (wfunc->aggfilter != NULL)
{
diff --git a/src/include/common/int.h b/src/include/common/int.h
index a2972218e7..6fb2885977 100644
--- a/src/include/common/int.h
+++ b/src/include/common/int.h
@@ -434,4 +434,24 @@ pg_mul_u64_overflow(uint64 a, uint64 b, uint64 *result)
#endif
}
+
+/* calculate ceil(log base 2) of num */
+static inline uint64
+sh_log2(uint64 num)
+{
+ int i;
+ uint64 limit;
+
+ for (i = 0, limit = 1; limit < num; i++, limit <<= 1)
+ ;
+ return i;
+}
+
+/* calculate first power of 2 >= num */
+static inline uint64
+sh_pow2(uint64 num)
+{
+ return ((uint64) 1) << sh_log2(num);
+}
+
#endif /* COMMON_INT_H */
diff --git a/src/include/lib/simplehash.h b/src/include/lib/simplehash.h
index 5a6783f653..669d97cac2 100644
--- a/src/include/lib/simplehash.h
+++ b/src/include/lib/simplehash.h
@@ -57,6 +57,8 @@
* backwards, unless they're empty or already at their optimal position.
*/
+#include "common/int.h"
+
/* helpers */
#define SH_MAKE_PREFIX(a) CppConcat(a,_)
#define SH_MAKE_NAME(name) SH_MAKE_NAME_(SH_MAKE_PREFIX(SH_PREFIX),name)
@@ -215,27 +217,6 @@ SH_SCOPE void SH_STAT(SH_TYPE * tb);
#ifndef SIMPLEHASH_H
#define SIMPLEHASH_H
-/* FIXME: can we move these to a central location? */
-
-/* calculate ceil(log base 2) of num */
-static inline uint64
-sh_log2(uint64 num)
-{
- int i;
- uint64 limit;
-
- for (i = 0, limit = 1; limit < num; i++, limit <<= 1)
- ;
- return i;
-}
-
-/* calculate first power of 2 >= num */
-static inline uint64
-sh_pow2(uint64 num)
-{
- return ((uint64) 1) << sh_log2(num);
-}
-
#ifdef FRONTEND
#define sh_error(...) pg_log_error(__VA_ARGS__)
#define sh_log(...) pg_log_info(__VA_ARGS__)
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index d73be2ad46..710eea443f 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -379,6 +379,8 @@ typedef struct WindowFunc
Index winref; /* index of associated WindowClause */
bool winstar; /* true if argument list was really '*' */
bool winagg; /* is function a simple aggregate? */
+ bool windistinct; /* is function a DISTINCT aggregate? */
+ Oid winaggargtype; /* arg type of function, used for DISTINCT */
int location; /* token location, or -1 if unknown */
} WindowFunc;
Krasiyan Andreev <krasiyan@gmail.com> writes:
I want to propose to you an old patch for Postgres 11, off-site developed
by Oliver Ford,
but I have permission from him to publish it and to continue it's
development,
that allow distinct aggregates, like select sum(distinct nums) within a
window function.
I started to respond by asking whether that's well-defined, but
reading down further I see that that's not actually what the feature
is: what it is is attaching DISTINCT to a window function itself.
I'd still ask whether it's well-defined though, or even minimally
sensible. Window functions are generally supposed to produce one
row per input row --- how does that square with the implicit row
merging of DISTINCT? They're also typically row-order-sensitive
--- how does that work with DISTINCT? Also, to the extent that
this is sensible, can't you get the same results already today
with appropriate use of window framing options?
It's a WIP, because it doesn't have tests yet (I will add them later) and
also, it works for a int, float, and numeric types,
As a rule of thumb, operations like this should not be coded to be
datatype-specific. We threw out some features in the original window
function patch until they could be rewritten to not be limited to a
hard-coded set of data types (cf commit 0a459cec9), and I don't see
why we'd apply a lesser standard here. Certainly DISTINCT for
aggregates has no such limitation.
regards, tom lane
Tom Lane schrieb am 13.01.2020 um 15:19:
what it is is attaching DISTINCT to a window function itself. I'd still ask whether it's well-defined though, or even minimally sensible. Window functions are generally supposed to produce one row per input row --- how does that square with the implicit row merging of DISTINCT? They're also typically row-order-sensitive --- how does that work with DISTINCT? Also, to the extent that this is sensible, can't you get the same results already today with appropriate use of window framing options?
I find the example using array_agg() and cumulative window functions a
bit confusing as well, but I think there are situations where having this
is really helpful, e.g.:
count(distinct some_column) over (partition by something)
I know it's not an argument, but Oracle supports this and porting
queries like that from Oracle to Postgres isn't really fun.
Thomas
I understand yours note about datatype-specific operations, so I need to
think more generic about it.
About yours additional note, I think that it is not possible to get easy
the same result with appropriate use of window framing options,
because "exclude ties" will not exclude "current row" itself, only peers of
it. So, that is the only difference and reason of DISTINCT aggregate.
Maybe, if we can specify at the same time to "exclude ties" and "exclude
current row" itself, there will not be need of DISTINCT, but right now
I think that nor "exclude ties" nor "exclude groups" or "exclude current
row", can specify it, because they can't be nested or used at the same time.
На пн, 13.01.2020 г. в 16:19 Tom Lane <tgl@sss.pgh.pa.us> написа:
Show quoted text
Krasiyan Andreev <krasiyan@gmail.com> writes:
I want to propose to you an old patch for Postgres 11, off-site developed
by Oliver Ford,
but I have permission from him to publish it and to continue it's
development,
that allow distinct aggregates, like select sum(distinct nums) within a
window function.I started to respond by asking whether that's well-defined, but reading down further I see that that's not actually what the feature is: what it is is attaching DISTINCT to a window function itself. I'd still ask whether it's well-defined though, or even minimally sensible. Window functions are generally supposed to produce one row per input row --- how does that square with the implicit row merging of DISTINCT? They're also typically row-order-sensitive --- how does that work with DISTINCT? Also, to the extent that this is sensible, can't you get the same results already today with appropriate use of window framing options?It's a WIP, because it doesn't have tests yet (I will add them later) and
also, it works for a int, float, and numeric types,As a rule of thumb, operations like this should not be coded to be
datatype-specific. We threw out some features in the original window
function patch until they could be rewritten to not be limited to a
hard-coded set of data types (cf commit 0a459cec9), and I don't see
why we'd apply a lesser standard here. Certainly DISTINCT for
aggregates has no such limitation.regards, tom lane
On 13/01/2020 15:19, Tom Lane wrote:
Krasiyan Andreev <krasiyan@gmail.com> writes:
I want to propose to you an old patch for Postgres 11, off-site developed
by Oliver Ford,
but I have permission from him to publish it and to continue it's
development,
that allow distinct aggregates, like select sum(distinct nums) within a
window function.I started to respond by asking whether that's well-defined, but reading down further I see that that's not actually what the feature is: what it is is attaching DISTINCT to a window function itself. I'd still ask whether it's well-defined though, or even minimally sensible. Window functions are generally supposed to produce one row per input row --- how does that square with the implicit row merging of DISTINCT? They're also typically row-order-sensitive --- how does that work with DISTINCT?
It's a little strange because the spec says:
<q>
If the window ordering clause or the window framing clause of the window
structure descriptor that describes the <window name or specification>
is present, then no <aggregate function> simply contained in <window
function> shall specify DISTINCT or <ordered set function>.
</q>
So it seems to be well defined if all you have is a partition.
But then it also says:
<q>
DENSE_RANK() OVER WNS is equivalent to the <window function>:
COUNT (DISTINCT ROW ( VE 1 , ..., VE N ) )
OVER (WNS1 RANGE UNBOUNDED PRECEDING)
</q>
And that kind of looks like a framing clause there.
Also, to the extent that
this is sensible, can't you get the same results already today
with appropriate use of window framing options?
I don't see how.
I have sometimes wanted this feature so I am +1 on us getting at least a
minimal form of it.
--
Vik
I have currently suspended development of this patch, based on it's review,
but I will continue development of the other Oliver Ford's work about
adding support of respect/ignore nulls
for lag(),lead(),first_value(),last_value() and nth_value() and from
first/last for nth_value() patch,
but I am not sure how to proceed with it's implementation and any feedback
will be very helpful.
I have dropped support of from first/last for nth_value(), but also I
reimplemented it in a different way,
by using negative number for the position argument, to be able to get the
same frame in exact reverse order.
After that patch becomes much more simple and major concerns about
precedence hack has gone,
but maybe it can be additionally simplified.
I have not renamed special bool type "ignorenulls", because it is probably
not acceptable way for calling extra version
of window functions (but it makes things very easy and it can reuse
frames), but I removed the other special bool type "fromlast".
Attached file is for PostgreSQL 13 (master git branch) and I will add it
now to a March commit fest, to be able to track changes.
Everything works and patch is in very good shape, all tests are passed and
also, I use it from some time for SQL analysis purposes
(because ignore nulls is one of the most needed feature in OLAP/BI area and
Oracle, Amazon Redshift, even Informix have it).
After patch review and suggestions about what to do with special bool type
and unreserved keywords, I will reimplement it, if needed.
На пн, 13.01.2020 г. в 18:19 Vik Fearing <vik.fearing@2ndquadrant.com>
написа:
Show quoted text
On 13/01/2020 15:19, Tom Lane wrote:
Krasiyan Andreev <krasiyan@gmail.com> writes:
I want to propose to you an old patch for Postgres 11, off-site
developed
by Oliver Ford,
but I have permission from him to publish it and to continue it's
development,
that allow distinct aggregates, like select sum(distinct nums) within a
window function.I started to respond by asking whether that's well-defined, but reading down further I see that that's not actually what the feature is: what it is is attaching DISTINCT to a window function itself. I'd still ask whether it's well-defined though, or even minimally sensible. Window functions are generally supposed to produce one row per input row --- how does that square with the implicit row merging of DISTINCT? They're also typically row-order-sensitive --- how does that work with DISTINCT?It's a little strange because the spec says:
<q>
If the window ordering clause or the window framing clause of the window
structure descriptor that describes the <window name or specification>
is present, then no <aggregate function> simply contained in <window
function> shall specify DISTINCT or <ordered set function>.
</q>So it seems to be well defined if all you have is a partition.
But then it also says:
<q>
DENSE_RANK() OVER WNS is equivalent to the <window function>:
COUNT (DISTINCT ROW ( VE 1 , ..., VE N ) )
OVER (WNS1 RANGE UNBOUNDED PRECEDING)
</q>And that kind of looks like a framing clause there.
Also, to the extent that
this is sensible, can't you get the same results already today
with appropriate use of window framing options?I don't see how.
I have sometimes wanted this feature so I am +1 on us getting at least a
minimal form of it.--
Vik
Attachments:
pg13_ignore_nulls.patchtext/x-patch; charset=US-ASCII; name=pg13_ignore_nulls.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 28035f1635..3d73c96891 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15702,7 +15702,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<function>
lag(<replaceable class="parameter">value</replaceable> <type>anyelement</type>
[, <replaceable class="parameter">offset</replaceable> <type>integer</type>
- [, <replaceable class="parameter">default</replaceable> <type>anyelement</type> ]])
+ [, <replaceable class="parameter">default</replaceable> <type>anyelement</type> ]]) [null_treatment]
</function>
</entry>
<entry>
@@ -15731,7 +15731,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<function>
lead(<replaceable class="parameter">value</replaceable> <type>anyelement</type>
[, <replaceable class="parameter">offset</replaceable> <type>integer</type>
- [, <replaceable class="parameter">default</replaceable> <type>anyelement</type> ]])
+ [, <replaceable class="parameter">default</replaceable> <type>anyelement</type> ]]) [null_treatment]
</function>
</entry>
<entry>
@@ -15757,7 +15757,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<indexterm>
<primary>first_value</primary>
</indexterm>
- <function>first_value(<replaceable class="parameter">value</replaceable> <type>any</type>)</function>
+ <function>first_value(<replaceable class="parameter">value</replaceable> <type>any</type>) [null_treatment]</function>
</entry>
<entry>
<type>same type as <replaceable class="parameter">value</replaceable></type>
@@ -15773,7 +15773,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<indexterm>
<primary>last_value</primary>
</indexterm>
- <function>last_value(<replaceable class="parameter">value</replaceable> <type>any</type>)</function>
+ <function>last_value(<replaceable class="parameter">value</replaceable> <type>any</type>) [null_treatment]</function>
</entry>
<entry>
<type>same type as <replaceable class="parameter">value</replaceable></type>
@@ -15790,7 +15790,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<primary>nth_value</primary>
</indexterm>
<function>
- nth_value(<replaceable class="parameter">value</replaceable> <type>any</type>, <replaceable class="parameter">nth</replaceable> <type>integer</type>)
+ nth_value(<replaceable class="parameter">value</replaceable> <type>any</type>, <replaceable class="parameter">nth</replaceable> <type>integer</type>) [null_treatment]
</function>
</entry>
<entry>
@@ -15806,6 +15806,16 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
</tgroup>
</table>
+ <para>
+ In <xref linkend="functions-window-table"/>, <replaceable>null_treatment</replaceable> is one of:
+ <synopsis>
+ RESPECT NULLS
+ IGNORE NULLS
+ </synopsis>
+ <literal>RESPECT NULLS</literal> specifies the default behavior to include nulls in the result.
+ <literal>IGNORE NULLS</literal> ignores any null values when determining a result.
+ </para>
+
<para>
All of the functions listed in
<xref linkend="functions-window-table"/> depend on the sort ordering
@@ -15843,17 +15853,11 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
- The SQL standard defines a <literal>RESPECT NULLS</literal> or
- <literal>IGNORE NULLS</literal> option for <function>lead</function>, <function>lag</function>,
- <function>first_value</function>, <function>last_value</function>, and
- <function>nth_value</function>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</literal>.
- Likewise, the standard's <literal>FROM FIRST</literal> or <literal>FROM LAST</literal>
- option for <function>nth_value</function> is not implemented: only the
- default <literal>FROM FIRST</literal> behavior is supported. (You can achieve
- the result of <literal>FROM LAST</literal> by reversing the <literal>ORDER BY</literal>
- ordering.)
+ The SQL standard defines a <literal>FROM FIRST</literal> or <literal>FROM LAST</literal>
+ option for <function>nth_value</function>. This is not implemented in <productname>PostgreSQL</productname>:
+ only the default <literal>FROM FIRST</literal> behavior is supported.
+ (You can achieve the result of <literal>FROM LAST</literal> by using negative number for the position argument,
+ as is done in many languages to indicate a <literal>FROM END</literal> index.)
</para>
</note>
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index 9f840ddfd2..a355e379e7 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -508,9 +508,9 @@ T612 Advanced OLAP operations YES
T613 Sampling YES
T614 NTILE function YES
T615 LEAD and LAG functions YES
-T616 Null treatment option for LEAD and LAG functions NO
+T616 Null treatment option for LEAD and LAG functions YES
T617 FIRST_VALUE and LAST_VALUE function YES
-T618 NTH_VALUE function NO function exists, but some options missing
+T618 NTH_VALUE function YES FROM LAST is supported by using negative number for the position argument
T619 Nested window functions NO
T620 WINDOW clause: GROUPS option YES
T621 Enhanced numeric functions YES
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 96e7fdbcfe..db369b1f9a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -154,6 +154,7 @@ static Node *makeBitStringConst(char *str, int location);
static Node *makeNullAConst(int location);
static Node *makeAConst(Value *v, int location);
static Node *makeBoolAConst(bool state, int location);
+static Node *makeTypedBoolAConst(bool state, char *type, int location);
static RoleSpec *makeRoleSpec(RoleSpecType type, int location);
static void check_qualified_name(List *names, core_yyscan_t yyscanner);
static List *check_func_name(List *names, core_yyscan_t yyscanner);
@@ -569,7 +570,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> xml_namespace_list
%type <target> xml_namespace_el
-%type <node> func_application func_expr_common_subexpr
+%type <node> func_application func_expr_common_subexpr func_expr_respect_ignore
%type <node> func_expr func_expr_windowless
%type <node> common_table_expr
%type <with> with_clause opt_with_clause
@@ -577,6 +578,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> within_group_clause
%type <node> filter_clause
+%type <ival> null_treatment_clause
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
opt_frame_clause frame_extent frame_bound
@@ -642,14 +644,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
EXCLUDE EXCLUDING EXCLUSIVE EXECUTE EXISTS EXPLAIN EXPRESSION
EXTENSION EXTERNAL EXTRACT
- FALSE_P FAMILY FETCH FILTER FIRST_P FLOAT_P FOLLOWING FOR
+ FALSE_P FAMILY FETCH FILTER FIRST_P FIRST_VALUE FLOAT_P FOLLOWING FOR
FORCE FOREIGN FORWARD FREEZE FROM FULL FUNCTION FUNCTIONS
GENERATED GLOBAL GRANT GRANTED GREATEST GROUP_P GROUPING GROUPS
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
+ IDENTITY_P IF_P IGNORE_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -658,14 +660,14 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
KEY
- LABEL LANGUAGE LARGE_P LAST_P LATERAL_P
- LEADING LEAKPROOF LEAST LEFT LEVEL LIKE LIMIT LISTEN LOAD LOCAL
+ LABEL LAG LANGUAGE LARGE_P LAST_P LAST_VALUE LATERAL_P
+ LEAD LEADING LEAKPROOF LEAST LEFT LEVEL LIKE LIMIT LISTEN LOAD LOCAL
LOCALTIME LOCALTIMESTAMP LOCATION LOCK_P LOCKED LOGGED
MAPPING MATCH MATERIALIZED MAXVALUE METHOD MINUTE_P MINVALUE MODE MONTH_P MOVE
NAME_P NAMES NATIONAL NATURAL NCHAR NEW NEXT NO NONE
- NOT NOTHING NOTIFY NOTNULL NOWAIT NULL_P NULLIF
+ NOT NOTHING NOTIFY NOTNULL NOWAIT NTH_VALUE NULL_P NULLIF
NULLS_P NUMERIC
OBJECT_P OF OFF OFFSET OIDS OLD ON ONLY OPERATOR OPTION OPTIONS OR
@@ -680,7 +682,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFERENCING
REFRESH REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP
ROUTINE ROUTINES ROW ROWS RULE
SAVEPOINT SCHEMA SCHEMAS SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -724,6 +726,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
/* Precedence: lowest to highest */
%nonassoc SET /* see relation_expr_opt_alias */
+%nonassoc FIRST_VALUE LAG LAST_VALUE LEAD NTH_VALUE
%left UNION EXCEPT
%left INTERSECT
%left OR
@@ -13769,6 +13772,10 @@ func_application: func_name '(' ')'
}
;
+null_treatment_clause:
+ RESPECT NULLS_P { $$ = 0; }
+ | IGNORE_P NULLS_P { $$ = WINFUNC_OPT_IGNORE_NULLS; }
+ ;
/*
* func_expr and its cousin func_expr_windowless are split out from c_expr just
@@ -13816,8 +13823,133 @@ func_expr: func_application within_group_clause filter_clause over_clause
}
| func_expr_common_subexpr
{ $$ = $1; }
+ | func_expr_respect_ignore over_clause
+ {
+ FuncCall *n = (FuncCall *) $1;
+ n->over = $2;
+ $$ = (Node *) n;
+ }
+ ;
+
+func_expr_respect_ignore:
+ FIRST_VALUE '(' func_arg_list opt_sort_clause ')' null_treatment_clause
+ {
+ FuncCall *n;
+ List *l = $3;
+ int winFuncArgs = $6;
+
+ /* Convert Ignore Nulls option to bool */
+ if (winFuncArgs & WINFUNC_OPT_IGNORE_NULLS)
+ l = lappend(l, makeTypedBoolAConst(true, "ignorenulls", @2));
+
+ n = makeFuncCall(list_make1(makeString("first_value")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | FIRST_VALUE '(' func_arg_list opt_sort_clause ')'
+ {
+ FuncCall *n;
+ List *l = $3;
+
+ n = makeFuncCall(list_make1(makeString("first_value")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | LAG '(' func_arg_list opt_sort_clause ')' null_treatment_clause
+ {
+ FuncCall *n;
+ List *l = $3;
+ int winFuncArgs = $6;
+
+ /* Convert Ignore Nulls option to bool */
+ if (winFuncArgs & WINFUNC_OPT_IGNORE_NULLS)
+ l = lappend(l, makeTypedBoolAConst(true, "ignorenulls", @2));
+
+ n = makeFuncCall(list_make1(makeString("lag")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | LAG '(' func_arg_list opt_sort_clause ')'
+ {
+ FuncCall *n;
+ List *l = $3;
+
+ n = makeFuncCall(list_make1(makeString("lag")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | LAST_VALUE '(' func_arg_list opt_sort_clause ')' null_treatment_clause
+ {
+ FuncCall *n;
+ List *l = $3;
+ int winFuncArgs = $6;
+
+ /* Convert Ignore Nulls option to bool */
+ if (winFuncArgs & WINFUNC_OPT_IGNORE_NULLS)
+ l = lappend(l, makeTypedBoolAConst(true, "ignorenulls", @2));
+
+ n = makeFuncCall(list_make1(makeString("last_value")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | LAST_VALUE '(' func_arg_list opt_sort_clause ')'
+ {
+ FuncCall *n;
+ List *l = $3;
+
+ n = makeFuncCall(list_make1(makeString("last_value")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | LEAD '(' func_arg_list opt_sort_clause ')' null_treatment_clause
+ {
+ FuncCall *n;
+ List *l = $3;
+ int winFuncArgs = $6;
+
+ /* Convert Ignore Nulls option to bool */
+ if (winFuncArgs & WINFUNC_OPT_IGNORE_NULLS)
+ l = lappend(l, makeTypedBoolAConst(true, "ignorenulls", @2));
+
+ n = makeFuncCall(list_make1(makeString("lead")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | LEAD '(' func_arg_list opt_sort_clause ')'
+ {
+ FuncCall *n;
+ List *l = $3;
+
+ n = makeFuncCall(list_make1(makeString("lead")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | NTH_VALUE '(' func_arg_list opt_sort_clause ')' null_treatment_clause
+ {
+ FuncCall *n;
+ List *l = $3;
+ int winFuncArgs = $6;
+
+ /* Convert Nulls option to bool */
+ if (winFuncArgs & WINFUNC_OPT_IGNORE_NULLS)
+ l = lappend(l, makeTypedBoolAConst(true, "ignorenulls", @2));
+
+ n = makeFuncCall(list_make1(makeString("nth_value")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
+ | NTH_VALUE '(' func_arg_list opt_sort_clause ')'
+ {
+ FuncCall *n;
+ List *l = $3;
+
+ n = makeFuncCall(list_make1(makeString("nth_value")), l, @1);
+ n->agg_order = $4;
+ $$ = (Node *) n;
+ }
;
+
/*
* As func_expr but does not accept WINDOW functions directly
* (but they can still be contained in arguments for functions etc).
@@ -15225,6 +15357,7 @@ unreserved_keyword:
| FAMILY
| FILTER
| FIRST_P
+ | FIRST_VALUE
| FOLLOWING
| FORCE
| FORWARD
@@ -15240,6 +15373,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE_P
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -15260,9 +15394,12 @@ unreserved_keyword:
| ISOLATION
| KEY
| LABEL
+ | LAG
| LANGUAGE
| LARGE_P
| LAST_P
+ | LAST_VALUE
+ | LEAD
| LEAKPROOF
| LEVEL
| LISTEN
@@ -15290,6 +15427,7 @@ unreserved_keyword:
| NOTHING
| NOTIFY
| NOWAIT
+ | NTH_VALUE
| NULLS_P
| OBJECT_P
| OF
@@ -15341,6 +15479,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
@@ -15815,6 +15954,15 @@ makeAConst(Value *v, int location)
*/
static Node *
makeBoolAConst(bool state, int location)
+{
+ return makeTypedBoolAConst(state, "bool", location);
+}
+
+/* makeTypedBoolAConst()
+ * Create an A_Const string node from a boolean and store inside the specified type.
+ */
+static Node *
+makeTypedBoolAConst(bool state, char *type, int location)
{
A_Const *n = makeNode(A_Const);
@@ -15822,7 +15970,7 @@ makeBoolAConst(bool state, int location)
n->val.val.str = (state ? "t" : "f");
n->location = location;
- return makeTypeCast((Node *)n, SystemTypeName("bool"), -1);
+ return makeTypeCast((Node *)n, SystemTypeName(type), -1);
}
/* makeRoleSpec
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 158784474d..ae7e821ab2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -9437,6 +9437,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
int nargs;
List *argnames;
ListCell *l;
+ bool ignorenulls = false;
if (list_length(wfunc->args) > FUNC_MAX_ARGS)
ereport(ERROR,
@@ -9463,7 +9464,32 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wfunc->winstar)
appendStringInfoChar(buf, '*');
else
- get_rule_expr((Node *) wfunc->args, context, true);
+ {
+ ListCell *arglist;
+ Node *argnode = (Node *) wfunc->args;
+
+ get_rule_expr(argnode, context, true);
+
+ /* Determine if IGNORE NULLS should be appended */
+ foreach(arglist, (List *) argnode)
+ {
+ Node *arg = (Node *) lfirst(arglist);
+ if (nodeTag(arg) == T_Const)
+ {
+ Const *constnode = (Const *) arg;
+ if (constnode->consttype == IGNORENULLSOID)
+ {
+ /* parser does not save RESPECT NULLS arguments */
+ ignorenulls = true;
+ buf->len -= 2;
+ }
+ }
+ }
+ }
+
+ appendStringInfoChar(buf, ')');
+ if (ignorenulls)
+ appendStringInfoString(buf, " IGNORE NULLS");
if (wfunc->aggfilter != NULL)
{
@@ -9471,7 +9497,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, " OVER ");
foreach(l, context->windowClause)
{
@@ -9649,6 +9675,10 @@ get_const_expr(Const *constval, deparse_context *context, int showtype)
appendStringInfoString(buf, "false");
break;
+ case IGNORENULLSOID:
+ showtype = -1;
+ break;
+
default:
simple_quote_literal(buf, extval);
break;
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index f0c8ae686d..bc639b1883 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -38,7 +38,12 @@ typedef struct
static bool rank_up(WindowObject winobj);
static Datum leadlag_common(FunctionCallInfo fcinfo,
bool forward, bool withoffset, bool withdefault);
-
+static Datum leadlag_common_ignore_nulls(FunctionCallInfo fcinfo,
+ bool forward, bool withoffset, bool withdefault);
+static Datum
+window_nth_value_ignorenulls_common(FunctionCallInfo fcinfo, int32 nth);
+static Datum
+window_nth_value_respectnulls_common(FunctionCallInfo fcinfo, int32 nth);
/*
* utility routine for *_rank functions.
@@ -328,6 +333,79 @@ leadlag_common(FunctionCallInfo fcinfo,
PG_RETURN_DATUM(result);
}
+static Datum
+leadlag_common_ignore_nulls(FunctionCallInfo fcinfo,
+ bool forward, bool withoffset, bool withdefault)
+{
+ WindowObject winobj = PG_WINDOW_OBJECT();
+ int32 offset;
+ Datum result;
+ bool isnull;
+ bool isout = false;
+ int32 notnull_offset = 0, tmp_offset = 0;
+
+ if (withoffset)
+ {
+ offset = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
+ if (isnull)
+ PG_RETURN_NULL();
+ if (offset < 0)
+ {
+ offset = abs(offset);
+ forward = !forward;
+ } else if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0, 0,
+ WINDOW_SEEK_CURRENT,
+ false,
+ &isnull, &isout);
+ if (isnull || isout)
+ PG_RETURN_NULL();
+ else
+ PG_RETURN_DATUM(result);
+ }
+ }
+ else
+ offset = 1;
+
+ while (notnull_offset < offset)
+ {
+ tmp_offset++;
+ result = WinGetFuncArgInPartition(winobj, 0,
+ (forward ? tmp_offset : -tmp_offset),
+ WINDOW_SEEK_CURRENT,
+ false,
+ &isnull, &isout);
+ if (isout)
+ goto out_of_frame;
+ else if (!isnull)
+ notnull_offset++;
+ }
+
+ result = WinGetFuncArgInPartition(winobj, 0,
+ (forward ? tmp_offset : -tmp_offset),
+ WINDOW_SEEK_CURRENT,
+ false,
+ &isnull, &isout);
+ if (isout)
+ goto out_of_frame;
+ else
+ PG_RETURN_DATUM(result);
+
+ out_of_frame:
+ /*
+ * target row is out of the partition; supply default value if
+ * provided. Otherwise return NULL.
+ */
+ if (withdefault)
+ {
+ result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ PG_RETURN_DATUM(result);
+ }
+ else
+ PG_RETURN_NULL();
+}
+
/*
* lag
* returns the value of VE evaluated on a row that is 1
@@ -363,6 +441,24 @@ window_lag_with_offset_and_default(PG_FUNCTION_ARGS)
return leadlag_common(fcinfo, false, true, true);
}
+Datum
+window_lag_nulls_opt(PG_FUNCTION_ARGS)
+{
+ return leadlag_common_ignore_nulls(fcinfo, false, false, false);
+}
+
+Datum
+window_lag_with_offset_nulls_opt(PG_FUNCTION_ARGS)
+{
+ return leadlag_common_ignore_nulls(fcinfo, false, true, false);
+}
+
+Datum
+window_lag_with_offset_and_default_nulls_opt(PG_FUNCTION_ARGS)
+{
+ return leadlag_common_ignore_nulls(fcinfo, false, true, true);
+}
+
/*
* lead
* returns the value of VE evaluated on a row that is 1
@@ -398,6 +494,24 @@ window_lead_with_offset_and_default(PG_FUNCTION_ARGS)
return leadlag_common(fcinfo, true, true, true);
}
+Datum
+window_lead_nulls_opt(PG_FUNCTION_ARGS)
+{
+ return leadlag_common_ignore_nulls(fcinfo, true, false, false);
+}
+
+Datum
+window_lead_with_offset_nulls_opt(PG_FUNCTION_ARGS)
+{
+ return leadlag_common_ignore_nulls(fcinfo, true, true, false);
+}
+
+Datum
+window_lead_with_offset_and_default_nulls_opt(PG_FUNCTION_ARGS)
+{
+ return leadlag_common_ignore_nulls(fcinfo, true, true, true);
+}
+
/*
* first_value
* return the value of VE evaluated on the first row of the
@@ -419,6 +533,31 @@ window_first_value(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(result);
}
+Datum
+window_first_value_nulls_opt(PG_FUNCTION_ARGS)
+{
+ WindowObject winobj = PG_WINDOW_OBJECT();
+ Datum result;
+ bool isnull,
+ isout;
+ int64 pos;
+
+ isout = false;
+ pos = 0;
+
+ while (!isout)
+ {
+ result = WinGetFuncArgInFrame(winobj, 0,
+ pos, WINDOW_SEEK_HEAD, false,
+ &isnull, &isout);
+ if (!isnull)
+ PG_RETURN_DATUM(result);
+ pos++;
+ }
+
+ PG_RETURN_NULL();
+}
+
/*
* last_value
* return the value of VE evaluated on the last row of the
@@ -440,35 +579,149 @@ window_last_value(PG_FUNCTION_ARGS)
PG_RETURN_DATUM(result);
}
+Datum
+window_last_value_nulls_opt(PG_FUNCTION_ARGS)
+{
+ WindowObject winobj = PG_WINDOW_OBJECT();
+ Datum result;
+ bool isnull,
+ isout;
+ int64 pos;
+
+ isout = false;
+ pos = 0;
+
+ while (!isout)
+ {
+ result = WinGetFuncArgInFrame(winobj, 0,
+ pos, WINDOW_SEEK_TAIL, false,
+ &isnull, &isout);
+ if (!isnull)
+ PG_RETURN_DATUM(result);
+ pos--;
+ }
+
+ PG_RETURN_NULL();
+}
+
/*
* nth_value
* return the value of VE evaluated on the n-th row from the first
* row of the window frame, per spec.
*/
-Datum
-window_nth_value(PG_FUNCTION_ARGS)
+static Datum
+window_nth_value_respectnulls_common(FunctionCallInfo fcinfo, int32 nth)
{
WindowObject winobj = PG_WINDOW_OBJECT();
bool const_offset;
Datum result;
bool isnull;
- int32 nth;
+ bool fromlast;
nth = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
if (isnull)
PG_RETURN_NULL();
const_offset = get_fn_expr_arg_stable(fcinfo->flinfo, 1);
- if (nth <= 0)
+ if (nth == 0)
ereport(ERROR,
- (errcode(ERRCODE_INVALID_ARGUMENT_FOR_NTH_VALUE),
- errmsg("argument of nth_value must be greater than zero")));
+ (errcode(ERRCODE_INVALID_ARGUMENT_FOR_NTH_VALUE),
+ errmsg("argument of nth_value must be greater or less than zero")));
+ else if (nth < 0)
+ {
+ nth = abs(nth);
+ fromlast = true;
+ }
+ else
+ fromlast = false;
+
+ result = WinGetFuncArgInFrame(winobj,
+ 0,
+ nth - 1,
+ fromlast ? WINDOW_SEEK_TAIL : WINDOW_SEEK_HEAD, const_offset,
+ &isnull,
+ NULL);
- result = WinGetFuncArgInFrame(winobj, 0,
- nth - 1, WINDOW_SEEK_HEAD, const_offset,
- &isnull, NULL);
if (isnull)
PG_RETURN_NULL();
PG_RETURN_DATUM(result);
}
+
+static Datum
+window_nth_value_ignorenulls_common(FunctionCallInfo fcinfo, int32 nth)
+{
+ WindowObject winobj = PG_WINDOW_OBJECT();
+ Datum result;
+ bool isnull,
+ isout;
+ bool fromlast;
+ int32 tmp_offset, notnull_offset = 0;
+
+ nth = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
+ if (isnull)
+ PG_RETURN_NULL();
+
+ if (nth == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_ARGUMENT_FOR_NTH_VALUE),
+ errmsg("argument of nth_value must be greater or less than zero")));
+ else if (nth < 0)
+ {
+ nth = abs(nth);
+ fromlast = true;
+ tmp_offset = 1;
+ }
+ else
+ {
+ fromlast = false;
+ tmp_offset = -1;
+ }
+
+ while (notnull_offset < nth)
+ {
+ fromlast ? tmp_offset-- : tmp_offset++;
+ result = WinGetFuncArgInFrame(winobj, 0,
+ tmp_offset, fromlast ? WINDOW_SEEK_TAIL : WINDOW_SEEK_HEAD,
+ false, &isnull, &isout);
+ if (isout)
+ PG_RETURN_NULL();
+ if (!isnull)
+ notnull_offset++;
+ }
+
+ result = WinGetFuncArgInFrame(winobj, 0,
+ tmp_offset, fromlast ? WINDOW_SEEK_TAIL : WINDOW_SEEK_HEAD,
+ false, &isnull, &isout);
+
+ if (isout || isnull)
+ PG_RETURN_NULL();
+
+ PG_RETURN_DATUM(result);
+}
+
+Datum
+window_nth_value(PG_FUNCTION_ARGS)
+{
+ WindowObject winobj = PG_WINDOW_OBJECT();
+ bool isnull;
+ int32 nth;
+
+ nth = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
+ if (isnull)
+ PG_RETURN_NULL();
+ PG_RETURN_DATUM(window_nth_value_respectnulls_common(fcinfo, nth));
+}
+
+Datum
+window_nth_value_with_nulls_opt(PG_FUNCTION_ARGS)
+{
+ WindowObject winobj = PG_WINDOW_OBJECT();
+ bool isnull;
+ int32 nth;
+
+ nth = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
+ if (isnull)
+ PG_RETURN_NULL();
+ PG_RETURN_DATUM(window_nth_value_ignorenulls_common(fcinfo, nth));
+}
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 07a86c7b7b..f07a9e442d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -9539,32 +9539,64 @@
{ oid => '3106', descr => 'fetch the preceding row value',
proname => 'lag', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement', prosrc => 'window_lag' },
+{ oid => '4191', descr => 'fetch the preceding row value with nulls option',
+ proname => 'lag', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement ignorenulls', prosrc => 'window_lag_nulls_opt' },
{ oid => '3107', descr => 'fetch the Nth preceding row value',
proname => 'lag', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement int4', prosrc => 'window_lag_with_offset' },
+{ oid => '4192', descr => 'fetch the Nth preceding row value with nulls option',
+ proname => 'lag', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement int4 ignorenulls',
+ prosrc => 'window_lag_with_offset_nulls_opt' },
{ oid => '3108', descr => 'fetch the Nth preceding row value with default',
proname => 'lag', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement int4 anyelement',
prosrc => 'window_lag_with_offset_and_default' },
+{ oid => '4193', descr => 'fetch the Nth preceding row value with default and nulls option',
+ proname => 'lag', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement int4 anyelement ignorenulls',
+ prosrc => 'window_lag_with_offset_and_default_nulls_opt' },
{ oid => '3109', descr => 'fetch the following row value',
proname => 'lead', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement', prosrc => 'window_lead' },
+{ oid => '4194', descr => 'fetch the following row value with nulls option',
+ proname => 'lead', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement ignorenulls', prosrc => 'window_lead_nulls_opt' },
{ oid => '3110', descr => 'fetch the Nth following row value',
proname => 'lead', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement int4', prosrc => 'window_lead_with_offset' },
+{ oid => '4195', descr => 'fetch the Nth following row value with nulls option',
+ proname => 'lead', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement int4 ignorenulls',
+ prosrc => 'window_lead_with_offset_nulls_opt' },
{ oid => '3111', descr => 'fetch the Nth following row value with default',
proname => 'lead', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement int4 anyelement',
prosrc => 'window_lead_with_offset_and_default' },
+{ oid => '4196', descr => 'fetch the Nth following row value with default and nulls option',
+ proname => 'lead', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement int4 anyelement ignorenulls',
+ prosrc => 'window_lead_with_offset_and_default_nulls_opt' },
{ oid => '3112', descr => 'fetch the first row value',
proname => 'first_value', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement', prosrc => 'window_first_value' },
+{ oid => '4197', descr => 'fetch the first row value with nulls option',
+ proname => 'first_value', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement ignorenulls', prosrc => 'window_first_value_nulls_opt' },
{ oid => '3113', descr => 'fetch the last row value',
proname => 'last_value', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement', prosrc => 'window_last_value' },
+{ oid => '4198', descr => 'fetch the last row value with nulls option',
+ proname => 'last_value', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement ignorenulls', prosrc => 'window_last_value_nulls_opt' },
{ oid => '3114', descr => 'fetch the Nth row value',
proname => 'nth_value', prokind => 'w', prorettype => 'anyelement',
proargtypes => 'anyelement int4', prosrc => 'window_nth_value' },
+{ oid => '4199', descr => 'fetch the Nth row value with nulls option',
+ proname => 'nth_value', prokind => 'w', prorettype => 'anyelement',
+ proargtypes => 'anyelement int4 ignorenulls',
+ prosrc => 'window_nth_value_with_nulls_opt' },
# functions for range types
{ oid => '3832', descr => 'I/O',
diff --git a/src/include/catalog/pg_type.dat b/src/include/catalog/pg_type.dat
index 4cf2b9df7b..3c847e6db5 100644
--- a/src/include/catalog/pg_type.dat
+++ b/src/include/catalog/pg_type.dat
@@ -594,5 +594,10 @@
typname => 'anyrange', typlen => '-1', typbyval => 'f', typtype => 'p',
typcategory => 'P', typinput => 'anyrange_in', typoutput => 'anyrange_out',
typreceive => '-', typsend => '-', typalign => 'd', typstorage => 'x' },
+{ oid => '4142',
+ descr => 'boolean wrapper, \'true\'/\'false\'',
+ typname => 'ignorenulls', typlen => '1', typbyval => 't', typtype => 'b',
+ typcategory => 'B', typinput => 'boolin', typoutput => 'boolout',
+ typreceive => 'boolrecv', typsend => 'boolsend', typalign => 'c' },
]
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index da0706add5..0899bddd6e 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -532,6 +532,11 @@ typedef struct WindowDef
(FRAMEOPTION_RANGE | FRAMEOPTION_START_UNBOUNDED_PRECEDING | \
FRAMEOPTION_END_CURRENT_ROW)
+/*
+ * Null Treatment option
+ */
+#define WINFUNC_OPT_IGNORE_NULLS 0x00001 /* IGNORE NULLS */
+
/*
* RangeSubselect - subquery appearing in a FROM clause
*/
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index b1184c2d15..a33ce1d0fd 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -164,6 +164,7 @@ PG_KEYWORD("family", FAMILY, UNRESERVED_KEYWORD)
PG_KEYWORD("fetch", FETCH, RESERVED_KEYWORD)
PG_KEYWORD("filter", FILTER, UNRESERVED_KEYWORD)
PG_KEYWORD("first", FIRST_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("first_value", FIRST_VALUE, UNRESERVED_KEYWORD)
PG_KEYWORD("float", FLOAT_P, COL_NAME_KEYWORD)
PG_KEYWORD("following", FOLLOWING, UNRESERVED_KEYWORD)
PG_KEYWORD("for", FOR, RESERVED_KEYWORD)
@@ -190,6 +191,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -223,10 +225,13 @@ PG_KEYWORD("isolation", ISOLATION, UNRESERVED_KEYWORD)
PG_KEYWORD("join", JOIN, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("key", KEY, UNRESERVED_KEYWORD)
PG_KEYWORD("label", LABEL, UNRESERVED_KEYWORD)
+PG_KEYWORD("lag", LAG, UNRESERVED_KEYWORD)
PG_KEYWORD("language", LANGUAGE, UNRESERVED_KEYWORD)
PG_KEYWORD("large", LARGE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("last", LAST_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("last_value", LAST_VALUE, UNRESERVED_KEYWORD)
PG_KEYWORD("lateral", LATERAL_P, RESERVED_KEYWORD)
+PG_KEYWORD("lead", LEAD, UNRESERVED_KEYWORD)
PG_KEYWORD("leading", LEADING, RESERVED_KEYWORD)
PG_KEYWORD("leakproof", LEAKPROOF, UNRESERVED_KEYWORD)
PG_KEYWORD("least", LEAST, COL_NAME_KEYWORD)
@@ -267,6 +272,7 @@ PG_KEYWORD("nothing", NOTHING, UNRESERVED_KEYWORD)
PG_KEYWORD("notify", NOTIFY, UNRESERVED_KEYWORD)
PG_KEYWORD("notnull", NOTNULL, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("nowait", NOWAIT, UNRESERVED_KEYWORD)
+PG_KEYWORD("nth_value", NTH_VALUE, UNRESERVED_KEYWORD)
PG_KEYWORD("null", NULL_P, RESERVED_KEYWORD)
PG_KEYWORD("nullif", NULLIF, COL_NAME_KEYWORD)
PG_KEYWORD("nulls", NULLS_P, UNRESERVED_KEYWORD)
@@ -336,6 +342,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/test/regress/expected/type_sanity.out b/src/test/regress/expected/type_sanity.out
index cd7fc03b04..0ac49263ec 100644
--- a/src/test/regress/expected/type_sanity.out
+++ b/src/test/regress/expected/type_sanity.out
@@ -73,7 +73,8 @@ WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%'
3361 | pg_ndistinct
3402 | pg_dependencies
5017 | pg_mcv_list
-(4 rows)
+ 4142 | ignorenulls
+(5 rows)
-- Make sure typarray points to a varlena array type of our own base
SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype,
@@ -166,10 +167,11 @@ WHERE p1.typinput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
(p1.typelem != 0 AND p1.typlen < 0) AND NOT
(p2.prorettype = p1.oid AND NOT p2.proretset)
ORDER BY 1;
- oid | typname | oid | proname
-------+-----------+-----+---------
- 1790 | refcursor | 46 | textin
-(1 row)
+ oid | typname | oid | proname
+------+-------------+------+---------
+ 1790 | refcursor | 46 | textin
+ 4142 | ignorenulls | 1242 | boolin
+(2 rows)
-- Varlena array types will point to array_in
-- Exception as of 8.1: int2vector and oidvector have their own I/O routines
@@ -217,10 +219,11 @@ WHERE p1.typoutput = p2.oid AND p1.typtype in ('b', 'p') AND NOT
(p2.oid = 'array_out'::regproc AND
p1.typelem != 0 AND p1.typlen = -1)))
ORDER BY 1;
- oid | typname | oid | proname
-------+-----------+-----+---------
- 1790 | refcursor | 47 | textout
-(1 row)
+ oid | typname | oid | proname
+------+-------------+------+---------
+ 1790 | refcursor | 47 | textout
+ 4142 | ignorenulls | 1243 | boolout
+(2 rows)
SELECT p1.oid, p1.typname, p2.oid, p2.proname
FROM pg_type AS p1, pg_proc AS p2
@@ -280,10 +283,11 @@ WHERE p1.typreceive = p2.oid AND p1.typtype in ('b', 'p') AND NOT
(p1.typelem != 0 AND p1.typlen < 0) AND NOT
(p2.prorettype = p1.oid AND NOT p2.proretset)
ORDER BY 1;
- oid | typname | oid | proname
-------+-----------+------+----------
- 1790 | refcursor | 2414 | textrecv
-(1 row)
+ oid | typname | oid | proname
+------+-------------+------+----------
+ 1790 | refcursor | 2414 | textrecv
+ 4142 | ignorenulls | 2436 | boolrecv
+(2 rows)
-- Varlena array types will point to array_recv
-- Exception as of 8.1: int2vector and oidvector have their own I/O routines
@@ -340,10 +344,11 @@ WHERE p1.typsend = p2.oid AND p1.typtype in ('b', 'p') AND NOT
(p2.oid = 'array_send'::regproc AND
p1.typelem != 0 AND p1.typlen = -1)))
ORDER BY 1;
- oid | typname | oid | proname
-------+-----------+------+----------
- 1790 | refcursor | 2415 | textsend
-(1 row)
+ oid | typname | oid | proname
+------+-------------+------+----------
+ 1790 | refcursor | 2415 | textsend
+ 4142 | ignorenulls | 2437 | boolsend
+(2 rows)
SELECT p1.oid, p1.typname, p2.oid, p2.proname
FROM pg_type AS p1, pg_proc AS p2
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index d5fd4045f9..54ded65906 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -2985,7 +2985,7 @@ LINE 1: SELECT generate_series(1, 100) OVER () FROM empsalary;
SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
-ERROR: argument of nth_value must be greater than zero
+ERROR: argument of nth_value must be greater or less than zero
-- filter
SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
sum(salary) FILTER (WHERE enroll_date > '2007-01-01')
@@ -3863,3 +3863,369 @@ SELECT * FROM pg_temp.f(2);
{5}
(5 rows)
+-- RESPECT NULLS and IGNORE NULLS tests
+CREATE TEMPORARY TABLE planets (
+ name text,
+ orbit int
+);
+INSERT INTO planets VALUES
+ ('mercury', 88),
+ ('venus', 224),
+ ('earth', NULL),
+ ('mars', NULL),
+ ('jupiter', 4332),
+ ('saturn', 24491),
+ ('uranus', NULL),
+ ('neptune', 60182),
+ ('pluto', 90560);
+ -- test view definitions are preserved
+CREATE TEMP VIEW v_planets AS
+ SELECT
+ name,
+ sum(orbit) OVER (order by orbit) as sum_rows,
+ lag(orbit, 1) RESPECT NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(orbit, 2) IGNORE NULLS OVER w AS lagged_by_2,
+ first_value(orbit) IGNORE NULLS OVER w AS first_value_ignore,
+ nth_value(orbit,2) IGNORE NULLS OVER w AS nth_first_ignore,
+ nth_value(orbit,-2) IGNORE NULLS OVER w AS nth_last_ignore
+ FROM planets
+ WINDOW w as (ORDER BY name ASC);
+SELECT pg_get_viewdef('v_planets');
+ pg_get_viewdef
+------------------------------------------------------------------------------------
+ SELECT planets.name, +
+ sum(planets.orbit) OVER (ORDER BY planets.orbit) AS sum_rows, +
+ lag(planets.orbit, 1) OVER (ORDER BY planets.name DESC) AS lagged_by_1, +
+ lag(planets.orbit, 2) IGNORE NULLS OVER w AS lagged_by_2, +
+ first_value(planets.orbit) IGNORE NULLS OVER w AS first_value_ignore, +
+ nth_value(planets.orbit, 2) IGNORE NULLS OVER w AS nth_first_ignore, +
+ nth_value(planets.orbit, '-2'::integer) IGNORE NULLS OVER w AS nth_last_ignore+
+ FROM planets +
+ WINDOW w AS (ORDER BY planets.name);
+(1 row)
+
+SELECT name, lag(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lag
+---------+-------
+ earth |
+ jupiter |
+ mars | 4332
+ mercury |
+ neptune | 88
+ pluto | 60182
+ saturn | 90560
+ uranus | 24491
+ venus |
+(9 rows)
+
+SELECT name, lag(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lag
+---------+-------
+ earth |
+ jupiter |
+ mars | 4332
+ mercury |
+ neptune | 88
+ pluto | 60182
+ saturn | 90560
+ uranus | 24491
+ venus |
+(9 rows)
+
+SELECT name, lag(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lag
+---------+-------
+ earth |
+ jupiter |
+ mars | 4332
+ mercury | 4332
+ neptune | 88
+ pluto | 60182
+ saturn | 90560
+ uranus | 24491
+ venus | 24491
+(9 rows)
+
+SELECT name, lead(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lead
+---------+-------
+ earth | 4332
+ jupiter |
+ mars | 88
+ mercury | 60182
+ neptune | 90560
+ pluto | 24491
+ saturn |
+ uranus | 224
+ venus |
+(9 rows)
+
+SELECT name, lead(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lead
+---------+-------
+ earth | 4332
+ jupiter |
+ mars | 88
+ mercury | 60182
+ neptune | 90560
+ pluto | 24491
+ saturn |
+ uranus | 224
+ venus |
+(9 rows)
+
+SELECT name, lead(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lead
+---------+-------
+ earth | 4332
+ jupiter | 88
+ mars | 88
+ mercury | 60182
+ neptune | 90560
+ pluto | 24491
+ saturn | 224
+ uranus | 224
+ venus |
+(9 rows)
+
+SELECT name, lag(orbit, -1) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lag
+---------+-------
+ earth | 4332
+ jupiter | 88
+ mars | 88
+ mercury | 60182
+ neptune | 90560
+ pluto | 24491
+ saturn | 224
+ uranus | 224
+ venus |
+(9 rows)
+
+SELECT name, lead(orbit, -1) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | lead
+---------+-------
+ earth |
+ jupiter |
+ mars | 4332
+ mercury | 4332
+ neptune | 88
+ pluto | 60182
+ saturn | 90560
+ uranus | 24491
+ venus | 24491
+(9 rows)
+
+SELECT name, first_value(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | first_value
+---------+-------------
+ earth |
+ jupiter |
+ mars |
+ mercury |
+ neptune |
+ pluto |
+ saturn |
+ uranus |
+ venus |
+(9 rows)
+
+SELECT name, first_value(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | first_value
+---------+-------------
+ earth |
+ jupiter |
+ mars |
+ mercury |
+ neptune |
+ pluto |
+ saturn |
+ uranus |
+ venus |
+(9 rows)
+
+SELECT name, first_value(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | first_value
+---------+-------------
+ earth | 4332
+ jupiter | 4332
+ mars | 4332
+ mercury | 4332
+ neptune | 4332
+ pluto | 4332
+ saturn | 4332
+ uranus | 4332
+ venus | 4332
+(9 rows)
+
+SELECT name, last_value(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | last_value
+---------+------------
+ earth | 224
+ jupiter | 224
+ mars | 224
+ mercury | 224
+ neptune | 224
+ pluto | 224
+ saturn | 224
+ uranus | 224
+ venus | 224
+(9 rows)
+
+SELECT name, last_value(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | last_value
+---------+------------
+ earth | 224
+ jupiter | 224
+ mars | 224
+ mercury | 224
+ neptune | 224
+ pluto | 224
+ saturn | 224
+ uranus | 224
+ venus | 224
+(9 rows)
+
+SELECT name, last_value(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | last_value
+---------+------------
+ earth | 224
+ jupiter | 224
+ mars | 224
+ mercury | 224
+ neptune | 224
+ pluto | 224
+ saturn | 224
+ uranus | 224
+ venus | 224
+(9 rows)
+
+SELECT name, nth_value(orbit, 2) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth | 4332
+ jupiter | 4332
+ mars | 4332
+ mercury | 4332
+ neptune | 4332
+ pluto | 4332
+ saturn | 4332
+ uranus | 4332
+ venus | 4332
+(9 rows)
+
+SELECT name, nth_value(orbit, 2) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth | 88
+ jupiter | 88
+ mars | 88
+ mercury | 88
+ neptune | 88
+ pluto | 88
+ saturn | 88
+ uranus | 88
+ venus | 88
+(9 rows)
+
+SELECT name, nth_value(orbit, 2) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth | 4332
+ jupiter | 4332
+ mars | 4332
+ mercury | 4332
+ neptune | 4332
+ pluto | 4332
+ saturn | 4332
+ uranus | 4332
+ venus | 4332
+(9 rows)
+
+SELECT name, nth_value(orbit, 2) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth | 4332
+ jupiter | 4332
+ mars | 4332
+ mercury | 4332
+ neptune | 4332
+ pluto | 4332
+ saturn | 4332
+ uranus | 4332
+ venus | 4332
+(9 rows)
+
+SELECT name, nth_value(orbit, 2) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth | 88
+ jupiter | 88
+ mars | 88
+ mercury | 88
+ neptune | 88
+ pluto | 88
+ saturn | 88
+ uranus | 88
+ venus | 88
+(9 rows)
+
+SELECT name, nth_value(orbit, 2) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth | 4332
+ jupiter | 4332
+ mars | 4332
+ mercury | 4332
+ neptune | 4332
+ pluto | 4332
+ saturn | 4332
+ uranus | 4332
+ venus | 4332
+(9 rows)
+
+SELECT name, nth_value(orbit, -2) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth |
+ jupiter |
+ mars |
+ mercury |
+ neptune |
+ pluto |
+ saturn |
+ uranus |
+ venus |
+(9 rows)
+
+SELECT name, nth_value(orbit, -2) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth | 24491
+ jupiter | 24491
+ mars | 24491
+ mercury | 24491
+ neptune | 24491
+ pluto | 24491
+ saturn | 24491
+ uranus | 24491
+ venus | 24491
+(9 rows)
+
+SELECT name, nth_value(orbit, -2) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+ name | nth_value
+---------+-----------
+ earth |
+ jupiter |
+ mars |
+ mercury |
+ neptune |
+ pluto |
+ saturn |
+ uranus |
+ venus |
+(9 rows)
+
+--cleanup
+DROP TABLE planets CASCADE;
+NOTICE: drop cascades to view v_planets
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index fe273aa31e..548902d482 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -1276,3 +1276,67 @@ $$ LANGUAGE SQL STABLE;
EXPLAIN (costs off) SELECT * FROM pg_temp.f(2);
SELECT * FROM pg_temp.f(2);
+
+-- RESPECT NULLS and IGNORE NULLS tests
+CREATE TEMPORARY TABLE planets (
+ name text,
+ orbit int
+);
+
+INSERT INTO planets VALUES
+ ('mercury', 88),
+ ('venus', 224),
+ ('earth', NULL),
+ ('mars', NULL),
+ ('jupiter', 4332),
+ ('saturn', 24491),
+ ('uranus', NULL),
+ ('neptune', 60182),
+ ('pluto', 90560);
+
+ -- test view definitions are preserved
+CREATE TEMP VIEW v_planets AS
+ SELECT
+ name,
+ sum(orbit) OVER (order by orbit) as sum_rows,
+ lag(orbit, 1) RESPECT NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(orbit, 2) IGNORE NULLS OVER w AS lagged_by_2,
+ first_value(orbit) IGNORE NULLS OVER w AS first_value_ignore,
+ nth_value(orbit,2) IGNORE NULLS OVER w AS nth_first_ignore,
+ nth_value(orbit,-2) IGNORE NULLS OVER w AS nth_last_ignore
+ FROM planets
+ WINDOW w as (ORDER BY name ASC);
+SELECT pg_get_viewdef('v_planets');
+
+SELECT name, lag(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, lag(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, lag(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+
+SELECT name, lead(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, lead(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, lead(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+
+SELECT name, lag(orbit, -1) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, lead(orbit, -1) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+
+SELECT name, first_value(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, first_value(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, first_value(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+
+SELECT name, last_value(orbit) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, last_value(orbit) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, last_value(orbit) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+
+SELECT name, nth_value(orbit, 2) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, nth_value(orbit, 2) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, nth_value(orbit, 2) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+
+SELECT name, nth_value(orbit, 2) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, nth_value(orbit, 2) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, nth_value(orbit, 2) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, nth_value(orbit, -2) OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, nth_value(orbit, -2) IGNORE NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+SELECT name, nth_value(orbit, -2) RESPECT NULLS OVER (ORDER BY name range between unbounded preceding and unbounded following) FROM planets ORDER BY name;
+
+--cleanup
+DROP TABLE planets CASCADE;
On Thu, Mar 5, 2020 at 4:17 AM Krasiyan Andreev <krasiyan@gmail.com> wrote:
I have currently suspended development of this patch, based on it's
review,
but I will continue development of the other Oliver Ford's work about
adding support of respect/ignore nulls
for lag(),lead(),first_value(),last_value() and nth_value() and from
first/last for nth_value() patch,
but I am not sure how to proceed with it's implementation and any feedback
will be very helpful.
* I applied your patch on top of 58c47ccfff20b8c125903 . It applied cleanly
, compiled, make check pass, but it have white space errors:
*Added functions on windowfuncs.c have no comments so it's not easily
understandable.
* Regression test addition seems huge to me. Can you reduce that? You can
use existing tables and fewer records.
* I don’t understand why this patch has to change makeBoolAConst? It
already make “bool” constant node
regards
Surafel
Thank you very much.
I think that Vik Fearing's patch about "Implement <null treatment> for
window functions" is much clear, better and has a chance to be committed.
For me it's not important which patch will go into PostgreSQL, because it's
a much needed feature.
In mine patch, there is also a feature about using negative indexes, to be
able to reverse order in exact same window frame for "FROM FIRST/FROM LAST",
but I am not sure, is such non-standard usage is acceptable (it's the same
as some array functions in programming language), if it's acceptable, it
can be easy ported to Vik's patch.
I am thinking also to concentrate on Vik's patch, if it has a clear design
point of view, clear design, I can withdraw mine patch.
На ср, 16.09.2020 г. в 11:19 Surafel Temesgen <surafel3000@gmail.com>
написа:
Show quoted text
On Thu, Mar 5, 2020 at 4:17 AM Krasiyan Andreev <krasiyan@gmail.com>
wrote:I have currently suspended development of this patch, based on it's
review,
but I will continue development of the other Oliver Ford's work about
adding support of respect/ignore nulls
for lag(),lead(),first_value(),last_value() and nth_value() and from
first/last for nth_value() patch,
but I am not sure how to proceed with it's implementation and any
feedback will be very helpful.* I applied your patch on top of 58c47ccfff20b8c125903 . It applied
cleanly , compiled, make check pass, but it have white space errors:*Added functions on windowfuncs.c have no comments so it's not easily
understandable.* Regression test addition seems huge to me. Can you reduce that? You can
use existing tables and fewer records.* I don’t understand why this patch has to change makeBoolAConst? It
already make “bool” constant noderegards
Surafel