Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead,
lag, [...]. This is not implemented in PostgreSQL
(http://www.postgresql.org/docs/devel/static/functions-window.html)
I've had a go at implementing this, and I've attached the resulting patch.
It's not finished yet, but I was hoping to find out if my solution is along
the right lines.
In particular, I'm storing the ignore-nulls flag in the frameOptions of a
window function definition, and am adding a function to the windowapi.h to
get at these options. I'm keeping the last non-null value in
WinGetPartitionLocalMemory (which I hope is the right place), but I'm not
using any of the *GetDatum macros to access it.
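As a rough illustration of the flag handling described above, here is a minimal, self-contained sketch. Only the FRAMEOPTION_IGNORE_NULLS value and the role of WinGetFrameOptions come from the patch; the struct and the sketch_* function names are hypothetical stand-ins, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Bit value taken from the patch; everything else here is hypothetical. */
#define FRAMEOPTION_IGNORE_NULLS 0x04000

/* Hypothetical stand-in for the state that carries frameOptions. */
typedef struct SketchWindowState
{
    int         frameOptions;
} SketchWindowState;

/* What the grammar action does when it sees IGNORE NULLS: set one bit. */
static void
sketch_mark_ignore_nulls(SketchWindowState *state)
{
    assert(state != NULL);
    state->frameOptions |= FRAMEOPTION_IGNORE_NULLS;
}

/* Plays the role of WinGetFrameOptions(): expose the flags read-only. */
static int
sketch_get_frame_options(const SketchWindowState *state)
{
    assert(state != NULL);
    return state->frameOptions;
}

/* How a window function could test the flag via the accessor. */
static bool
sketch_ignores_nulls(const SketchWindowState *state)
{
    return (sketch_get_frame_options(state) & FRAMEOPTION_IGNORE_NULLS) != 0;
}
```

Because the new bit sits above the existing frame option bits, setting it leaves the other frame options untouched.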
An example of my change's behaviour:
nwhite=# select *, lag(num,0) ignore nulls over (order by generate_series) from
nwhite-# (select generate_series from generate_series(0,10)) s
nwhite-# left outer join
nwhite-# numbers n
nwhite-# on (s.generate_series = n.num);
 generate_series | num | lag
-----------------+-----+-----
               0 |     |
               1 |   1 |   1
               2 |     |   1
               3 |     |   1
               4 |   4 |   4
               5 |   5 |   5
               6 |     |   5
               7 |     |   5
               8 |     |   5
               9 |   9 |   9
              10 |     |   9
(11 rows)
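The carry-forward behaviour in the output above can be sketched in plain C as follows. This is an illustration under stated assumptions, not the patch's code: NullableInt, LastNonNull and both functions are hypothetical names. One deliberate difference is the explicit "valid" flag. The patch tests `result == 0` to detect an empty stash, which as far as I can tell would also match a legitimately stored zero-valued Datum such as the integer 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nullable value: the flag, not the payload, marks null. */
typedef struct NullableInt
{
    bool        isnull;
    int         value;
} NullableInt;

/*
 * Hypothetical per-partition stash.  A separate "valid" flag means a
 * stored value of 0 is not mistaken for "nothing seen yet".
 */
typedef struct LastNonNull
{
    bool        valid;
    int         value;
} LastNonNull;

/* Return the current value, or the most recent non-null one if it is null. */
static NullableInt
carry_forward(LastNonNull *stash, NullableInt current)
{
    assert(stash != NULL);

    if (!current.isnull)
    {
        /* remember this value for later null rows */
        stash->valid = true;
        stash->value = current.value;
        return current;
    }

    if (stash->valid)
    {
        NullableInt result = {false, stash->value};

        return result;
    }

    /* nothing non-null seen yet in this partition */
    return current;
}

/* Apply carry_forward() to n rows in order, writing results into out[]. */
static void
carry_forward_all(const NullableInt *in, NullableInt *out, size_t n)
{
    LastNonNull stash = {false, 0};
    size_t      i;

    for (i = 0; i < n; i++)
        out[i] = carry_forward(&stash, in[i]);
}
```

Feeding the num column from the example through carry_forward_all() yields the lag column shown, with the first row staying null because no non-null value has been seen yet.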
I'd find this feature really useful, so I hope you can help me get my patch
to a contributable state.
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 896c08c..75cb36e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12003,6 +12003,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12017,7 +12018,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12030,6 +12034,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12044,7 +12049,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12138,11 +12145,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 3bc42ba..548c506 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1996,6 +1996,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9d07f30..c6c2584 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -496,6 +496,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
opt_frame_clause frame_extent frame_bound
+ over_specification
%type <str> opt_existing_window_name
%type <boolean> opt_if_not_exists
@@ -551,7 +552,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -581,7 +582,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11785,7 +11786,8 @@ window_definition:
}
;
-over_clause: OVER window_specification
+over_specification:
+ OVER window_specification
{ $$ = $2; }
| OVER ColId
{
@@ -11800,9 +11802,18 @@ over_clause: OVER window_specification
n->location = @2;
$$ = n;
}
- | /*EMPTY*/
- { $$ = NULL; }
- ;
+ ;
+
+over_clause: RESPECT NULLS_P over_specification { $$ = $3; }
+ | IGNORE NULLS_P over_specification
+ {
+ if($3)
+ $3->frameOptions |= FRAMEOPTION_IGNORE_NULLS;
+ $$ = $3;
+ }
+ | over_specification
+ | /*EMPTY*/ { $$ = NULL; }
+ ;
window_specification: '(' opt_existing_window_name opt_partition_clause
opt_sort_clause opt_frame_clause ')'
@@ -13007,6 +13018,7 @@ type_func_name_keyword:
| CURRENT_SCHEMA
| FREEZE
| FULL
+ | IGNORE
| ILIKE
| INNER_P
| IS
@@ -13019,6 +13031,7 @@ type_func_name_keyword:
| OUTER_P
| OVER
| OVERLAPS
+ | RESPECT
| RIGHT
| SIMILAR
| VERBOSE
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index 2f171ac..3144fd7 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -292,6 +292,7 @@ leadlag_common(FunctionCallInfo fcinfo,
Datum result;
bool isnull;
bool isout;
+ bool ignore_nulls;
if (withoffset)
{
@@ -322,8 +323,29 @@ leadlag_common(FunctionCallInfo fcinfo,
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
+ if(ignore_nulls)
+ {
+ /*
+ * We'll keep the last non-null value we've seen in our per-partition chunk
+ * of memory, so it gets cleaned up for us.
+ */
+ Datum* stash = (Datum*) WinGetPartitionLocalMemory(winobj, sizeof(Datum));
+ if(isnull)
+ {
+ result = *stash;
+ isnull = result == 0;
+ }
+ else
+ {
+ *stash = result;
+ }
+ }
+
if (isnull)
+ {
PG_RETURN_NULL();
+ }
PG_RETURN_DATUM(result);
}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2229ef0..a13c58b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000 /* lead/lag/nth */
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..99f96a5 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..bcc9140 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,20 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null),
+('sales', 1, 5000, '2006-10-01', null),
+('personnel', 5, 3500, '2007-12-10', null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22'),
+('personnel', 2, 3900, '2006-12-23', null),
+('develop', 7, 4200, '2008-01-01', null),
+('develop', 9, 4500, '2008-01-01', null),
+('sales', 3, 4800, '2007-08-01', '2009-03-05'),
+('develop', 8, 6000, '2006-10-01', '2009-11-17'),
+('develop', 11, 5200, '2007-08-15', null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1021,96 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour
+SELECT lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+(10 rows)
+
+SELECT lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+(10 rows)
+
+SELECT lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+ 09-22-2010
+ 09-22-2010
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+(10 rows)
+
+SELECT lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+
+
+(10 rows)
+
+SELECT lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+
+
+(10 rows)
+
+SELECT lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+ 09-22-2010
+ 09-22-2010
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..cc9b583 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null),
+('sales', 1, 5000, '2006-10-01', null),
+('personnel', 5, 3500, '2007-12-10', null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22'),
+('personnel', 2, 3900, '2006-12-23', null),
+('develop', 7, 4200, '2008-01-01', null),
+('develop', 9, 4500, '2008-01-01', null),
+('sales', 3, 4800, '2007-08-01', '2009-03-05'),
+('develop', 8, 6000, '2006-10-01', '2009-11-17'),
+('develop', 11, 5200, '2007-08-15', null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +265,18 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour
+SELECT lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
-- cleanup
DROP TABLE empsalary;
Nicholas White <n.j.white@gmail.com> writes:
> The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead,
> lag, [...]. This is not implemented in PostgreSQL
> (http://www.postgresql.org/docs/devel/static/functions-window.html)
> I've had a go at implementing this, and I've attached the resulting patch.
> It's not finished yet, but I was hoping to find out if my solution is along
> the right lines.
Since we're trying to get 9.3 to closure, this patch probably isn't
going to get much attention until the 9.4 development cycle starts
(in a couple of months, likely). In the meantime, please add it to
the next commitfest list so we remember to come back to it:
https://commitfest.postgresql.org/action/commitfest_view?id=18
One comment just from a quick eyeball look is that we really hate
adding new keywords that aren't UNRESERVED, because that risks
breaking existing applications. Please see if you can refactor the
grammar to make those new entries unreserved.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Thanks - I've added it here:
https://commitfest.postgresql.org/action/patch_view?id=1096
I've also attached a revised version that makes IGNORE and RESPECT
UNRESERVED keywords (following the pattern of NULLS_FIRST and NULLS_LAST).
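The NULLS_FIRST / NULLS_LAST pattern referred to here is a one-token lookahead in the lexer filter: when IGNORE or RESPECT is seen, peek at the next token and merge the pair into a single token only if it is NULLS. Below is a minimal sketch of that idea over a toy array-backed token stream. The Token enum, TokenFilter struct and both functions are hypothetical and are not PostgreSQL's base_yylex. Note the fallback branch that saves the peeked token: the ecpg parser.c hunk in the attached patch has an equivalent default case, while the backend parser.c hunk as posted does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token codes; the real parser uses Bison-generated values. */
typedef enum Token
{
    TOK_EOF,
    TOK_IDENT,
    TOK_OVER,
    TOK_NULLS,
    TOK_IGNORE,
    TOK_RESPECT,
    TOK_IGNORE_NULLS,           /* combined token produced by the filter */
    TOK_RESPECT_NULLS           /* combined token produced by the filter */
} Token;

/* Hypothetical filter state: raw input plus one saved lookahead token. */
typedef struct TokenFilter
{
    const Token *input;
    size_t      len;
    size_t      pos;
    bool        have_lookahead;
    Token       lookahead;
} TokenFilter;

/* Read the next raw token, or TOK_EOF once the input is exhausted. */
static Token
raw_next(TokenFilter *f)
{
    return (f->pos < f->len) ? f->input[f->pos++] : TOK_EOF;
}

/* Return the next filtered token, merging IGNORE/RESPECT with NULLS. */
static Token
filtered_next(TokenFilter *f)
{
    Token       cur;
    Token       next;

    assert(f != NULL);

    /* Use the saved lookahead token first, if there is one. */
    if (f->have_lookahead)
    {
        cur = f->lookahead;
        f->have_lookahead = false;
    }
    else
        cur = raw_next(f);

    if (cur != TOK_IGNORE && cur != TOK_RESPECT)
        return cur;

    next = raw_next(f);
    if (next == TOK_NULLS)
        return (cur == TOK_IGNORE) ? TOK_IGNORE_NULLS : TOK_RESPECT_NULLS;

    /* Not followed by NULLS: save the peeked token so it is not lost. */
    f->lookahead = next;
    f->have_lookahead = true;
    return cur;
}
```

Without the save-and-restore branch, whatever token follows a lone IGNORE or RESPECT would be consumed and never reach the grammar.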
Nick
On 23 March 2013 14:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Nicholas White <n.j.white@gmail.com> writes:
> > The SQL standard defines a RESPECT NULLS or IGNORE NULLS option for lead,
> > lag, [...]. This is not implemented in PostgreSQL
> > (http://www.postgresql.org/docs/devel/static/functions-window.html)
> > I've had a go at implementing this, and I've attached the resulting patch.
> > It's not finished yet, but I was hoping to find out if my solution is along
> > the right lines.
>
> Since we're trying to get 9.3 to closure, this patch probably isn't
> going to get much attention until the 9.4 development cycle starts
> (in a couple of months, likely). In the meantime, please add it to
> the next commitfest list so we remember to come back to it:
> https://commitfest.postgresql.org/action/commitfest_view?id=18
>
> One comment just from a quick eyeball look is that we really hate
> adding new keywords that aren't UNRESERVED, because that risks
> breaking existing applications. Please see if you can refactor the
> grammar to make those new entries unreserved.
>
> regards, tom lane
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 896c08c..75cb36e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12003,6 +12003,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12017,7 +12018,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12030,6 +12034,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12044,7 +12049,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12138,11 +12145,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 3bc42ba..548c506 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1996,6 +1996,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9d07f30..6dda644 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -496,6 +496,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
opt_frame_clause frame_extent frame_bound
+ over_specification
%type <str> opt_existing_window_name
%type <boolean> opt_if_not_exists
@@ -551,7 +552,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -581,7 +582,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -615,6 +616,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
* creates these tokens when required.
*/
%token NULLS_FIRST NULLS_LAST WITH_TIME
+%token RESPECT_NULLS IGNORE_NULLS
/* Precedence: lowest to highest */
@@ -11785,7 +11787,8 @@ window_definition:
}
;
-over_clause: OVER window_specification
+over_specification:
+ OVER window_specification
{ $$ = $2; }
| OVER ColId
{
@@ -11800,6 +11803,18 @@ over_clause: OVER window_specification
n->location = @2;
$$ = n;
}
+ ;
+
+over_clause: over_specification
+ { $$ = $1; }
+ | RESPECT_NULLS over_specification
+ { $$ = $2; }
+ | IGNORE_NULLS over_specification
+ {
+ if($2)
+ $2->frameOptions |= FRAMEOPTION_IGNORE_NULLS;
+ $$ = $2;
+ }
| /*EMPTY*/
{ $$ = NULL; }
;
@@ -12765,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12852,6 +12868,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index b8ec790..25d09e0 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -156,6 +156,33 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
}
break;
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
+ break;
+ }
+ break;
+
default:
break;
}
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index 2f171ac..3144fd7 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -292,6 +292,7 @@ leadlag_common(FunctionCallInfo fcinfo,
Datum result;
bool isnull;
bool isout;
+ bool ignore_nulls;
if (withoffset)
{
@@ -322,8 +323,29 @@ leadlag_common(FunctionCallInfo fcinfo,
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
+ if(ignore_nulls)
+ {
+ /*
+ * We'll keep the last non-null value we've seen in our per-partition chunk
+ * of memory, so it gets cleaned up for us.
+ */
+ Datum* stash = (Datum*) WinGetPartitionLocalMemory(winobj, sizeof(Datum));
+ if(isnull)
+ {
+ result = *stash;
+ isnull = result == 0;
+ }
+ else
+ {
+ *stash = result;
+ }
+ }
+
if (isnull)
+ {
PG_RETURN_NULL();
+ }
PG_RETURN_DATUM(result);
}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2229ef0..a13c58b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000 /* lead/lag/nth */
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..2acf073 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index f4b51d6..fe5dcb3 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -45,6 +45,8 @@ my %replace_string = (
'WITH_TIME' => 'with time',
'NULLS_FIRST' => 'nulls first',
'NULLS_LAST' => 'nulls last',
+ 'RESPECT_NULLS' => 'respect nulls',
+ 'IGNORE_NULLS' => 'ignore nulls',
'TYPECAST' => '::',
'DOT_DOT' => '..',
'COLON_EQUALS' => ':=',);
diff --git a/src/interfaces/ecpg/preproc/parser.c b/src/interfaces/ecpg/preproc/parser.c
index 2ce9dd9..53f4167 100644
--- a/src/interfaces/ecpg/preproc/parser.c
+++ b/src/interfaces/ecpg/preproc/parser.c
@@ -121,6 +121,53 @@ filtered_base_yylex(void)
}
break;
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ default:
+ /* save the lookahead token for next time */
+ lookahead_token = next_token;
+ lookahead_yylval = base_yylval;
+ lookahead_yylloc = base_yylloc;
+ have_lookahead = true;
+ /* and back up the output info to cur_token */
+ base_yylval = cur_yylval;
+ base_yylloc = cur_yylloc;
+ break;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
+ break;
+ default:
+ /* save the lookahead token for next time */
+ lookahead_token = next_token;
+ lookahead_yylval = base_yylval;
+ lookahead_yylloc = base_yylloc;
+ have_lookahead = true;
+ /* and back up the output info to cur_token */
+ base_yylval = cur_yylval;
+ base_yylloc = cur_yylloc;
+ break;
+ }
+ break;
+
default:
break;
}
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..bcc9140 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,20 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null),
+('sales', 1, 5000, '2006-10-01', null),
+('personnel', 5, 3500, '2007-12-10', null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22'),
+('personnel', 2, 3900, '2006-12-23', null),
+('develop', 7, 4200, '2008-01-01', null),
+('develop', 9, 4500, '2008-01-01', null),
+('sales', 3, 4800, '2007-08-01', '2009-03-05'),
+('develop', 8, 6000, '2006-10-01', '2009-11-17'),
+('develop', 11, 5200, '2007-08-15', null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1021,96 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour
+SELECT lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+(10 rows)
+
+SELECT lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+(10 rows)
+
+SELECT lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+ 09-22-2010
+ 09-22-2010
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+(10 rows)
+
+SELECT lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+
+
+(10 rows)
+
+SELECT lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+
+
+(10 rows)
+
+SELECT lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+ 09-22-2010
+ 09-22-2010
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..cc9b583 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null),
+('sales', 1, 5000, '2006-10-01', null),
+('personnel', 5, 3500, '2007-12-10', null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22'),
+('personnel', 2, 3900, '2006-12-23', null),
+('develop', 7, 4200, '2008-01-01', null),
+('develop', 9, 4500, '2008-01-01', null),
+('sales', 3, 4800, '2007-08-01', '2009-03-05'),
+('develop', 8, 6000, '2006-10-01', '2009-11-17'),
+('develop', 11, 5200, '2007-08-15', null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +265,18 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour
+SELECT lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
-- cleanup
DROP TABLE empsalary;
On Sat, Mar 23, 2013 at 3:25 PM, Nicholas White <n.j.white@gmail.com> wrote:
Thanks - I've added it here:
https://commitfest.postgresql.org/action/patch_view?id=1096 .
I've also attached a revised version that makes IGNORE and RESPECT
UNRESERVED keywords (following the pattern of NULLS_FIRST and NULLS_LAST).
Hm, you made another lookahead in base_yylex to make them unreserved --
looks OK, but I'm not sure whether there was another way to do it.
You might want to try byref types such as text. It seems you need to copy
the datum to save the value in the appropriate memory context. Also, try to
create a view on those expressions. I don't think the view definition
correctly preserves the clause.
Thanks,
--
Hitoshi Harada
Thanks for the feedback.
For the parsing changes, it seems I can either make RESPECT and IGNORE
reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
IGNORE were just normal unreserved keywords due to ambiguities after a
function call (e.g. select abs(1) respect; - which is currently a
valid statement).
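The lookahead approach described above can be sketched as a small filter between the raw lexer and the grammar. This is only a toy model under stated assumptions: the Token enum, the Filter struct and filter_next are hypothetical names for illustration, not the real base_yylex code or PostgreSQL's token numbers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch (not PostgreSQL's real tokens). */
typedef enum {
    T_EOF, T_IDENT, T_RESPECT, T_IGNORE, T_NULLS, T_OVER, T_LAST,
    T_RESPECT_NULLS, T_IGNORE_NULLS
} Token;

/* Toy filter state: a raw token array plus one pushed-back token. */
typedef struct {
    const Token *raw;      /* raw tokens, terminated by T_EOF */
    size_t pos;            /* index of the next raw token */
    int have_lookahead;    /* is a pushed-back token waiting? */
    Token lookahead;       /* the pushed-back token */
} Filter;

/* Read the next raw token; never advances past T_EOF. */
static Token raw_next(Filter *f)
{
    Token t = f->raw[f->pos];
    if (t != T_EOF)
        f->pos++;
    return t;
}

/* Return the next token for the grammar, merging RESPECT/IGNORE + NULLS. */
Token filter_next(Filter *f)
{
    Token cur, next;

    if (f->have_lookahead) {
        cur = f->lookahead;
        f->have_lookahead = 0;
    } else {
        cur = raw_next(f);
    }

    if (cur != T_RESPECT && cur != T_IGNORE)
        return cur;

    next = raw_next(f);
    if (next == T_NULLS)
        return cur == T_RESPECT ? T_RESPECT_NULLS : T_IGNORE_NULLS;

    /* Not a pair: save the lookahead token for the next call. */
    f->lookahead = next;
    f->have_lookahead = 1;
    return cur;
}
```

With this filter, a stream such as IDENT RESPECT NULLS OVER yields IDENT RESPECT_NULLS OVER, while IDENT RESPECT (as in select abs(1) respect;) still yields a plain RESPECT token. Note that the merge happens regardless of context, so RESPECT NULLS LAST would also be merged.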
I've redone the leadlag function changes to use datumCopy as you suggested.
However, I've had to remove the NOT_USED #ifdef around datumFree so I can
use it to avoid building up large numbers of copies of Datums in the memory
context while a query is executing. I've attached the revised patch...
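To illustrate the copy-then-free pattern outside the backend, here is a minimal standalone sketch. It assumes NUL-terminated strings stand in for by-reference Datums and malloc/free stand in for datumCopy/datumFree; LeadLagState, ignore_nulls_step and leadlag_reset are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-partition state: one heap copy of the last non-null result. */
typedef struct {
    char *last;   /* NULL until a non-null result has been seen */
} LeadLagState;

/* Return a heap copy of s (stand-in for datumCopy on a by-reference type). */
static char *copy_value(const char *s)
{
    size_t n = strlen(s) + 1;
    char *c = malloc(n);

    if (c == NULL)
        abort();              /* keep the sketch simple: give up on OOM */
    memcpy(c, s, n);
    return c;
}

/*
 * Feed the plain lead/lag result for one row (NULL pointer = SQL NULL) and
 * get back what the IGNORE NULLS variant would report for that row.
 * The returned pointer is only valid until the next call.
 */
const char *ignore_nulls_step(LeadLagState *st, const char *result)
{
    if (result == NULL)
        return st->last;      /* still NULL if nothing has been seen yet */

    free(st->last);           /* drop the previous copy; free(NULL) is a no-op */
    st->last = copy_value(result);
    return st->last;
}

/* Release the state, e.g. at a partition boundary. */
void leadlag_reset(LeadLagState *st)
{
    free(st->last);
    st->last = NULL;
}
```

At most one copy is live at any time, which is the point of freeing as you go: memory use stays flat however many non-null rows the partition contains.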
Thanks -
Nick
On 24 March 2013 03:43, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/src/.DS_Store b/src/.DS_Store
new file mode 100644
index 0000000..535a8dc
Binary files /dev/null and b/src/.DS_Store differ
diff --git a/src/backend/.DS_Store b/src/backend/.DS_Store
new file mode 100644
index 0000000..327792b
Binary files /dev/null and b/src/backend/.DS_Store differ
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 3bc42ba..eb42901 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1986,6 +1986,40 @@ WinGetPartitionLocalMemory(WindowObject winobj, Size sz)
}
/*
+ * WinGetResultDatumCopy
+ * Gets a copy of the given datum.
+ *
+ * This uses the window's per-function ByVal and TypeLen information
+ * when copying the datum.
+ */
+Datum
+WinGetResultDatumCopy(WindowObject winobj, Datum datum)
+{
+ WindowStatePerFunc perfunc;
+
+ Assert(WindowObjectIsValid(winobj));
+ perfunc = winobj->winstate->perfunc;
+ return datumCopy(datum, perfunc->resulttypeByVal, perfunc->resulttypeLen);
+}
+
+/*
+ * WinFreeResultDatumCopy
+ * Frees a Datum previously created by WinGetResultDatumCopy.
+ *
+ * This uses the window's per-function ByVal and TypeLen information
+ * when freeing the datum.
+ */
+void
+WinFreeResultDatumCopy(WindowObject winobj, Datum datum)
+{
+ WindowStatePerFunc perfunc;
+
+ Assert(WindowObjectIsValid(winobj));
+ perfunc = winobj->winstate->perfunc;
+ datumFree(datum, perfunc->resulttypeByVal, perfunc->resulttypeLen);
+}
+
+/*
* WinGetCurrentPosition
* Return the current row's position (counting from 0) within the current
* partition.
@@ -1996,6 +2030,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9d07f30..6dda644 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -496,6 +496,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
opt_frame_clause frame_extent frame_bound
+ over_specification
%type <str> opt_existing_window_name
%type <boolean> opt_if_not_exists
@@ -551,7 +552,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -581,7 +582,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -615,6 +616,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
* creates these tokens when required.
*/
%token NULLS_FIRST NULLS_LAST WITH_TIME
+%token RESPECT_NULLS IGNORE_NULLS
/* Precedence: lowest to highest */
@@ -11785,7 +11787,8 @@ window_definition:
}
;
-over_clause: OVER window_specification
+over_specification:
+ OVER window_specification
{ $$ = $2; }
| OVER ColId
{
@@ -11800,6 +11803,18 @@ over_clause: OVER window_specification
n->location = @2;
$$ = n;
}
+ ;
+
+over_clause: over_specification
+ { $$ = $1; }
+ | RESPECT_NULLS over_specification
+ { $$ = $2; }
+ | IGNORE_NULLS over_specification
+ {
+ if($2)
+ $2->frameOptions |= FRAMEOPTION_IGNORE_NULLS;
+ $$ = $2;
+ }
| /*EMPTY*/
{ $$ = NULL; }
;
@@ -12765,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12852,6 +12868,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index b8ec790..25d09e0 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -156,6 +156,33 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
}
break;
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
+ break;
+ }
+ break;
+
default:
break;
}
diff --git a/src/backend/utils/.DS_Store b/src/backend/utils/.DS_Store
new file mode 100644
index 0000000..cbbea58
Binary files /dev/null and b/src/backend/utils/.DS_Store differ
diff --git a/src/backend/utils/adt/datum.c b/src/backend/utils/adt/datum.c
index 612b7ef..e72f5a9 100644
--- a/src/backend/utils/adt/datum.c
+++ b/src/backend/utils/adt/datum.c
@@ -129,7 +129,7 @@ datumCopy(Datum value, bool typByVal, int typLen)
realSize = datumGetSize(value, typByVal, typLen);
s = (char *) palloc(realSize);
- memcpy(s, DatumGetPointer(value), realSize);
+ memcpy(s, (value), realSize);
res = PointerGetDatum(s);
}
return res;
@@ -144,7 +144,6 @@ datumCopy(Datum value, bool typByVal, int typLen)
* ONLY datums created by "datumCopy" can be freed!
*-------------------------------------------------------------------------
*/
-#ifdef NOT_USED
void
datumFree(Datum value, bool typByVal, int typLen)
{
@@ -155,7 +154,6 @@ datumFree(Datum value, bool typByVal, int typLen)
pfree(s);
}
}
-#endif
/*-------------------------------------------------------------------------
* datumIsEqual
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index 2f171ac..fea772f 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -25,6 +25,15 @@ typedef struct rank_context
} rank_context;
/*
+ * structure for IGNORE NULLS / RESPECT NULLS semantics
+ */
+typedef struct leadlag_context
+{
+ Datum last; /* last non-null result */
+ bool last_isnull;
+} leadlag_context;
+
+/*
* ntile process information
*/
typedef struct
@@ -292,6 +301,8 @@ leadlag_common(FunctionCallInfo fcinfo,
Datum result;
bool isnull;
bool isout;
+ bool ignore_nulls;
+ leadlag_context* context;
if (withoffset)
{
@@ -322,8 +333,47 @@ leadlag_common(FunctionCallInfo fcinfo,
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
+ if(ignore_nulls)
+ {
+ /*
+ * We'll keep the last non-null value we've seen in our per-partition chunk
+ * of memory, so it gets cleaned up for us.
+ */
+ context = (leadlag_context *)
+ WinGetPartitionLocalMemory(winobj, sizeof(leadlag_context));
if(isnull)
+ {
+ if(context->last != NULL)
+ {
+ /* restore the stashed copy */
+ result = context->last;
+ isnull = context->last_isnull;
+ }
+ }
+ else
+ {
+ if(context->last != NULL)
+ {
+ /*
+ * This step is not strictly necessary as the Datum copies are
+ * allocated in a context that'll be discarded after this query.
+ * However, we'd like to avoid a large memory spike during the
+ * query (as we'd get if we kept a copy of all the non-null
+ * results for the duration of the query) so we'll free the
+ * Datum copies as we go along:
+ */
+ WinFreeResultDatumCopy(winobj, context->last);
+ }
+ context->last = WinGetResultDatumCopy(winobj, result);
+ context->last_isnull = isnull;
+ }
+ }
+
+ if (isnull)
+ {
PG_RETURN_NULL();
+ }
PG_RETURN_DATUM(result);
}
diff --git a/src/include/.DS_Store b/src/include/.DS_Store
new file mode 100644
index 0000000..d5711fe
Binary files /dev/null and b/src/include/.DS_Store differ
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 2229ef0..a13c58b 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000 /* lead/lag/nth */
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..2acf073 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..73524ca 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
@@ -61,4 +63,7 @@ extern Datum WinGetFuncArgInFrame(WindowObject winobj, int argno,
extern Datum WinGetFuncArgCurrent(WindowObject winobj, int argno,
bool *isnull);
+extern Datum WinGetResultDatumCopy(WindowObject winobj, Datum datum);
+extern void WinFreeResultDatumCopy(WindowObject winobj, Datum datum);
+
#endif /* WINDOWAPI_H */
diff --git a/src/interfaces/.DS_Store b/src/interfaces/.DS_Store
new file mode 100644
index 0000000..c8d03e0
Binary files /dev/null and b/src/interfaces/.DS_Store differ
diff --git a/src/interfaces/ecpg/.DS_Store b/src/interfaces/ecpg/.DS_Store
new file mode 100644
index 0000000..b3e3f35
Binary files /dev/null and b/src/interfaces/ecpg/.DS_Store differ
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index f4b51d6..fe5dcb3 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -45,6 +45,8 @@ my %replace_string = (
'WITH_TIME' => 'with time',
'NULLS_FIRST' => 'nulls first',
'NULLS_LAST' => 'nulls last',
+ 'RESPECT_NULLS' => 'respect nulls',
+ 'IGNORE_NULLS' => 'ignore nulls',
'TYPECAST' => '::',
'DOT_DOT' => '..',
'COLON_EQUALS' => ':=',);
diff --git a/src/interfaces/ecpg/preproc/parser.c b/src/interfaces/ecpg/preproc/parser.c
index 2ce9dd9..53f4167 100644
--- a/src/interfaces/ecpg/preproc/parser.c
+++ b/src/interfaces/ecpg/preproc/parser.c
@@ -121,6 +121,53 @@ filtered_base_yylex(void)
}
break;
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ default:
+ /* save the lookahead token for next time */
+ lookahead_token = next_token;
+ lookahead_yylval = base_yylval;
+ lookahead_yylloc = base_yylloc;
+ have_lookahead = true;
+ /* and back up the output info to cur_token */
+ base_yylval = cur_yylval;
+ base_yylloc = cur_yylloc;
+ break;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
+ break;
+ default:
+ /* save the lookahead token for next time */
+ lookahead_token = next_token;
+ lookahead_yylval = base_yylval;
+ lookahead_yylloc = base_yylloc;
+ have_lookahead = true;
+ /* and back up the output info to cur_token */
+ base_yylval = cur_yylval;
+ base_yylloc = cur_yylloc;
+ break;
+ }
+ break;
+
default:
break;
}
diff --git a/src/test/.DS_Store b/src/test/.DS_Store
new file mode 100644
index 0000000..7c9864e
Binary files /dev/null and b/src/test/.DS_Store differ
diff --git a/src/test/regress/.DS_Store b/src/test/regress/.DS_Store
new file mode 100644
index 0000000..a623e62
Binary files /dev/null and b/src/test/regress/.DS_Store differ
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..41d76a6 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ favourite_animal text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1022,114 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+(10 rows)
+
+SELECT lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+(10 rows)
+
+-- a numeric (date) column
+SELECT lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ lag
+------------
+
+
+
+ 03-05-2009
+ 09-22-2010
+ 09-22-2010
+ 09-22-2010
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT lag(favourite_animal) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ lag
+---------
+
+ frog
+ frog
+ frog
+ chicken
+ chicken
+ chicken
+ tiger
+ gorilla
+ gorilla
+(10 rows)
+
+-- (2) leads
+SELECT lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+
+
+(10 rows)
+
+SELECT lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+
+
+ 11-17-2009
+
+
+
+
+(10 rows)
+
+SELECT lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ lead
+------------
+
+ 03-05-2009
+ 09-22-2010
+ 09-22-2010
+ 09-22-2010
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+ 11-17-2009
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..67e9a9a 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ favourite_animal text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,25 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a numeric (date) column
+SELECT lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a text column
+SELECT lag(favourite_animal) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- (2) leads
+
+SELECT lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
-- cleanup
DROP TABLE empsalary;
On Sun, 2013-03-24 at 20:15 -0400, Nicholas White wrote:
I've redone the leadlag function changes to use datumCopy as you
suggested. However, I've had to remove the NOT_USED #ifdef around
datumFree so I can use it to avoid building up large numbers of copies
of Datums in the memory context while a query is executing. I've
attached the revised patch...
Comments:
WinGetResultDatumCopy() calls datumCopy, which will just copy in the
current memory context. I think you want it in the per-partition memory
context, otherwise the last value in each partition will stick around
until the query is done (so many partitions could be a problem). That
should be easy enough to do by switching to the
winobj->winstate->partcontext memory context before calling datumCopy,
and then switching back.
For that matter, why store the datum again at all? You can just store
the offset of the last non-NULL value in that partition, and then fetch
it using WinGetFuncArgInPartition(), right? We really want to avoid any
per-tuple allocations.
Alternatively, you might look into setting a mark when you get a
non-NULL value. Then you could just always fetch the oldest one.
Unfortunately, I think that only works with const_offset=true... so the
previous idea might be better.
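The "store the offset" idea can be sketched without any backend code. The following is a toy model, assuming a plain bool array stands in for the partition's null flags and that rows are visited in order; LeadLagIndexState and ignore_nulls_pos are hypothetical names. In the backend, the returned position would presumably be passed to WinGetFuncArgInPartition() to fetch the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-partition state: only a row position, no Datum copy. */
typedef struct {
    int64_t last_nonnull;   /* position of the last non-null result, -1 if none */
} LeadLagIndexState;

/*
 * For the row at position cur, return the partition position whose value
 * the IGNORE NULLS variant should report, or -1 for SQL NULL.
 * offset > 0 behaves like lag, offset < 0 like lead.
 * Rows must be visited in order (cur = 0, 1, 2, ...).
 */
int64_t ignore_nulls_pos(LeadLagIndexState *st, const bool *is_null,
                         int64_t nrows, int64_t cur, int64_t offset)
{
    int64_t target = cur - offset;

    /* Remember where the non-null value lives; nothing is copied. */
    if (target >= 0 && target < nrows && !is_null[target])
        st->last_nonnull = target;

    return st->last_nonnull;
}
```

Because the state is a single integer, there is no per-tuple allocation and nothing to free at the end of the partition. The trade-off is that the value has to be fetched again by position on every row.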
I'll leave it to someone else to review the grammar changes.
Regards,
Jeff Davis
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Nicholas White wrote:
For the parsing changes, it seems I can either make RESPECT and IGNORE
reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
IGNORE were just normal unreserved keywords due to ambiguities after a
function call (e.g. select abs(1) respect; - which is currently a
valid statement).
Well, making them reserved keywords is not that great, so maybe the
lookahead thingy is better. However, this patch introduces the third
and fourth uses of the "save the lookahead token" pattern in the
"default" switch cases. Can we refactor that bit somehow, to avoid so
many duplicates? (For a minute I thought that Andrew Gierth's WITH
ORDINALITY patch would add another one, but it seems not.)
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Jun 15, 2013 at 9:37 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Nicholas White wrote:
For the parsing changes, it seems I can either make RESPECT and IGNORE
reserved keywords, or add a lookahead to construct synthetic RESPECT NULLS
and IGNORE NULLS keywords. The grammar wouldn't compile if RESPECT and
IGNORE were just normal unreserved keywords due to ambiguities after a
function call (e.g. select abs(1) respect; - which is currently a
valid statement).
Well, making them reserved keywords is not that great, so maybe the
lookahead thingy is better. However, this patch introduces the third
and fourth uses of the "save the lookahead token" pattern in the
"default" switch cases. Can we refactor that bit somehow, to avoid so
many duplicates? (For a minute I thought that Andrew Gierth's WITH
ORDINALITY patch would add another one, but it seems not.)
Making things reserved keywords is painful and I don't like it, but
I've started to become skeptical of shifting the problem to the lexer,
too. Sometimes special case logic in the lexer about token combining
can have surprising consequences in other parts of the grammar. For
example, with a lexer hack, what will happen if someone has a column
named RESPECT and does SELECT ... ORDER BY respect NULLS LAST? I
haven't studied the code in detail so maybe it's fine, but it's
something to think about.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Jun 15, 2013 at 1:30 PM, Jeff Davis <pgsql@j-davis.com> wrote:
On Sun, 2013-03-24 at 20:15 -0400, Nicholas White wrote:
I've redone the leadlag function changes to use datumCopy as you
suggested. However, I've had to remove the NOT_USED #ifdef around
datumFree so I can use it to avoid building up large numbers of copies
of Datums in the memory context while a query is executing. I've
attached the revised patch...
Comments:
WinGetResultDatumCopy() calls datumCopy, which will just copy in the
current memory context. I think you want it in the per-partition memory
context, otherwise the last value in each partition will stick around
until the query is done (so many partitions could be a problem). That
should be easy enough to do by switching to the
winobj->winstate->partcontext memory context before calling datumCopy,
and then switching back.
For that matter, why store the datum again at all? You can just store
the offset of the last non-NULL value in that partition, and then fetch
it using WinGetFuncArgInPartition(), right? We really want to avoid any
per-tuple allocations.
I believe WinGetFuncArgInPartition is a bit slow if the offset is far from
the current row. So it might make sense to store the last-seen value, but
I'm not sure whether we need to copy the datum every time. I haven't looked
into the new patch.
Thanks,
--
Hitoshi Harada
Thanks for the reviews. I've attached a revised patch that has the lexer
refactoring Alvaro mentions (arbitrarily using a goto rather than a bool
flag) and uses Jeff's idea of just storing the index of the last non-null
value (if there is one) in the window function's context (and not the Datum
itself).
However, Robert's right that SELECT ... ORDER BY respect NULLS LAST will
now fail. An earlier iteration of the patch had RESPECT and IGNORE as
reserved, but that would have broken tables with columns called "respect"
(etc.), which the current version allows. Is this backwards incompatibility
acceptable? If not, maybe I should try doing a two-token lookahead in the
token-aggregating code between the lexer and the parser (i.e. make a
RESPECT_NULLS token out of a sequence of RESPECT NULLS OVER tokens,
remembering to replace the OVER token)? Or what about adding an %expect
statement to the Bison grammar, confirming that the shift / reduce
conflicts caused by using the RESPECT, IGNORE & NULLS_P tokens in the
over_clause rule are OK?
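The two-token lookahead option can be sketched as follows. This is a toy model only: the Token enum, Filter2 and filter2_next are hypothetical, the raw tokens sit in an array so peeking is trivial, and NULLS LAST is left as two separate tokens rather than being merged as the real filter does. A real implementation would have to buffer two lookahead tokens.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch (not PostgreSQL's real tokens). */
typedef enum {
    T_EOF, T_IDENT, T_RESPECT, T_IGNORE, T_NULLS, T_OVER, T_LAST,
    T_RESPECT_NULLS, T_IGNORE_NULLS
} Token;

typedef struct {
    const Token *raw;   /* raw tokens, terminated by T_EOF */
    size_t pos;         /* index of the next raw token */
} Filter2;

/* Look `ahead` tokens past the current position without reading beyond T_EOF. */
static Token peek(const Filter2 *f, size_t ahead)
{
    size_t i;

    for (i = 0; i < ahead; i++)
        if (f->raw[f->pos + i] == T_EOF)
            return T_EOF;
    return f->raw[f->pos + ahead];
}

/* Merge RESPECT/IGNORE + NULLS only when the token after NULLS is OVER. */
Token filter2_next(Filter2 *f)
{
    Token cur = f->raw[f->pos];

    if (cur == T_EOF)
        return T_EOF;
    f->pos++;

    if ((cur == T_RESPECT || cur == T_IGNORE) &&
        peek(f, 0) == T_NULLS && peek(f, 1) == T_OVER)
    {
        f->pos++;       /* consume NULLS; OVER stays in the stream */
        return cur == T_RESPECT ? T_RESPECT_NULLS : T_IGNORE_NULLS;
    }
    return cur;
}
```

Under these assumptions, RESPECT NULLS OVER becomes RESPECT_NULLS OVER, while ORDER BY respect NULLS LAST keeps its RESPECT, NULLS and LAST tokens, so a column named "respect" would continue to parse as before.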
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5094226..0309a99 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -488,6 +488,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
opt_frame_clause frame_extent frame_bound
+ over_specification
%type <str> opt_existing_window_name
%type <boolean> opt_if_not_exists
@@ -543,7 +544,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -573,7 +574,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -607,6 +608,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
* creates these tokens when required.
*/
%token NULLS_FIRST NULLS_LAST WITH_TIME
+%token RESPECT_NULLS IGNORE_NULLS
/* Precedence: lowest to highest */
@@ -11752,7 +11754,8 @@ window_definition:
}
;
-over_clause: OVER window_specification
+over_specification:
+ OVER window_specification
{ $$ = $2; }
| OVER ColId
{
@@ -11767,6 +11770,18 @@ over_clause: OVER window_specification
n->location = @2;
$$ = n;
}
+ ;
+
+over_clause: over_specification
+ { $$ = $1; }
+ | RESPECT_NULLS over_specification
+ { $$ = $2; }
+ | IGNORE_NULLS over_specification
+ {
+ if($2)
+ $2->frameOptions |= FRAMEOPTION_IGNORE_NULLS;
+ $$ = $2;
+ }
| /*EMPTY*/
{ $$ = NULL; }
;
@@ -12740,6 +12755,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12827,6 +12843,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index b8ec790..1b2b2de 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -118,15 +118,7 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
cur_token = NULLS_LAST;
break;
default:
- /* save the lookahead token for next time */
- yyextra->lookahead_token = next_token;
- yyextra->lookahead_yylval = lvalp->core_yystype;
- yyextra->lookahead_yylloc = *llocp;
- yyextra->have_lookahead = true;
- /* and back up the output info to cur_token */
- lvalp->core_yystype = cur_yylval;
- *llocp = cur_yylloc;
- break;
+ goto restore_next_token;
}
break;
@@ -144,15 +136,38 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
cur_token = WITH_TIME;
break;
default:
- /* save the lookahead token for next time */
- yyextra->lookahead_token = next_token;
- yyextra->lookahead_yylval = lvalp->core_yystype;
- yyextra->lookahead_yylloc = *llocp;
- yyextra->have_lookahead = true;
- /* and back up the output info to cur_token */
- lvalp->core_yystype = cur_yylval;
- *llocp = cur_yylloc;
+ goto restore_next_token;
+ }
+ break;
+
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ default:
+ goto restore_next_token;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
break;
+ default:
+ goto restore_next_token;
}
break;
@@ -161,4 +176,16 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
}
return cur_token;
+
+restore_next_token:
+ /* save the lookahead token for next time */
+ yyextra->lookahead_token = next_token;
+ yyextra->lookahead_yylval = lvalp->core_yystype;
+ yyextra->lookahead_yylloc = *llocp;
+ yyextra->have_lookahead = true;
+ /* and back up the output info to cur_token */
+ lvalp->core_yystype = cur_yylval;
+ *llocp = cur_yylloc;
+
+ return cur_token;
}
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..84110ab 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -25,6 +25,15 @@ typedef struct rank_context
} rank_context;
/*
+ * structure for IGNORE NULLS / RESPECT NULLS semantics
+ */
+typedef struct leadlag_context
+{
+ int64 last; /* last non-null result, initially 0 */
+ bool seen_one; /* true iff we can output the row in "last" now */
+} leadlag_context;
+
+/*
* ntile process information
*/
typedef struct
@@ -292,6 +301,15 @@ leadlag_common(FunctionCallInfo fcinfo,
Datum result;
bool isnull;
bool isout;
+ bool ignore_nulls;
+ leadlag_context* context;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but we can't move the mark
+ * beyond the last non-null tuple!
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,11 +323,15 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
+ offset,
WINDOW_SEEK_CURRENT,
- const_offset,
+ const_offset && !ignore_nulls,
&isnull, &isout);
if (isout)
@@ -322,6 +344,34 @@ leadlag_common(FunctionCallInfo fcinfo,
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
+ if (ignore_nulls)
+ {
+ /*
+ * We'll keep the last non-null value we've seen in our per-partition chunk
+ * of memory, so it gets cleaned up for us.
+ */
+ context = (leadlag_context *)
+ WinGetPartitionLocalMemory(winobj, sizeof(leadlag_context));
+ if (isnull)
+ {
+ if (context->seen_one)
+ {
+ /* restore the datum at the stashed index */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->last,
+ WINDOW_SEEK_HEAD,
+ const_offset, /* drag mark up after us */
+ &isnull, &isout);
+ }
+ }
+ else
+ {
+ /* work out which tuple we just loaded */
+ context->last = WinGetCurrentPosition(winobj) + offset;
+ context->seen_one = true;
+ }
+ }
+
if (isnull)
PG_RETURN_NULL();
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..59fc635 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000 /* lead/lag/nth */
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..2acf073 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index f4b51d6..fe5dcb3 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -45,6 +45,8 @@ my %replace_string = (
'WITH_TIME' => 'with time',
'NULLS_FIRST' => 'nulls first',
'NULLS_LAST' => 'nulls last',
+ 'RESPECT_NULLS' => 'respect nulls',
+ 'IGNORE_NULLS' => 'ignore nulls',
'TYPECAST' => '::',
'DOT_DOT' => '..',
'COLON_EQUALS' => ':=',);
diff --git a/src/interfaces/ecpg/preproc/parser.c b/src/interfaces/ecpg/preproc/parser.c
index 2ce9dd9..3780652 100644
--- a/src/interfaces/ecpg/preproc/parser.c
+++ b/src/interfaces/ecpg/preproc/parser.c
@@ -83,15 +83,7 @@ filtered_base_yylex(void)
cur_token = NULLS_LAST;
break;
default:
- /* save the lookahead token for next time */
- lookahead_token = next_token;
- lookahead_yylval = base_yylval;
- lookahead_yylloc = base_yylloc;
- have_lookahead = true;
- /* and back up the output info to cur_token */
- base_yylval = cur_yylval;
- base_yylloc = cur_yylloc;
- break;
+ goto restore_next_token;
}
break;
@@ -109,15 +101,38 @@ filtered_base_yylex(void)
cur_token = WITH_TIME;
break;
default:
- /* save the lookahead token for next time */
- lookahead_token = next_token;
- lookahead_yylval = base_yylval;
- lookahead_yylloc = base_yylloc;
- have_lookahead = true;
- /* and back up the output info to cur_token */
- base_yylval = cur_yylval;
- base_yylloc = cur_yylloc;
+ goto restore_next_token;
+ }
+ break;
+
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ default:
+ goto restore_next_token;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
break;
+ default:
+ goto restore_next_token;
}
break;
@@ -126,4 +141,16 @@ filtered_base_yylex(void)
}
return cur_token;
+
+restore_next_token:
+ /* save the lookahead token for next time */
+ lookahead_token = next_token;
+ lookahead_yylval = base_yylval;
+ lookahead_yylloc = base_yylloc;
+ have_lookahead = true;
+ /* and back up the output info to cur_token */
+ base_yylval = cur_yylval;
+ base_yylloc = cur_yylloc;
+
+ return cur_token;
}
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..47e3ba2 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1022,114 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 09-22-2010
+ | 09-22-2010
+ | 11-17-2009
+ 11-17-2009 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..eb0817b 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,25 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
-- cleanup
DROP TABLE empsalary;
On Tue, Jun 18, 2013 at 6:27 PM, Nicholas White <n.j.white@gmail.com> wrote:
Thanks for the reviews. I've attached a revised patch that has the lexer
refactoring Alvaro mentions (arbitrarily using a goto rather than a bool
flag) and uses Jeff's idea of just storing the index of the last non-null
value (if there is one) in the window function's context (and not the Datum
itself).
However, Robert's right that SELECT ... ORDER BY respect NULLS LAST will now
fail. An earlier iteration of the patch had RESPECT and IGNORE as reserved,
but that would have broken tables with columns called "respect" (etc.),
which the current version allows. Is this backwards incompatibility
acceptable?
I think it's better to add new partially reserved keywords than to
have this kind of random breakage. When you make something a
partially-reserved keyword, then things break, but in a fairly
well-defined way. Lexer hacks can break things in ways that are much
subtler, which we may not even realize for a long time, and which in
that case would mean that the word "respect" needs to be quoted in
some contexts but not others. That's going to cause a lot of
headaches, because pg_dump etc. know about quoting reserved keywords,
but they don't know anything about quoting unreserved keywords in
contexts where they might happen to be followed by the wrong next
word.
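To make that failure mode concrete, here is a toy model of the
one-token-lookahead filter the patch adds to base_yylex. The token names
and the filter_tokens function are invented for this sketch and are not
PostgreSQL's real token codes or API; it only illustrates why
ORDER BY respect NULLS LAST stops parsing once RESPECT followed by NULLS
is merged unconditionally.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes for this sketch; not PostgreSQL's real values. */
typedef enum
{
    TOK_END = 0, TOK_IDENT, TOK_ORDER, TOK_BY, TOK_OVER,
    TOK_RESPECT, TOK_IGNORE, TOK_NULLS, TOK_FIRST, TOK_LAST,
    /* merged two-word tokens produced by the filter */
    TOK_NULLS_FIRST, TOK_NULLS_LAST, TOK_RESPECT_NULLS, TOK_IGNORE_NULLS
} Token;

/*
 * Merge two-word sequences using a single token of lookahead.
 * Reads in[0..n), writes the filtered stream to out (which must have
 * room for n tokens) and returns the number of tokens written.
 */
size_t
filter_tokens(const Token *in, size_t n, Token *out)
{
    size_t i = 0;
    size_t m = 0;

    assert(in != NULL && out != NULL);

    while (i < n)
    {
        Token cur = in[i];
        Token next = (i + 1 < n) ? in[i + 1] : TOK_END;
        Token merged = cur;

        if (cur == TOK_NULLS && next == TOK_FIRST)
            merged = TOK_NULLS_FIRST;
        else if (cur == TOK_NULLS && next == TOK_LAST)
            merged = TOK_NULLS_LAST;
        else if (cur == TOK_RESPECT && next == TOK_NULLS)
            merged = TOK_RESPECT_NULLS;   /* fires even for a column named "respect" */
        else if (cur == TOK_IGNORE && next == TOK_NULLS)
            merged = TOK_IGNORE_NULLS;

        out[m++] = merged;
        i += (merged != cur) ? 2 : 1;     /* a merge consumes the lookahead token */
    }
    return m;
}
```

Feeding it ORDER BY respect NULLS LAST yields ORDER, BY, RESPECT_NULLS,
LAST: the NULLS token has already been consumed, so the NULLS LAST pair
is never formed and the grammar sees a stray LAST.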
If not, maybe I should try doing a two-token lookahead in the
token-aggregating code between the lexer and the parser (i.e. make a
RESPECT_NULLS token out of a sequence of RESPECT NULLS OVER tokens,
remembering to replace the OVER token)? Or what about adding an %expect
statement to the Bison grammar, confirming that the shift / reduce conflicts
caused by using the RESPECT, IGNORE & NULLS_P tokens in the over_clause rule
are OK?
These lines of inquiry don't seem promising to me. It's going to be
complicated and unmaintainable and may just move the failure scenarios
to cases that are too obscure for us to reason about.
I think the question is whether this feature is really worth adding
new reserved keywords for. I have a hard time saying we shouldn't
support something that's part of the SQL standard, but personally,
it's not something I've seen come up prior to this thread.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote:
I think the question is whether this feature is really worth adding
new reserved keywords for. I have a hard time saying we shouldn't
support something that's part of the SQL standard, but personally,
it's not something I've seen come up prior to this thread.
What's the next step here?
The feature sounds useful to me. If the grammar is unacceptable, does
someone have an alternative idea, like using new function names instead
of grammar? If so, what are reasonable names to use?
Also, I think someone mentioned this already, but what about
first_value() and last_value()? Shouldn't we do those at the same time?
Regards,
Jeff Davis
On Fri, Jun 21, 2013 at 12:18 AM, Jeff Davis <pgsql@j-davis.com> wrote:
On Thu, 2013-06-20 at 10:03 -0400, Robert Haas wrote:
I think the question is whether this feature is really worth adding
new reserved keywords for. I have a hard time saying we shouldn't
support something that's part of the SQL standard, but personally,
it's not something I've seen come up prior to this thread.
What's the next step here?
Well, ideally, some other people weigh in on the value of the feature
vs. the pain of reserving the keywords.
The feature sounds useful to me.
...and there's one person with an opinion now! :-)
The other question here is - do we actually have the grammar right?
As in, is this actually the syntax we're supposed to be implementing?
It looks different from what's shown here, where the IGNORE NULLS is
inside the function's parentheses, rather than afterwards:
http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html
IBM seems to think it's legal either inside or outside the parentheses:
Regardless of what syntax we settle on, we should also make sure that
the conflict is intrinsic to the grammar and can't be factored out, as
Tom suggested upthread. It's not obvious to me what the actual
ambiguity is here. If you've seen "select lag(num,0)" and the
lookahead token is "respect", what's the problem? It sort of looks
like it could be a column label, but not even unreserved keywords can
be column labels, so that's not it. Probably deserves a bit more
investigation...
If the grammar is unacceptable, does
someone have an alternative idea, like using new function names instead
of grammar? If so, what are reasonable names to use?
We could just add an additional, optional Boolean argument to the
existing functions. It's non-standard, but we avoid adding keywords.
Also, I think someone mentioned this already, but what about
first_value() and last_value()? Shouldn't we do those at the same time?
Not sure.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, 2013-06-21 at 09:21 -0400, Robert Haas wrote:
The other question here is - do we actually have the grammar right?
As in, is this actually the syntax we're supposed to be implementing?
It looks different from what's shown here, where the IGNORE NULLS is
inside the function's parentheses, rather than afterwards:
http://rwijk.blogspot.com/2010/06/simulating-laglead-ignore-nulls.html
IBM seems to think it's legal either inside or outside the parentheses:
The spec seems pretty clear that it falls outside of the parentheses,
unless I am missing something.
Regardless of what syntax we settle on, we should also make sure that
the conflict is intrinsic to the grammar and can't be factored out, as
Tom suggested upthread. It's not obvious to me what the actual
ambiguity is here. If you've seen "select lag(num,0)" and the
lookahead token is "respect", what's the problem? It sort of looks
like it could be a column label, but not even unreserved keywords can
be column labels, so that's not it. Probably deserves a bit more
investigation...
I think the problem is when the function is used as a table function;
e.g.:
SELECT * FROM generate_series(1,10) respect;
We could just add an additional, optional Boolean argument to the
existing functions. It's non-standard, but we avoid adding keywords.
I thought about that, but it is awkward because the argument would have
to be constant (or, if not, it would be quite strange).
Regards,
Jeff Davis
On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis <pgsql@j-davis.com> wrote:
Regardless of what syntax we settle on, we should also make sure that
the conflict is intrinsic to the grammar and can't be factored out, as
Tom suggested upthread. It's not obvious to me what the actual
ambiguity is here. If you've seen "select lag(num,0)" and the
lookahead token is "respect", what's the problem? It sort of looks
like it could be a column label, but not even unreserved keywords can
be column labels, so that's not it. Probably deserves a bit more
investigation...
I think the problem is when the function is used as a table function;
e.g.:
SELECT * FROM generate_series(1,10) respect;
Ah ha. Well, there's probably not much help for that. Disallowing
keywords as table aliases would be a cure worse than the disease, I
think. I suppose the good news is that there probably aren't many
people using RESPECT as a column name, but I have a feeling that we're
almost certain to get complaints about reserving IGNORE. I think that
will have to be quoted as a PL/pgSQL variable name as well. :-(
We could just add an additional, optional Boolean argument to the
existing functions. It's non-standard, but we avoid adding keywords.
I thought about that, but it is awkward because the argument would have
to be constant (or, if not, it would be quite strange).
True... but e.g. string_agg() has the same issue.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hello all
I've been examining PostgreSQL to gain a greater understanding
of RDBMS. (Thanks for a nice, very educational system!)
In the process I've been looking into a few problems and the
complications of this patch appeared relatively uninvolved, so I
tried to look for a solution.
I found the following:
The grammar conflict appears to be because of ambiguities in:
1. table_ref (used exclusively in FROM clauses)
2. index_elem (used exclusively in INDEX creation statements).
Now, this doesn't seem to make much sense, as AFAICT window functions
are explicitly disallowed in these contexts (transformWindowFuncCall
will yield errors, and I can't really wrap my head around what a
window function call would mean there).
I therefore propose a simple rearrangement of the grammar that
syntactically disallows window functions at the top level of those
contexts (nothing much can or should be done about a_exprs nested
inside them). This allows both RESPECT and IGNORE to become unreserved
keywords without any lexer hacking or abuse of the grammar.
I've attached a patch which adds RESPECT NULLS and IGNORE NULLS to
the grammar in the right place. It only sets the window frame options
and does nothing more, so it needs to be merged with Nicholas White's
original patch.
One problem I see with this approach to the grammar is that the
error messages change when a window function is put in any of the
forbidden places. The new error messages are, I think, worse and less
specific than the old ones. I suppose that can be fixed though, or
maybe the problem isn't so severe.
Example of old error message:
window functions are not allowed in functions in FROM
New error message:
syntax error at or near "OVER"
In addition, I think the original patch as it stands has a few
problems that I haven't seen discussed:
1. The result of the current patch using lead:
create table test_table (
id serial,
val integer);
insert into test_table (val) select * from unnest(ARRAY[1,2,3,4,NULL, NULL,
NULL, 5, 6, 7]);
select val, lead(val, 2) ignore nulls over (order by id) from test_table;
val | lead
-----+------
1 | 3
2 | 4
3 | 4
4 | 4
| 4
| 5
| 6
5 | 7
6 | 7
7 | 7
(10 rows)
I would expect it to output:
select val, lead(val, 2) ignore nulls over (order by id) from test_table;
val | lead
-----+------
1 | 3
2 | 4
3 | 5
4 | 6
| 6
| 6
| 6
5 | 7
6 |
7 |
(10 rows)
That is: skip two rows forward not counting null rows.
The lag behavior works well as far as I can see.
2. It would be nice if an error were raised when IGNORE NULLS is used
on a function for which it has no effect. Perhaps that should be left
to the individual window functions themselves, though.
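The "skip forward, not counting null rows" reading from point 1 can be
sketched in plain C, outside the window API. This is only an
illustration under that interpretation: the function name
leadlag_ignore_nulls and the is_null array are hypothetical, it does not
touch WinGetFuncArgInPartition or mark positions, and offset 0 is
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Find the row that lead/lag ... IGNORE NULLS would read if only
 * non-null rows are counted towards the offset.
 *
 * is_null[i] says whether row i of the partition holds NULL.
 * step is +1 for lead and -1 for lag; offset must be >= 1.
 * Returns the index of the chosen row, or -1 if the partition runs out
 * of non-null rows first (the function result would then be NULL or
 * the default value).
 */
long
leadlag_ignore_nulls(const bool *is_null, long nrows,
                     long current, long offset, int step)
{
    long pos;
    long seen = 0;      /* non-null rows passed so far */

    assert(is_null != NULL);
    assert(offset >= 1);
    assert(step == 1 || step == -1);
    assert(current >= 0 && current < nrows);

    for (pos = current + step; pos >= 0 && pos < nrows; pos += step)
    {
        /* null rows are stepped over without being counted */
        if (!is_null[pos] && ++seen == offset)
            return pos;
    }
    return -1;
}
```

With the ten test_table rows above (nulls at positions 4 to 6, counting
from 0), a call with current = 2, offset = 2 and step = +1 lands on
position 7, the row holding 5, which matches the expected output rather
than the 4 that the current patch returns.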
Apart from those points I think the original patch is nice and provides
functionality that's definitely nice to have.
Kind Regards
Troels Nielsen
On Fri, Jun 21, 2013 at 8:11 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jun 21, 2013 at 11:33 AM, Jeff Davis <pgsql@j-davis.com> wrote:
Regardless of what syntax we settle on, we should also make sure that
the conflict is intrinsic to the grammar and can't be factored out, as
Tom suggested upthread. It's not obvious to me what the actual
ambiguity is here. If you've seen "select lag(num,0)" and the
lookahead token is "respect", what's the problem? It sort of looks
like it could be a column label, but not even unreserved keywords can
be column labels, so that's not it. Probably deserves a bit more
investigation...
I think the problem is when the function is used as a table function;
e.g.:
SELECT * FROM generate_series(1,10) respect;
Ah ha. Well, there's probably not much help for that. Disallowing
keywords as table aliases would be a cure worse than the disease, I
think. I suppose the good news is that there probably aren't many
people using RESPECT as a column name, but I have a feeling that we're
almost certain to get complaints about reserving IGNORE. I think that
will have to be quoted as a PL/pgSQL variable name as well. :-(
We could just add an additional, optional Boolean argument to the
existing functions. It's non-standard, but we avoid adding keywords.
I thought about that, but it is awkward because the argument would have
to be constant (or, if not, it would be quite strange).
True... but e.g. string_agg() has the same issue.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
respect_nulls_and_ignore_nulls_grammar.patch (application/octet-stream)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5094226..aae35d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -401,7 +402,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> columnDef columnOptions
%type <defelt> def_elem reloption_elem old_aggr_elem
%type <node> def_arg columnElem where_clause where_or_current_clause
- a_expr b_expr c_expr func_expr AexprConst indirection_el
+ a_expr b_expr c_expr AexprConst indirection_el
columnref in_expr having_clause func_table array_expr
ExclusionWhereClause
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -481,6 +482,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <ival> document_or_content
%type <boolean> xml_whitespace_option
+%type <node> func_application func_expr_common_subexpr
+%type <node> func_expr func_expr_windowless
%type <node> common_table_expr
%type <with> with_clause opt_with_clause
%type <list> cte_list
@@ -543,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -573,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -6132,7 +6135,7 @@ index_elem: ColId opt_collate opt_class opt_asc_desc opt_nulls_order
$$->ordering = $4;
$$->nulls_ordering = $5;
}
- | func_expr opt_collate opt_class opt_asc_desc opt_nulls_order
+ | func_expr_windowless opt_collate opt_class opt_asc_desc opt_nulls_order
{
$$ = makeNode(IndexElem);
$$->name = NULL;
@@ -9894,11 +9897,9 @@ relation_expr_opt_alias: relation_expr %prec UMINUS
}
;
-
-func_table: func_expr { $$ = $1; }
+func_table: func_expr_windowless { $$ = $1; }
;
-
where_clause:
WHERE a_expr { $$ = $2; }
| /*EMPTY*/ { $$ = NULL; }
@@ -11079,15 +11080,7 @@ c_expr: columnref { $$ = $1; }
}
;
-/*
- * func_expr is split out from c_expr just so that we have a classification
- * for "everything that is a function call or looks like one". This isn't
- * very important, but it saves us having to document which variants are
- * legal in the backwards-compatible functional-index syntax for CREATE INDEX.
- * (Note that many of the special SQL functions wouldn't actually make any
- * sense as functional index entries, but we ignore that consideration here.)
- */
-func_expr: func_name '(' ')' over_clause
+func_application: func_name '(' ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11096,11 +11089,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $4;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list ')' over_clause
+ | func_name '(' func_arg_list ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11109,11 +11102,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $5;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' VARIADIC func_arg_expr ')' over_clause
+ | func_name '(' VARIADIC func_arg_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11122,11 +11115,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = TRUE;
- n->over = $6;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list ',' VARIADIC func_arg_expr ')' over_clause
+ | func_name '(' func_arg_list ',' VARIADIC func_arg_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11135,11 +11128,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = TRUE;
- n->over = $8;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list sort_clause ')' over_clause
+ | func_name '(' func_arg_list sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11148,11 +11141,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $6;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' ALL func_arg_list opt_sort_clause ')' over_clause
+ | func_name '(' ALL func_arg_list opt_sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11165,11 +11158,11 @@ func_expr: func_name '(' ')' over_clause
* for that in FuncCall at the moment.
*/
n->func_variadic = FALSE;
- n->over = $7;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' DISTINCT func_arg_list opt_sort_clause ')' over_clause
+ | func_name '(' DISTINCT func_arg_list opt_sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11178,11 +11171,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = TRUE;
n->func_variadic = FALSE;
- n->over = $7;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' '*' ')' over_clause
+ | func_name '(' '*' ')'
{
/*
* We consider AGGREGATE(*) to invoke a parameterless
@@ -11201,11 +11194,48 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = TRUE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $5;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | COLLATION FOR '(' a_expr ')'
+ ;
+
+
+/*
+ * func_expr and its cousin func_expr_windowless are split out from c_expr just
+ * so that we have classifications for "everything that is a function call or
+ * looks like one". This isn't very important, but it saves us having to document
+ * which variants are legal in the backwards-compatible functional-index syntax
+ * for CREATE INDEX.
+ * (Note that many of the special SQL functions wouldn't actually make any
+ * sense as functional index entries, but we ignore that consideration here.)
+ */
+func_expr: func_application over_clause
+ {
+ FuncCall *n = (FuncCall*)$1;
+ n->over = $2;
+ $$ = (Node*)n;
+ }
+ | func_expr_common_subexpr
+ { $$ = $1; }
+ ;
+
+/*
+ * As func_expr but does not accept WINDOW functions directly (they
+ * can still be contained in arguments for functions etc.)
+ * Use this when window expressions are not allowed, to simplify
+ * the grammar. (e.g. in CREATE INDEX)
+ */
+func_expr_windowless:
+ func_application { $$ = $1; }
+ | func_expr_common_subexpr { $$ = $1; }
+ ;
+
+/*
+ * Special expression
+ */
+func_expr_common_subexpr:
+ COLLATION FOR '(' a_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = SystemFuncName("pg_collation_for");
@@ -11752,16 +11782,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12740,6 +12779,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12827,6 +12867,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..71b44d5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..2acf073 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..cb34a7c 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -989,9 +989,9 @@ ERROR: window functions are not allowed in GROUP BY
LINE 1: SELECT rank() OVER (ORDER BY 1), count(*) FROM empsalary GRO...
^
SELECT * FROM rank() OVER (ORDER BY random());
-ERROR: window functions are not allowed in functions in FROM
+ERROR: syntax error at or near "OVER"
LINE 1: SELECT * FROM rank() OVER (ORDER BY random());
- ^
+ ^
DELETE FROM empsalary WHERE (rank() OVER (ORDER BY random())) > 10;
ERROR: window functions are not allowed in WHERE
LINE 1: DELETE FROM empsalary WHERE (rank() OVER (ORDER BY random())...
Good catch - I've attached a patch to address your point 1. It now returns
the result below (i.e. it correctly doesn't fill in the saved value if the
index falls outside the partition). However, I'm not sure whether (e.g.)
lead-2-ignore-nulls means count forwards two rows, and if that's null use
the last one you've seen (the current implementation), or count forwards two
non-null rows (as you suggest). The behaviour isn't specified in a (free)
draft of the 2003 standard (http://www.wiscorp.com/sql_2003_standard.zip),
and I don't have access to the (non-free) final version. Could someone who
does have access to it clarify this? I've also added your example to the
regression test cases.
select val, lead(val, 2) ignore nulls over (order by id) from test_table;
val | lead
-----+------
1 | 3
2 | 4
3 | 4
4 | 4
| 4
| 5
| 6
5 | 7
6 |
7 |
(10 rows)
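To make the two readings concrete, here is a small standalone C model of
them. It is only a sketch and not PostgreSQL code: NullableInt,
lead_carry_forward and lead_skip_nulls are names made up for illustration,
the whole array is treated as one partition, and offset is assumed to be at
least 1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a nullable integer column value. */
typedef struct { bool isnull; int value; } NullableInt;

/*
 * Reading 1 (what the patch currently does): look at the row `offset`
 * positions ahead. If it is non-null, return it and remember its index.
 * If it is null, return the most recently remembered non-null row.
 * If it is outside the partition, return null (no default given).
 */
static void lead_carry_forward(const NullableInt *in, size_t n,
                               size_t offset, NullableInt *out)
{
    bool   seen_one = false;
    size_t last = 0;

    for (size_t i = 0; i < n; i++) {
        size_t target = i + offset;

        out[i].isnull = true;
        out[i].value = 0;
        if (target >= n)
            continue;               /* out of partition: stay null */
        if (!in[target].isnull) {
            last = target;
            seen_one = true;
            out[i] = in[target];
        } else if (seen_one) {
            out[i] = in[last];
        }
    }
}

/*
 * Reading 2 (the reviewer's suggestion): count forwards `offset`
 * non-null rows, skipping nulls entirely.
 */
static void lead_skip_nulls(const NullableInt *in, size_t n,
                            size_t offset, NullableInt *out)
{
    for (size_t i = 0; i < n; i++) {
        size_t remaining = offset;

        out[i].isnull = true;
        out[i].value = 0;
        for (size_t j = i + 1; j < n && remaining > 0; j++) {
            if (!in[j].isnull && --remaining == 0)
                out[i] = in[j];
        }
    }
}
```

On the test_table data with offset 2, the first function reproduces the
output above, while the second would give 3, 4, 5, 6, 6, 6, 6, 7 and then
two nulls.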
If the other reviewers are happy with your grammar changes then I'll merge
them into the patch. Alternatively, if departing from the standard is OK
then we could reorder the keywords so that a window function is like SELECT
lag(x,1) OVER RESPECT NULLS (ORDER BY y) - i.e. putting the respect /
ignore tokens after the OVER reserved keyword. Although non-standard it'd
make the grammar change trivial.
Also, I think someone mentioned this already, but what about
first_value() and last_value()? Shouldn't we do those at the same time?
I didn't include this functionality for the first / last value window
functions as their implementation is currently a bit different; they just
call WinGetFuncArgInFrame to pick out a single value. Making these
functions ignore nulls would involve changing the single lookup to a walk
through the tuples to find the first non-null value, and keeping track of
its index in a struct in the context. As this change is reasonably
orthogonal I was going to submit it as a separate patch.
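Roughly, the walk I have in mind looks like the following standalone C
sketch. It is not written against the window API: NullableInt and
first_non_null_in_frame are hypothetical names, and the frame is modelled
as an inclusive index range over an in-memory array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a nullable integer column value. */
typedef struct { bool isnull; int value; } NullableInt;

/*
 * Walk the frame [frame_start, frame_end] (inclusive) from its head and
 * return the index of the first non-null row, or -1 if every row in the
 * frame is null. A real implementation would cache this index in the
 * per-partition context instead of rescanning for each row.
 */
static long first_non_null_in_frame(const NullableInt *rows,
                                    size_t frame_start, size_t frame_end)
{
    for (size_t i = frame_start; i <= frame_end; i++) {
        if (!rows[i].isnull)
            return (long) i;
    }
    return -1;
}
```

last_value would be the mirror image, walking backwards from the frame end.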
Thanks -
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4c5af4b..89d28b2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12266,6 +12266,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12280,7 +12281,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12293,6 +12297,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12307,7 +12312,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12401,11 +12408,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5094226..0309a99 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -488,6 +488,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
opt_frame_clause frame_extent frame_bound
+ over_specification
%type <str> opt_existing_window_name
%type <boolean> opt_if_not_exists
@@ -543,7 +544,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -573,7 +574,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -607,6 +608,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
* creates these tokens when required.
*/
%token NULLS_FIRST NULLS_LAST WITH_TIME
+%token RESPECT_NULLS IGNORE_NULLS
/* Precedence: lowest to highest */
@@ -11752,7 +11754,8 @@ window_definition:
}
;
-over_clause: OVER window_specification
+over_specification:
+ OVER window_specification
{ $$ = $2; }
| OVER ColId
{
@@ -11767,6 +11770,18 @@ over_clause: OVER window_specification
n->location = @2;
$$ = n;
}
+ ;
+
+over_clause: over_specification
+ { $$ = $1; }
+ | RESPECT_NULLS over_specification
+ { $$ = $2; }
+ | IGNORE_NULLS over_specification
+ {
+ if($2)
+ $2->frameOptions |= FRAMEOPTION_IGNORE_NULLS;
+ $$ = $2;
+ }
| /*EMPTY*/
{ $$ = NULL; }
;
@@ -12740,6 +12755,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12827,6 +12843,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index b8ec790..1b2b2de 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -118,15 +118,7 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
cur_token = NULLS_LAST;
break;
default:
- /* save the lookahead token for next time */
- yyextra->lookahead_token = next_token;
- yyextra->lookahead_yylval = lvalp->core_yystype;
- yyextra->lookahead_yylloc = *llocp;
- yyextra->have_lookahead = true;
- /* and back up the output info to cur_token */
- lvalp->core_yystype = cur_yylval;
- *llocp = cur_yylloc;
- break;
+ goto restore_next_token;
}
break;
@@ -144,15 +136,38 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
cur_token = WITH_TIME;
break;
default:
- /* save the lookahead token for next time */
- yyextra->lookahead_token = next_token;
- yyextra->lookahead_yylval = lvalp->core_yystype;
- yyextra->lookahead_yylloc = *llocp;
- yyextra->have_lookahead = true;
- /* and back up the output info to cur_token */
- lvalp->core_yystype = cur_yylval;
- *llocp = cur_yylloc;
+ goto restore_next_token;
+ }
+ break;
+
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ default:
+ goto restore_next_token;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
break;
+ default:
+ goto restore_next_token;
}
break;
@@ -161,4 +176,16 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
}
return cur_token;
+
+restore_next_token:
+ /* save the lookahead token for next time */
+ yyextra->lookahead_token = next_token;
+ yyextra->lookahead_yylval = lvalp->core_yystype;
+ yyextra->lookahead_yylloc = *llocp;
+ yyextra->have_lookahead = true;
+ /* and back up the output info to cur_token */
+ lvalp->core_yystype = cur_yylval;
+ *llocp = cur_yylloc;
+
+ return cur_token;
}
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..8985149 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -25,6 +25,15 @@ typedef struct rank_context
} rank_context;
/*
+ * structure for IGNORE NULLS / RESPECT NULLS semantics
+ */
+typedef struct leadlag_context
+{
+ int64 last; /* last non-null result, initially 0 */
+ bool seen_one; /* true iff we can output the row in "last" now */
+} leadlag_context;
+
+/*
* ntile process information
*/
typedef struct
@@ -292,6 +301,15 @@ leadlag_common(FunctionCallInfo fcinfo,
Datum result;
bool isnull;
bool isout;
+ bool ignore_nulls;
+ leadlag_context* context;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but we can't move the mark
+ * beyond the last non-null tuple!
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,11 +323,15 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
+ offset,
WINDOW_SEEK_CURRENT,
- const_offset,
+ const_offset && !ignore_nulls,
&isnull, &isout);
if (isout)
@@ -322,6 +344,39 @@ leadlag_common(FunctionCallInfo fcinfo,
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
+ /*
+ * If the row's out of the partition we don't want to propagate the
+ * last non-null value, even under IGNORE NULLS - we'll just leave
+ * the default value (if there was one).
+ */
+ if (ignore_nulls && !isout)
+ {
+ /*
+ * We'll keep the last non-null value we've seen in our per-partition chunk
+ * of memory, so it gets cleaned up for us.
+ */
+ context = (leadlag_context *)
+ WinGetPartitionLocalMemory(winobj, sizeof(leadlag_context));
+ if (isnull)
+ {
+ if (context->seen_one)
+ {
+ /* restore the datum at the stashed index */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->last,
+ WINDOW_SEEK_HEAD,
+ const_offset, /* drag mark up after us */
+ &isnull, &isout);
+ }
+ }
+ else
+ {
+ /* work out which tuple we just loaded */
+ context->last = WinGetCurrentPosition(winobj) + offset;
+ context->seen_one = true;
+ }
+ }
+
if (isnull)
PG_RETURN_NULL();
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..59fc635 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000 /* lead/lag/nth */
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..2acf073 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/interfaces/ecpg/preproc/parse.pl b/src/interfaces/ecpg/preproc/parse.pl
index f4b51d6..fe5dcb3 100644
--- a/src/interfaces/ecpg/preproc/parse.pl
+++ b/src/interfaces/ecpg/preproc/parse.pl
@@ -45,6 +45,8 @@ my %replace_string = (
'WITH_TIME' => 'with time',
'NULLS_FIRST' => 'nulls first',
'NULLS_LAST' => 'nulls last',
+ 'RESPECT_NULLS' => 'respect nulls',
+ 'IGNORE_NULLS' => 'ignore nulls',
'TYPECAST' => '::',
'DOT_DOT' => '..',
'COLON_EQUALS' => ':=',);
diff --git a/src/interfaces/ecpg/preproc/parser.c b/src/interfaces/ecpg/preproc/parser.c
index 2ce9dd9..3780652 100644
--- a/src/interfaces/ecpg/preproc/parser.c
+++ b/src/interfaces/ecpg/preproc/parser.c
@@ -83,15 +83,7 @@ filtered_base_yylex(void)
cur_token = NULLS_LAST;
break;
default:
- /* save the lookahead token for next time */
- lookahead_token = next_token;
- lookahead_yylval = base_yylval;
- lookahead_yylloc = base_yylloc;
- have_lookahead = true;
- /* and back up the output info to cur_token */
- base_yylval = cur_yylval;
- base_yylloc = cur_yylloc;
- break;
+ goto restore_next_token;
}
break;
@@ -109,15 +101,38 @@ filtered_base_yylex(void)
cur_token = WITH_TIME;
break;
default:
- /* save the lookahead token for next time */
- lookahead_token = next_token;
- lookahead_yylval = base_yylval;
- lookahead_yylloc = base_yylloc;
- have_lookahead = true;
- /* and back up the output info to cur_token */
- base_yylval = cur_yylval;
- base_yylloc = cur_yylloc;
+ goto restore_next_token;
+ }
+ break;
+
+ /*
+ * Window functions can use RESPECT NULLS or IGNORE NULLS to
+ * modify their behaviour
+ */
+ case RESPECT:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = RESPECT_NULLS;
+ break;
+ default:
+ goto restore_next_token;
+ }
+ break;
+ case IGNORE:
+ cur_yylval = base_yylval;
+ cur_yylloc = base_yylloc;
+ next_token = base_yylex();
+ switch (next_token)
+ {
+ case NULLS_P:
+ cur_token = IGNORE_NULLS;
break;
+ default:
+ goto restore_next_token;
}
break;
@@ -126,4 +141,16 @@ filtered_base_yylex(void)
}
return cur_token;
+
+restore_next_token:
+ /* save the lookahead token for next time */
+ lookahead_token = next_token;
+ lookahead_yylval = base_yylval;
+ lookahead_yylloc = base_yylloc;
+ have_lookahead = true;
+ /* and back up the output info to cur_token */
+ base_yylval = cur_yylval;
+ base_yylloc = cur_yylloc;
+
+ return cur_token;
}
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..9a80158 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1022,135 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 09-22-2010
+ | 09-22-2010
+ | 11-17-2009
+ 11-17-2009 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ |
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 4
+ 4 | 4
+ | 4
+ | 5
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..6dba56e 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,35 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table;
+
+DROP TABLE test_table;
On Fri, Jun 21, 2013 at 6:29 PM, Troels Nielsen <bn.troels@gmail.com> wrote:
The grammar conflict appears to be because of ambiguities in:
1. table_ref (used exclusively in FROM clauses)
2. index_elem (used exclusively in INDEX creation statements).
Now, this doesn't seem to make much sense, as AFAICT window functions
are explicitly disallowed in these contexts (transformWindowFuncCall
will yield errors, and I can't really wrap my head around what a
window function call would mean there).
I therefore propose a simple rearrangement of the grammar,
syntactically disallowing window functions in the outer part of those
contexts (a_expr's inside can't and shouldn't be done much about)
which will allow both RESPECT and IGNORE to become unreserved
keywords, without doing any lexer hacking or abusing the grammar.
I reviewed this today and I think this is a very nice approach.
Thanks for working on it!
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
OK - I've attached another iteration of the patch with Troels' grammar
changes. I think the only issue remaining is what the standard says about
lead semantics. Thanks -
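To make the open question about lead semantics concrete, here is a minimal standalone sketch of what this iteration of the patch computes for IGNORE NULLS: when the row at the requested offset is null, the last non-null row fetched earlier in the partition is returned instead. The names carry_ctx and lead_carry_forward are hypothetical stand-ins for illustration (they are not part of the patch or of PostgreSQL), and plain arrays stand in for the partition's tuples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's leadlag_context. */
typedef struct
{
    int64_t last;       /* position of the last non-null row fetched */
    bool    seen_one;   /* true once "last" holds a valid position */
} carry_ctx;

/*
 * Simulate lead(vals, offset) IGNORE NULLS for row "cur" with the
 * carry-forward behaviour described above. Rows must be visited in
 * order, sharing one carry_ctx per partition.
 * Returns false when the result is SQL NULL, true with *out set otherwise.
 */
static bool
lead_carry_forward(const int *vals, const bool *nulls, int64_t nrows,
                   int64_t cur, int64_t offset, carry_ctx *ctx, int *out)
{
    int64_t target = cur + offset;

    assert(ctx != NULL && out != NULL);

    /* Target row is outside the partition: fall back to the default (NULL). */
    if (target < 0 || target >= nrows)
        return false;

    if (!nulls[target])
    {
        /* Remember which row we just loaded, then return its value. */
        ctx->last = target;
        ctx->seen_one = true;
        *out = vals[target];
        return true;
    }

    /* Target is null: reuse the stashed row if there is one. */
    if (!ctx->seen_one)
        return false;
    *out = vals[ctx->last];
    return true;
}
```

Run in row order over the val column from the new regression test (1, 2, 3, 4, NULL, NULL, NULL, 5, 6, 7) with an offset of 2, this sketch yields 3, 4, 4, 4, 4, 5, 6, 7, NULL, NULL, which matches the expected output in the patch's window.out.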
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4c5af4b..89d28b2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12266,6 +12266,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12280,7 +12281,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12293,6 +12297,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12307,7 +12312,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12401,11 +12408,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5094226..aae35d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -401,7 +402,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> columnDef columnOptions
%type <defelt> def_elem reloption_elem old_aggr_elem
%type <node> def_arg columnElem where_clause where_or_current_clause
- a_expr b_expr c_expr func_expr AexprConst indirection_el
+ a_expr b_expr c_expr AexprConst indirection_el
columnref in_expr having_clause func_table array_expr
ExclusionWhereClause
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -481,6 +482,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <ival> document_or_content
%type <boolean> xml_whitespace_option
+%type <node> func_application func_expr_common_subexpr
+%type <node> func_expr func_expr_windowless
%type <node> common_table_expr
%type <with> with_clause opt_with_clause
%type <list> cte_list
@@ -543,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -573,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -6132,7 +6135,7 @@ index_elem: ColId opt_collate opt_class opt_asc_desc opt_nulls_order
$$->ordering = $4;
$$->nulls_ordering = $5;
}
- | func_expr opt_collate opt_class opt_asc_desc opt_nulls_order
+ | func_expr_windowless opt_collate opt_class opt_asc_desc opt_nulls_order
{
$$ = makeNode(IndexElem);
$$->name = NULL;
@@ -9894,11 +9897,9 @@ relation_expr_opt_alias: relation_expr %prec UMINUS
}
;
-
-func_table: func_expr { $$ = $1; }
+func_table: func_expr_windowless { $$ = $1; }
;
-
where_clause:
WHERE a_expr { $$ = $2; }
| /*EMPTY*/ { $$ = NULL; }
@@ -11079,15 +11080,7 @@ c_expr: columnref { $$ = $1; }
}
;
-/*
- * func_expr is split out from c_expr just so that we have a classification
- * for "everything that is a function call or looks like one". This isn't
- * very important, but it saves us having to document which variants are
- * legal in the backwards-compatible functional-index syntax for CREATE INDEX.
- * (Note that many of the special SQL functions wouldn't actually make any
- * sense as functional index entries, but we ignore that consideration here.)
- */
-func_expr: func_name '(' ')' over_clause
+func_application: func_name '(' ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11096,11 +11089,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $4;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list ')' over_clause
+ | func_name '(' func_arg_list ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11109,11 +11102,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $5;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' VARIADIC func_arg_expr ')' over_clause
+ | func_name '(' VARIADIC func_arg_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11122,11 +11115,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = TRUE;
- n->over = $6;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list ',' VARIADIC func_arg_expr ')' over_clause
+ | func_name '(' func_arg_list ',' VARIADIC func_arg_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11135,11 +11128,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = TRUE;
- n->over = $8;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list sort_clause ')' over_clause
+ | func_name '(' func_arg_list sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11148,11 +11141,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $6;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' ALL func_arg_list opt_sort_clause ')' over_clause
+ | func_name '(' ALL func_arg_list opt_sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11165,11 +11158,11 @@ func_expr: func_name '(' ')' over_clause
* for that in FuncCall at the moment.
*/
n->func_variadic = FALSE;
- n->over = $7;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' DISTINCT func_arg_list opt_sort_clause ')' over_clause
+ | func_name '(' DISTINCT func_arg_list opt_sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11178,11 +11171,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = TRUE;
n->func_variadic = FALSE;
- n->over = $7;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' '*' ')' over_clause
+ | func_name '(' '*' ')'
{
/*
* We consider AGGREGATE(*) to invoke a parameterless
@@ -11201,11 +11194,48 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = TRUE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $5;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | COLLATION FOR '(' a_expr ')'
+ ;
+
+
+/*
+ * func_expr and its cousin func_expr_windowless are split out from c_expr just
+ * so that we have classifications for "everything that is a function call or
+ * looks like one". This isn't very important, but it saves us having to document
+ * which variants are legal in the backwards-compatible functional-index syntax
+ * for CREATE INDEX.
+ * (Note that many of the special SQL functions wouldn't actually make any
+ * sense as functional index entries, but we ignore that consideration here.)
+ */
+func_expr: func_application over_clause
+ {
+ FuncCall *n = (FuncCall*)$1;
+ n->over = $2;
+ $$ = (Node*)n;
+ }
+ | func_expr_common_subexpr
+ { $$ = $1; }
+ ;
+
+/*
+ * As func_expr but does not accept WINDOW functions directly (they
+ * can still be contained in arguments for functions etc.)
+ * Use this where window expressions are not allowed (e.g. in
+ * CREATE INDEX), so as to simplify the grammar.
+ */
+func_expr_windowless:
+ func_application { $$ = $1; }
+ | func_expr_common_subexpr { $$ = $1; }
+ ;
+
+/*
+ * Special expression
+ */
+func_expr_common_subexpr:
+ COLLATION FOR '(' a_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = SystemFuncName("pg_collation_for");
@@ -11752,16 +11782,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12740,6 +12779,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12827,6 +12867,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..8985149 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -25,6 +25,15 @@ typedef struct rank_context
} rank_context;
/*
+ * structure for IGNORE NULLS / RESPECT NULLS semantics
+ */
+typedef struct leadlag_context
+{
+ int64 last; /* position of the last non-null row, initially 0 */
+ bool seen_one; /* true iff we can output the row in "last" now */
+} leadlag_context;
+
+/*
* ntile process information
*/
typedef struct
@@ -292,6 +301,15 @@ leadlag_common(FunctionCallInfo fcinfo,
Datum result;
bool isnull;
bool isout;
+ bool ignore_nulls;
+ leadlag_context* context;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but we can't move the mark
+ * beyond the last non-null tuple!
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,11 +323,15 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
+ offset,
WINDOW_SEEK_CURRENT,
- const_offset,
+ const_offset && !ignore_nulls,
&isnull, &isout);
if (isout)
@@ -322,6 +344,39 @@ leadlag_common(FunctionCallInfo fcinfo,
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
+ /*
+ * If we're RESPECTing NULLS, or the row is out of the partition, we
+ * don't want to propagate the last non-null value - we'll just leave
+ * the default value (if there was one).
+ */
+ if (ignore_nulls && !isout)
+ {
+ /*
+ * We'll keep the last non-null value we've seen in our per-partition chunk
+ * of memory, so it gets cleaned up for us.
+ */
+ context = (leadlag_context *)
+ WinGetPartitionLocalMemory(winobj, sizeof(leadlag_context));
+ if (isnull)
+ {
+ if (context->seen_one)
+ {
+ /* restore the datum at the stashed index */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->last,
+ WINDOW_SEEK_HEAD,
+ const_offset, /* drag mark up after us */
+ &isnull, &isout);
+ }
+ }
+ else
+ {
+ /* work out which tuple we just loaded */
+ context->last = WinGetCurrentPosition(winobj) + offset;
+ context->seen_one = true;
+ }
+ }
+
if (isnull)
PG_RETURN_NULL();
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..71b44d5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..2acf073 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..c6f72f8 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -989,9 +991,9 @@ ERROR: window functions are not allowed in GROUP BY
LINE 1: SELECT rank() OVER (ORDER BY 1), count(*) FROM empsalary GRO...
^
SELECT * FROM rank() OVER (ORDER BY random());
-ERROR: window functions are not allowed in functions in FROM
+ERROR: syntax error at or near "OVER"
LINE 1: SELECT * FROM rank() OVER (ORDER BY random());
- ^
+ ^
DELETE FROM empsalary WHERE (rank() OVER (ORDER BY random())) > 10;
ERROR: window functions are not allowed in WHERE
LINE 1: DELETE FROM empsalary WHERE (rank() OVER (ORDER BY random())...
@@ -1020,5 +1022,135 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 09-22-2010
+ | 09-22-2010
+ | 11-17-2009
+ 11-17-2009 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ |
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 4
+ 4 | 4
+ | 4
+ | 5
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..6dba56e 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,35 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table;
+
+DROP TABLE test_table;
The result of the current patch using lead
Actually, I think I agree with you (and, FWIW, so does Oracle:
http://docs.oracle.com/cd/E11882_01/server.112/e25554/analysis.htm#autoId18).
I've refactored the window function's implementation so that (e.g.) lead(5)
means the 5th non-null value in front of the current row (the previous
implementation returned the last non-null value seen if the 5th row in
front was null). These semantics are slower, as they require the function to
scan through the tuples, discarding null ones. I've made the
implementation use a bitmap in the partition context to cache whether or
not a given tuple produces a null. This seems correct (it passes the
regression tests), but as it stores row offsets (which are int64s) I was
careful not to use bitmap methods that use ints to refer to set members.
I've added more explanation in the code's comments. Thanks -
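The revised semantics can be sketched outside the backend as follows: scan forward from the current row, skip nulls, and stop at the offset-th non-null value, caching each row's null-ness in bitmaps indexed by int64 positions so that no row is evaluated twice. This is only an illustration under stated assumptions: null_cache, row_is_null and lead_nth_non_null are hypothetical names, the bitmaps are plain uint64_t arrays rather than PostgreSQL's Bitmapset, and the evals counter stands in for the cost of evaluating the function argument on a row.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define WORD_BITS 64

/* Hypothetical per-partition cache: two bits of state per row. */
typedef struct
{
    uint64_t *known;    /* bit set: row has already been evaluated */
    uint64_t *isnull;   /* bit set: row evaluated to NULL */
    int64_t   nrows;
} null_cache;

/* Bit helpers take int64 positions, so large partitions are addressable. */
static bool
get_bit(const uint64_t *w, int64_t pos)
{
    return (w[pos / WORD_BITS] >> (pos % WORD_BITS)) & 1;
}

static void
set_bit(uint64_t *w, int64_t pos)
{
    w[pos / WORD_BITS] |= (uint64_t) 1 << (pos % WORD_BITS);
}

static null_cache *
cache_create(int64_t nrows)
{
    size_t      nwords = (size_t) ((nrows + WORD_BITS - 1) / WORD_BITS);
    null_cache *c = malloc(sizeof *c);

    if (c == NULL)
        return NULL;
    c->known = calloc(nwords ? nwords : 1, sizeof(uint64_t));
    c->isnull = calloc(nwords ? nwords : 1, sizeof(uint64_t));
    c->nrows = nrows;
    if (c->known == NULL || c->isnull == NULL)
    {
        free(c->known);
        free(c->isnull);
        free(c);
        return NULL;
    }
    return c;
}

static void
cache_free(null_cache *c)
{
    free(c->known);
    free(c->isnull);
    free(c);
}

/* Look up a row's null-ness, evaluating it at most once. */
static bool
row_is_null(null_cache *c, const bool *nulls, int64_t pos, long *evals)
{
    if (!get_bit(c->known, pos))
    {
        (*evals)++;             /* stands in for evaluating the argument */
        set_bit(c->known, pos);
        if (nulls[pos])
            set_bit(c->isnull, pos);
    }
    return get_bit(c->isnull, pos);
}

/*
 * lead(vals, offset) IGNORE NULLS for row "cur": the offset-th non-null
 * value after the current row. Returns false when the result is SQL NULL.
 */
static bool
lead_nth_non_null(const int *vals, const bool *nulls, null_cache *c,
                  int64_t cur, int64_t offset, long *evals, int *out)
{
    int64_t remaining = offset;

    assert(offset > 0);
    for (int64_t pos = cur + 1; pos < c->nrows; pos++)
    {
        if (row_is_null(c, nulls, pos, evals))
            continue;           /* nulls do not count towards the offset */
        if (--remaining == 0)
        {
            *out = vals[pos];
            return true;
        }
    }
    return false;               /* ran off the end of the partition */
}
```

On the column 1, 2, 3, 4, NULL, NULL, NULL, 5, 6, 7 with an offset of 2, the sketch returns 3, 4, 5, 6, 6, 6, 6, 7, NULL, NULL, and with the cache each of rows 1 to 9 is evaluated only once across the whole pass.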
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4c5af4b..89d28b2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12266,6 +12266,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12280,7 +12281,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12293,6 +12297,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [respect nulls]|[ignore nulls]
</function>
</entry>
<entry>
@@ -12307,7 +12312,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <literal>IGNORE NULLS</> is specified and a previous evaluation in the
+ current window has returned a non-null value then that value will be
+ returned instead.
</entry>
</row>
@@ -12401,11 +12408,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..70e84d1 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -26,9 +26,6 @@
#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)
#define BITNUM(x) ((x) % BITS_PER_BITMAPWORD)
-#define BITMAPSET_SIZE(nwords) \
- (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
-
/*----------
* This is a well-known cute trick for isolating the rightmost one-bit
* in a word. It assumes two's complement arithmetic. Consider any
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5094226..aae35d8 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -401,7 +402,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <node> columnDef columnOptions
%type <defelt> def_elem reloption_elem old_aggr_elem
%type <node> def_arg columnElem where_clause where_or_current_clause
- a_expr b_expr c_expr func_expr AexprConst indirection_el
+ a_expr b_expr c_expr AexprConst indirection_el
columnref in_expr having_clause func_table array_expr
ExclusionWhereClause
%type <list> ExclusionConstraintList ExclusionConstraintElem
@@ -481,6 +482,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <ival> document_or_content
%type <boolean> xml_whitespace_option
+%type <node> func_application func_expr_common_subexpr
+%type <node> func_expr func_expr_windowless
%type <node> common_table_expr
%type <with> with_clause opt_with_clause
%type <list> cte_list
@@ -543,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -573,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -6132,7 +6135,7 @@ index_elem: ColId opt_collate opt_class opt_asc_desc opt_nulls_order
$$->ordering = $4;
$$->nulls_ordering = $5;
}
- | func_expr opt_collate opt_class opt_asc_desc opt_nulls_order
+ | func_expr_windowless opt_collate opt_class opt_asc_desc opt_nulls_order
{
$$ = makeNode(IndexElem);
$$->name = NULL;
@@ -9894,11 +9897,9 @@ relation_expr_opt_alias: relation_expr %prec UMINUS
}
;
-
-func_table: func_expr { $$ = $1; }
+func_table: func_expr_windowless { $$ = $1; }
;
-
where_clause:
WHERE a_expr { $$ = $2; }
| /*EMPTY*/ { $$ = NULL; }
@@ -11079,15 +11080,7 @@ c_expr: columnref { $$ = $1; }
}
;
-/*
- * func_expr is split out from c_expr just so that we have a classification
- * for "everything that is a function call or looks like one". This isn't
- * very important, but it saves us having to document which variants are
- * legal in the backwards-compatible functional-index syntax for CREATE INDEX.
- * (Note that many of the special SQL functions wouldn't actually make any
- * sense as functional index entries, but we ignore that consideration here.)
- */
-func_expr: func_name '(' ')' over_clause
+func_application: func_name '(' ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11096,11 +11089,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $4;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list ')' over_clause
+ | func_name '(' func_arg_list ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11109,11 +11102,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $5;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' VARIADIC func_arg_expr ')' over_clause
+ | func_name '(' VARIADIC func_arg_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11122,11 +11115,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = TRUE;
- n->over = $6;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list ',' VARIADIC func_arg_expr ')' over_clause
+ | func_name '(' func_arg_list ',' VARIADIC func_arg_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11135,11 +11128,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = TRUE;
- n->over = $8;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' func_arg_list sort_clause ')' over_clause
+ | func_name '(' func_arg_list sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11148,11 +11141,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $6;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' ALL func_arg_list opt_sort_clause ')' over_clause
+ | func_name '(' ALL func_arg_list opt_sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11165,11 +11158,11 @@ func_expr: func_name '(' ')' over_clause
* for that in FuncCall at the moment.
*/
n->func_variadic = FALSE;
- n->over = $7;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' DISTINCT func_arg_list opt_sort_clause ')' over_clause
+ | func_name '(' DISTINCT func_arg_list opt_sort_clause ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = $1;
@@ -11178,11 +11171,11 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = FALSE;
n->agg_distinct = TRUE;
n->func_variadic = FALSE;
- n->over = $7;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | func_name '(' '*' ')' over_clause
+ | func_name '(' '*' ')'
{
/*
* We consider AGGREGATE(*) to invoke a parameterless
@@ -11201,11 +11194,48 @@ func_expr: func_name '(' ')' over_clause
n->agg_star = TRUE;
n->agg_distinct = FALSE;
n->func_variadic = FALSE;
- n->over = $5;
+ n->over = NULL;
n->location = @1;
$$ = (Node *)n;
}
- | COLLATION FOR '(' a_expr ')'
+ ;
+
+
+/*
+ * func_expr and its cousin func_expr_windowless are split out from c_expr just
+ * so that we have classifications for "everything that is a function call or
+ * looks like one". This isn't very important, but it saves us having to document
+ * which variants are legal in the backwards-compatible functional-index syntax
+ * for CREATE INDEX.
+ * (Note that many of the special SQL functions wouldn't actually make any
+ * sense as functional index entries, but we ignore that consideration here.)
+ */
+func_expr: func_application over_clause
+ {
+ FuncCall *n = (FuncCall*)$1;
+ n->over = $2;
+ $$ = (Node*)n;
+ }
+ | func_expr_common_subexpr
+ { $$ = $1; }
+ ;
+
+/*
+ * As func_expr, but does not accept WINDOW functions directly (they
+ * can still be contained in arguments for functions etc.).
+ * Use this when window expressions are not allowed, to simplify
+ * the grammar (e.g. in CREATE INDEX).
+ */
+func_expr_windowless:
+ func_application { $$ = $1; }
+ | func_expr_common_subexpr { $$ = $1; }
+ ;
+
+/*
+ * Special expression
+ */
+func_expr_common_subexpr:
+ COLLATION FOR '(' a_expr ')'
{
FuncCall *n = makeNode(FuncCall);
n->funcname = SystemFuncName("pg_collation_for");
@@ -11752,16 +11782,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12740,6 +12779,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12827,6 +12867,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..b14491c 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -25,6 +26,13 @@ typedef struct rank_context
} rank_context;
/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+ #define SET_WITHOUT_RESIZING(b, i) b->words[(i) / BITS_PER_BITMAPWORD] |= (bitmapword) 1 << (i) % BITS_PER_BITMAPWORD
+
+/*
* ntile process information
*/
typedef struct
@@ -280,7 +288,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +299,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,12 +324,134 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls)
+ {
+ int64 bits_needed, scanning, words_needed, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ */
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+ words_needed = (bits_needed / BITS_PER_BITMAPWORD) + 1;
+
+ null_values = (Bitmapset *) WinGetPartitionLocalMemory(
+ winobj,
+ BITMAPSET_SIZE(words_needed));
+ Assert(null_values);
+
+ /*
+ * We use the sign of offset (or forward, if offset is 0) instead of just forward,
+ * as the offset might be in the opposite direction to the way we're scanning.
+ * We'll then force offset to be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /* update our bitmap with this result */
+ SET_WITHOUT_RESIZING(null_values, HAVESCANNED_INDEX(scanning));
+ if (isnull)
+ {
+ SET_WITHOUT_RESIZING(null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * A | lag(A, 1)
+ * 1 |
+ * 2 | 1
+ * | ?
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..710000f 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -34,7 +34,8 @@ typedef struct Bitmapset
int nwords; /* number of words in array */
bitmapword words[1]; /* really [nwords] */
} Bitmapset; /* VARIABLE LENGTH STRUCT */
-
+#define BITMAPSET_SIZE(nwords) \
+ (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
/* result of bms_subset_compare */
typedef enum
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..71b44d5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 68a13b7..2acf073 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 752c7b4..f136a56 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -989,9 +991,9 @@ ERROR: window functions are not allowed in GROUP BY
LINE 1: SELECT rank() OVER (ORDER BY 1), count(*) FROM empsalary GRO...
^
SELECT * FROM rank() OVER (ORDER BY random());
-ERROR: window functions are not allowed in functions in FROM
+ERROR: syntax error at or near "OVER"
LINE 1: SELECT * FROM rank() OVER (ORDER BY random());
- ^
+ ^
DELETE FROM empsalary WHERE (rank() OVER (ORDER BY random())) > 10;
ERROR: window functions are not allowed in WHERE
LINE 1: DELETE FROM empsalary WHERE (rank() OVER (ORDER BY random())...
@@ -1020,5 +1022,135 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..92166d7 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,35 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+
+DROP TABLE test_table;
On Thu, Jun 27, 2013 at 8:52 PM, Nicholas White <n.j.white@gmail.com> wrote:
The result of the current patch using lead
Actually, I think I agree with you (and, FWIW, so does Oracle:
http://docs.oracle.com/cd/E11882_01/server.112/e25554/analysis.htm#autoId18).
I've refactored the window function's implementation so that (e.g.) lead(5)
means the 5th non-null value in front of the current row (the previous
implementation returned the last non-null value if the row 5 rows in front
was null). These semantics are slower, as they require the function to scan
through the tuples, skipping the null ones. I've made the implementation
use a bitmap in the partition context to cache whether or not a given tuple
produces a null. This seems correct (it passes the regression tests), but as
it stores row offsets (which are int64s) I was careful not to use bitmap
methods that use ints to refer to set members. I've added more explanation
in the code's comments. Thanks -
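To make the semantics described above concrete, here is a minimal standalone sketch in C. It is not the patch's code and does not use the PostgreSQL window API: the Partition struct, the cache helpers and lead_lag_ignore_nulls are hypothetical names invented for illustration, and the sketch assumes offset >= 1. It shows the two ideas from the message: lead/lag counts only non-null rows, and a two-bits-per-row bitmap (indexed with 64-bit positions) remembers which rows have already been inspected.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Two cache bits per row, kept adjacent: "is null" and "already inspected". */
#define CACHE_BITS_PER_WORD 64
#define ISNULL_BIT(i)  (2 * (i))
#define SCANNED_BIT(i) (2 * (i) + 1)

/* Hypothetical stand-in for one window partition (not a PostgreSQL type). */
typedef struct Partition
{
    const int  *values;   /* column values; ignored where nulls[i] is true */
    const bool *nulls;    /* nulls[i] is true when row i holds NULL */
    int64_t     nrows;
    uint64_t   *cache;    /* 2 * nrows bits, zero-initialised */
    int64_t     fetches;  /* counts simulated "expensive" row accesses */
} Partition;

static bool
cache_get(const uint64_t *words, int64_t bit)
{
    return (words[bit / CACHE_BITS_PER_WORD] >> (bit % CACHE_BITS_PER_WORD)) & 1;
}

static void
cache_set(uint64_t *words, int64_t bit)
{
    words[bit / CACHE_BITS_PER_WORD] |= (uint64_t) 1 << (bit % CACHE_BITS_PER_WORD);
}

/* Allocates the cache; p.cache is NULL if the allocation fails. */
Partition
partition_make(const int *values, const bool *nulls, int64_t nrows)
{
    Partition p = {values, nulls, nrows, NULL, 0};
    size_t    words = (size_t) ((2 * nrows) / CACHE_BITS_PER_WORD + 1);

    p.cache = calloc(words, sizeof(uint64_t));
    return p;
}

/* Null test for row i; touches the row only on the first request. */
bool
row_is_null(Partition *p, int64_t i)
{
    if (!cache_get(p->cache, SCANNED_BIT(i)))
    {
        p->fetches++;               /* stands in for fetching the tuple */
        cache_set(p->cache, SCANNED_BIT(i));
        if (p->nulls[i])
            cache_set(p->cache, ISNULL_BIT(i));
    }
    return cache_get(p->cache, ISNULL_BIT(i));
}

/*
 * Finds the offset-th non-null row after (forward) or before (!forward) row
 * `current`. Returns false when the scan leaves the partition first.
 * This sketch only handles offset >= 1.
 */
bool
lead_lag_ignore_nulls(Partition *p, int64_t current, int64_t offset,
                      bool forward, int *out)
{
    int64_t step = forward ? 1 : -1;

    if (offset < 1)
        return false;

    for (int64_t pos = current + step; pos >= 0 && pos < p->nrows; pos += step)
    {
        if (row_is_null(p, pos))
            continue;               /* null rows do not count towards offset */
        if (--offset == 0)
        {
            *out = p->values[pos];
            return true;
        }
    }
    return false;
}
```

With the data from the test_table regression case (1, 2, 3, 4, NULL, NULL, NULL, 5, 6, 7), calling lead_lag_ignore_nulls with offset 2 from the first row yields 3, and from any of the NULL rows yields 6, which matches the expected output in the patch. Repeating a call does not increase the fetches counter, because the null status of every row it scans is already in the bitmap.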
The documentation you've added reads kind of funny to me:
+ [respect nulls]|[ignore nulls]
Wouldn't we normally write that as [ [ RESPECT | IGNORE ] NULLS ] ?
I've committed the changes from Troels to avoid the grammar conflicts,
and I also took the opportunity to make OVER unreserved, as allowed by
his refactoring and per related discussion on other threads. This
patch will need to be rebased over those changes (sorry), but
hopefully it'll help the review of this patch focus in on the issues
that are specific to RESPECT/IGNORE NULLS.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas wrote:
On Thu, Jun 27, 2013 at 8:52 PM, Nicholas White <n.j.white@gmail.com> wrote:
The documentation you've added reads kind of funny to me:
+ [respect nulls]|[ignore nulls]
Wouldn't we normally write that as [ [ RESPECT | IGNORE ] NULLS ] ?
I think it should be
[ { RESPECT | IGNORE } NULLS ]
One of the leading keywords must be present.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 28, 2013 at 11:41 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Robert Haas wrote:
On Thu, Jun 27, 2013 at 8:52 PM, Nicholas White <n.j.white@gmail.com> wrote:
The documentation you've added reads kind of funny to me:
+ [respect nulls]|[ignore nulls]
Wouldn't we normally write that as [ [ RESPECT | IGNORE ] NULLS ] ?
I think it should be
[ { RESPECT | IGNORE } NULLS ]
One of the leading keywords must be present.
Oh, yeah. What he said.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
This patch will need to be rebased over those changes
See attached -
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c009d8..9f96fe3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12275,6 +12275,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12289,7 +12290,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12302,6 +12305,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12316,7 +12320,8 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12410,11 +12415,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..70e84d1 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -26,9 +26,6 @@
#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)
#define BITNUM(x) ((x) % BITS_PER_BITMAPWORD)
-#define BITMAPSET_SIZE(nwords) \
- (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
-
/*----------
* This is a well-known cute trick for isolating the rightmost one-bit
* in a word. It assumes two's complement arithmetic. Consider any
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c41f1b5..917e233 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -545,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -575,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11782,16 +11783,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12770,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12858,6 +12869,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..b14491c 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -25,6 +26,13 @@ typedef struct rank_context
} rank_context;
/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+ #define SET_WITHOUT_RESIZING(b, i) b->words[(i) / BITS_PER_BITMAPWORD] |= (bitmapword) 1 << (i) % BITS_PER_BITMAPWORD
+
+/*
* ntile process information
*/
typedef struct
@@ -280,7 +288,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +299,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,12 +324,134 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls)
+ {
+ int64 bits_needed, scanning, words_needed, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ */
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+ words_needed = (bits_needed / BITS_PER_BITMAPWORD) + 1;
+
+ null_values = (Bitmapset *) WinGetPartitionLocalMemory(
+ winobj,
+ BITMAPSET_SIZE(words_needed));
+ Assert(null_values);
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /* update our bitmap with this result */
+ SET_WITHOUT_RESIZING(null_values, HAVESCANNED_INDEX(scanning));
+ if (isnull)
+ {
+ SET_WITHOUT_RESIZING(null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * A | lag(A, 1)
+ * 1 |
+ * 2 | 1
+ * | ?
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..710000f 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -34,7 +34,8 @@ typedef struct Bitmapset
int nwords; /* number of words in array */
bitmapword words[1]; /* really [nwords] */
} Bitmapset; /* VARIABLE LENGTH STRUCT */
-
+#define BITMAPSET_SIZE(nwords) \
+ (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
/* result of bms_subset_compare */
typedef enum
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..71b44d5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index b3d72a9..dd7396e 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ecc1c2c..473c7f3 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1022,135 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..92166d7 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,35 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+
+DROP TABLE test_table;
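For readers who want to check the expected regression output by hand, the IGNORE NULLS scan in the patch's leadlag_common can be modelled outside PostgreSQL. The following is only a sketch under stated assumptions: NullableInt and leadlag_ignore_nulls are hypothetical names, a plain array stands in for the partition, and the bitmap cache and the WinGetFuncArgInPartition calls are left out. It mirrors the loop as posted, including the extra decrement when the current row is null.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one row's argument value; not a PostgreSQL type. */
typedef struct
{
    int  value;
    bool isnull;
} NullableInt;

/*
 * Model of the IGNORE NULLS scan (assumed simplification of the patch).
 * Returns true and stores the value in *out when a row is found; returns
 * false when the scan leaves the partition, in which case the real
 * function would return NULL or the default argument.
 */
static bool
leadlag_ignore_nulls(const NullableInt *rows, long nrows, long current,
                     int offset, bool forward, int *out)
{
    bool scan_forward;
    long scanning;

    assert(current >= 0 && current < nrows);

    /* lag() is a lead() with the sign of the offset flipped */
    if (!forward)
        offset = -offset;

    /* the scan direction follows the signed offset, then we count down */
    scan_forward = (offset == 0) ? forward : (offset > 0);
    if (offset < 0)
        offset = -offset;

    for (scanning = current;; scanning += scan_forward ? 1 : -1)
    {
        if (scanning < 0 || scanning >= nrows)
            return false;       /* ran off the partition */

        if (!rows[scanning].isnull)
        {
            if (offset == 0)
            {
                *out = rows[scanning].value;
                return true;
            }
            --offset;           /* one non-null row closer to the target */
        }
        else if (scanning == current)
            --offset;           /* a null current row still counts as a step */
    }
}
```

Run against the ten values used in the last test case (1, 2, 3, 4, three nulls, 5, 6, 7) with an offset of 2, this model gives the same results as the expected output above for the rows checked by hand.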
On Fri, Jun 28, 2013 at 12:29 PM, Nicholas White <n.j.white@gmail.com> wrote:
This patch will need to be rebased over those changes
See attached -
Thanks. But I see a few issues...
+ [respect nulls]|[ignore nulls]
You fixed one of these but missed the other.
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
This looks like you dropped a line during cut-and-paste.
+ null_values = (Bitmapset *) WinGetPartitionLocalMemory(
+ winobj,
+ BITMAPSET_SIZE(words_needed));
+ Assert(null_values);
This certainly seems ugly - isn't there a way to accomplish this
without having to violate the Bitmapset abstraction?
+ * A slight edge case. Consider:
+ *
+ * A | lag(A, 1)
+ * 1 |
+ * 2 | 1
+ * | ?
pgindent will reformat this into oblivion. Use ----- markers around
the comment block as done elsewhere in the code where this is an
issue, or don't use ASCII art.
I haven't really reviewed the windowing-related code in depth; I
thought Jeff might jump back in for that part of it. Jeff, is that
something you're planning to do?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
I've fixed the problems you mentioned (see attached) - sorry, I was a bit
careless with the docs.
+ null_values = (Bitmapset *) WinGetPartitionLocalMemory(
+ winobj,
+ BITMAPSET_SIZE(words_needed));
+ Assert(null_values);
This certainly seems ugly - isn't there a way to accomplish this
without having to violate the Bitmapset abstraction?
Indeed, it's ugly. I've revised it slightly to:
null_values = (Bitmapset *) WinGetPartitionLocalMemory(
winobj,
BITMAPSET_SIZE(words_needed));
null_values->nwords = (int) words_needed;
...which gives a proper bitmap. It's hard to break this into a factory
method like bms_make_singleton as I'm getting the (zeroed) partition local
memory from one call, then forcing a correct bitmap's structure on it.
Maybe bitmapset.h needs a bms_initialise(void *, int num_words) factory
method? You'd still have to use the BITMAPSET_SIZE macro to get the correct
amount of memory for the void*. Maybe the factory method could take a
function pointer that would allocate memory of the given size (e.g.
Bitmapset* initialize(void* (allocator)(Size), int num_words) ) - this
means I could still use the partition's local memory.
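The two factory shapes floated above could look roughly like this. It is a sketch only: bms_initialise, bms_create_with and demo_zeroed_alloc are hypothetical names that do not exist in bitmapset.h, the Bitmapset type here is a local stand-in using a C99 flexible array member rather than the real header, and calloc stands in for partition-local memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-ins mirroring the shape of nodes/bitmapset.h; not the real header. */
typedef uint32_t bitmapword;

typedef struct Bitmapset
{
    int        nwords;          /* number of words in array */
    bitmapword words[];         /* flexible array, really [nwords] */
} Bitmapset;

#define BITMAPSET_SIZE(nwords) \
    (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))

/*
 * Hypothetical: impose an empty bitmap of nwords words on caller-supplied
 * memory, which must be at least BITMAPSET_SIZE(nwords) bytes.
 */
static Bitmapset *
bms_initialise(void *mem, int nwords)
{
    Bitmapset *result = (Bitmapset *) mem;

    assert(mem != NULL && nwords > 0);
    memset(mem, 0, BITMAPSET_SIZE(nwords));  /* defensive; input may already be zeroed */
    result->nwords = nwords;
    return result;
}

/*
 * Hypothetical: the function-pointer variant. The caller chooses where the
 * memory comes from; the bitmap code chooses how much to ask for.
 */
static Bitmapset *
bms_create_with(void *(*allocator) (size_t), int nwords)
{
    void *mem = allocator(BITMAPSET_SIZE(nwords));

    if (mem == NULL)
        return NULL;
    return bms_initialise(mem, nwords);
}

/* Demo allocator standing in for partition-local memory. */
static void *
demo_zeroed_alloc(size_t size)
{
    return calloc(1, size);
}
```

With the function-pointer form the caller never touches BITMAPSET_SIZE, which is the part of the abstraction the review objected to exposing. One wrinkle for the real case: WinGetPartitionLocalMemory also takes the window object, so a plain (Size) callback would need some way to carry that context.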
I don't think the solution would be tidier if I re-instated the
leadlag_context struct with a single Bitmapset member. Other Bitmapset
usage seems to just call bms_make_singleton then bms_add_member over and
over again - which afaict will call palloc every BITS_PER_BITMAPWORD calls,
which is not really what I want.
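To make the fixed-size alternative concrete, here is a rough standalone sketch of the two-bits-per-row cache, sized once up front and never grown. The index macros and the sizing formula follow the patch; cache_record, cache_lookup, set_bit, test_bit and words_needed are hypothetical helper names, and a bare word array replaces the Bitmapset so the example compiles without PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t bitmapword;
#define BITS_PER_BITMAPWORD 32

/* Two adjacent bits per row, as in the patch. */
#define ISNULL_INDEX(i)      (2 * (i))
#define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)

/* Words required for nrows rows, using the patch's sizing formula. */
static size_t
words_needed(int64_t nrows)
{
    return (size_t) ((2 * nrows) / BITS_PER_BITMAPWORD + 1);
}

/* Set one bit in place; the array is assumed to be large enough already. */
static void
set_bit(bitmapword *words, int64_t bit)
{
    words[bit / BITS_PER_BITMAPWORD] |=
        (bitmapword) 1 << (bit % BITS_PER_BITMAPWORD);
}

static bool
test_bit(const bitmapword *words, int64_t bit)
{
    return (words[bit / BITS_PER_BITMAPWORD] >>
            (bit % BITS_PER_BITMAPWORD)) & 1;
}

/* Remember that a row has been fetched and whether its value was null. */
static void
cache_record(bitmapword *words, int64_t row, bool isnull)
{
    set_bit(words, HAVESCANNED_INDEX(row));
    if (isnull)
        set_bit(words, ISNULL_INDEX(row));
}

/* Returns false if the row has not been fetched yet; otherwise fills *isnull. */
static bool
cache_lookup(const bitmapword *words, int64_t row, bool *isnull)
{
    if (!test_bit(words, HAVESCANNED_INDEX(row)))
        return false;
    *isnull = test_bit(words, ISNULL_INDEX(row));
    return true;
}
```

Because the array is sized from the partition row count before the first lookup, recording a row is a single OR into an existing word and no allocation happens during the scan.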
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c009d8..740e713 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12275,6 +12275,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12289,7 +12290,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12302,6 +12305,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12316,7 +12320,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12410,11 +12416,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..70e84d1 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -26,9 +26,6 @@
#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)
#define BITNUM(x) ((x) % BITS_PER_BITMAPWORD)
-#define BITMAPSET_SIZE(nwords) \
- (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
-
/*----------
* This is a well-known cute trick for isolating the rightmost one-bit
* in a word. It assumes two's complement arithmetic. Consider any
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c41f1b5..917e233 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -545,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -575,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11782,16 +11783,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12770,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12858,6 +12869,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..12cab3c 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -25,6 +26,13 @@ typedef struct rank_context
} rank_context;
/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+ #define SET_WITHOUT_RESIZING(b, i) b->words[(i) / BITS_PER_BITMAPWORD] |= (bitmapword) 1 << (i) % BITS_PER_BITMAPWORD
+
+/*
* ntile process information
*/
typedef struct
@@ -280,7 +288,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +299,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,12 +324,136 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls)
+ {
+ int64 bits_needed, scanning, words_needed, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ */
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+ words_needed = (bits_needed / BITS_PER_BITMAPWORD) + 1;
+
+ null_values = (Bitmapset *) WinGetPartitionLocalMemory(
+ winobj,
+ BITMAPSET_SIZE(words_needed));
+ null_values->nwords = (int) words_needed;
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /* update our bitmap with this result */
+ SET_WITHOUT_RESIZING(null_values, HAVESCANNED_INDEX(scanning));
+ if (isnull)
+ {
+ SET_WITHOUT_RESIZING(null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * ----------
+ * A | lag(A, 1)
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * ----------
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..710000f 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -34,7 +34,8 @@ typedef struct Bitmapset
int nwords; /* number of words in array */
bitmapword words[1]; /* really [nwords] */
} Bitmapset; /* VARIABLE LENGTH STRUCT */
-
+#define BITMAPSET_SIZE(nwords) \
+ (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
/* result of bms_subset_compare */
typedef enum
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..71b44d5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index b3d72a9..dd7396e 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ecc1c2c..473c7f3 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1022,135 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..92166d7 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,35 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+
+DROP TABLE test_table;
On Mon, 2013-06-24 at 18:01 +0100, Nicholas White wrote:
> Good catch - I've attached a patch to address your point 1. It now
> returns the below (i.e. correctly doesn't fill in the saved value if
> the index is out of the window). However, I'm not sure whether (e.g.)
> lead-2-ignore-nulls means count forwards two rows, and if that's null
> use the last one you've seen (the current implementation) or count
> forwards two non-null rows (as you suggest). The behaviour isn't
> specified in a (free) draft of the 2003 standard
> (http://www.wiscorp.com/sql_2003_standard.zip), and I don't have
> access to the (non-free) final version. Could someone who does have
> access to it clarify this? I've also added your example to the
> regression test cases.
Reading a later version of the draft, it is specified, but is still
slightly unclear.
As I see it, the standard describes the behavior in terms of eliminating
the NULL rows entirely before applying the offset. This matches Troels's
interpretation. Are you aware of any implementations that do something
different?
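A minimal sketch of that reading, assuming the partition is held in plain arrays and the offset is positive. The helper name lead_ignore_nulls and its signature are hypothetical and are not part of the patch; the point is only that null rows are skipped before the offset is counted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: lead(val, offset) IGNORE NULLS
 * over a partition stored in arrays.  Null rows are treated as if they did
 * not exist, so the result is the offset-th non-null row after position cur.
 *
 * Returns true and stores the value in *out if such a row exists; returns
 * false (meaning SQL NULL or the default) if the scan leaves the partition.
 */
bool
lead_ignore_nulls(const int *vals, const bool *isnull, size_t nrows,
                  size_t cur, int offset, int *out)
{
    size_t i;

    assert(offset >= 1);    /* this sketch only handles positive offsets */
    assert(cur < nrows);

    for (i = cur + 1; i < nrows; i++)
    {
        if (isnull[i])
            continue;       /* pretend the row does not exist */
        if (--offset == 0)
        {
            *out = vals[i];
            return true;
        }
    }
    return false;           /* ran off the end of the partition */
}
```

Run against the array 1,2,3,4,NULL,NULL,NULL,5,6,7 with an offset of 2, this gives 3,4,5,6,6,6,6,7 followed by two nulls, which agrees with the lead(val, 2) IGNORE NULLS regression output in the patch.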
> I didn't include this functionality for the first / last value window
> functions as their implementation is currently a bit different; they
> just call WinGetFuncArgInFrame to pick out a single value. Making
> these functions respect nulls would involve changing the single lookup
> to a walk through the tuples to find the first non-null version, and
> keeping track of this index in a struct in the context. As this change
> is reasonably orthogonal I was going to submit it as a separate patch.
Sounds good.
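The walk-and-cache idea in the quoted paragraph could look roughly like the sketch below. It is an illustration only: the struct first_nonnull_context and the function first_value_ignore_nulls are hypothetical names, the rows are plain arrays rather than tuples, the context is assumed to start zero-initialized, and the frame is assumed to be the whole partition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-partition state; the names are illustrative only. */
typedef struct first_nonnull_context
{
    bool    scanned;    /* has the partition been walked yet? */
    bool    found;      /* did the walk find a non-null row? */
    size_t  index;      /* position of that row, valid only if found */
} first_nonnull_context;

/*
 * Walk the rows once to find the first non-null value, then answer later
 * calls from the cached index instead of rescanning.  Returns false if
 * every row is null.
 */
bool
first_value_ignore_nulls(first_nonnull_context *ctx,
                         const int *vals, const bool *isnull, size_t nrows,
                         int *out)
{
    assert(ctx != NULL && out != NULL);

    if (!ctx->scanned)
    {
        size_t i;

        ctx->scanned = true;
        ctx->found = false;
        for (i = 0; i < nrows; i++)
        {
            if (!isnull[i])
            {
                ctx->found = true;
                ctx->index = i;
                break;
            }
        }
    }

    if (!ctx->found)
        return false;
    *out = vals[ctx->index];
    return true;
}
```

A frame that moves with the current row would need the cached index to be revalidated, which this sketch does not attempt.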
Regards,
Jeff Davis
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 29 June 2013 17:30, Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2013-06-24 at 18:01 +0100, Nicholas White wrote:
>> Good catch - I've attached a patch to address your point 1. It now
>> returns the below (i.e. correctly doesn't fill in the saved value if
>> the index is out of the window). However, I'm not sure whether (e.g.)
>> lead-2-ignore-nulls means count forwards two rows, and if that's null
>> use the last one you've seen (the current implementation) or count
>> forwards two non-null rows (as you suggest). The behaviour isn't
>> specified in a (free) draft of the 2003 standard
>> (http://www.wiscorp.com/sql_2003_standard.zip), and I don't have
>> access to the (non-free) final version. Could someone who does have
>> access to it clarify this? I've also added your example to the
>> regression test cases.
>
> Reading a later version of the draft, it is specified, but is still
> slightly unclear.
>
> As I see it, the standard describes the behavior in terms of eliminating
> the NULL rows entirely before applying the offset. This matches Troels's
> interpretation. Are you aware of any implementations that do something
> different?
>
>> I didn't include this functionality for the first / last value window
>> functions as their implementation is currently a bit different; they
>> just call WinGetFuncArgInFrame to pick out a single value. Making
>> these functions respect nulls would involve changing the single lookup
>> to a walk through the tuples to find the first non-null version, and
>> keeping track of this index in a struct in the context. As this change
>> is reasonably orthogonal I was going to submit it as a separate patch.
>
> Sounds good.
I took a quick look at this and I think there are still a few problems:
1). The ignore/respect nulls flag needs to be per-window-function
data, not a window frame option, because the same window may be shared
by multiple window function calls. For example, the following test
causes a crash:
SELECT val,
lead(val, 2) IGNORE NULLS OVER w,
lead(val, 2) RESPECT NULLS OVER w
FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
WINDOW w as ();
The connection to the server was lost. Attempting reset: Failed.
2). As Troels Nielsen said up-thread, I think this should throw a
FEATURE_NOT_SUPPORTED error if it is used for window functions that
don't support it, rather than silently ignoring the flag.
3). Similarly, the parser accepts ignore/respect nulls for arbitrary
aggregate functions over a window, so maybe this should also throw a
FEATURE_NOT_SUPPORTED error. Alternatively, it might be trivial to
make all aggregate functions work with ignore nulls in a window
context, simply by using the existing code for strict aggregate
transition functions. That might be quite handy to support things like
array_agg(val) IGNORE NULLS OVER(...).
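Point 1 can be pictured with the sketch below, which uses simplified stand-ins rather than the real parse nodes. SketchWindow, SketchWindowCall and make_window_call are hypothetical names invented for this illustration. The flag is stored on each call, so two calls can share one window definition and still disagree about null treatment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a window definition. */
typedef struct SketchWindow
{
    int     frameOptions;   /* shared by every call that uses this window */
} SketchWindow;

/* Hypothetical, simplified stand-in for one window function call. */
typedef struct SketchWindowCall
{
    const char         *funcname;
    const SketchWindow *window;         /* may be shared with other calls */
    bool                ignore_nulls;   /* belongs to this call alone */
} SketchWindowCall;

/* Build a call record; the shared window is never modified. */
SketchWindowCall
make_window_call(const char *funcname, const SketchWindow *window,
                 bool ignore_nulls)
{
    SketchWindowCall call;

    assert(funcname != NULL && window != NULL);
    call.funcname = funcname;
    call.window = window;
    call.ignore_nulls = ignore_nulls;
    return call;
}
```

With this layout the failing query above would produce two call records pointing at the same window w, one with ignore_nulls set and one without, and no conflicting frame options.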
Regards,
Dean
> this should throw a FEATURE_NOT_SUPPORTED error if it is used for window
> functions that don't support it
> arbitrary aggregate functions over a window ... should also throw a
> FEATURE_NOT_SUPPORTED error.
Fixed (with test cases) in the attached patch.
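As a rough stand-alone picture of that check, assuming only the function's name and whether it resolved to a window function are available. The helper ignore_nulls_supported is a hypothetical name for this sketch; in the patch the test sits inline in ParseFuncOrColumn and reports ERRCODE_FEATURE_NOT_SUPPORTED.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-alone version of the check: IGNORE NULLS is accepted
 * only for true window functions named lead or lag.  Everything else,
 * including plain aggregates used OVER a window, is rejected.
 */
bool
ignore_nulls_supported(const char *funcname, bool is_window_func)
{
    assert(funcname != NULL);

    if (!is_window_func)
        return false;       /* e.g. max(x) IGNORE NULLS OVER (...) */
    return strcmp(funcname, "lead") == 0 || strcmp(funcname, "lag") == 0;
}
```

Matching on the function name is what the patch does; a catalog flag on the function would be an alternative, but nothing in the thread so far proposes one.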
> because the same window may be shared by multiple window function calls.
Ah, your example gives the stack trace below. As the respect / ignore nulls
frame option is part of the window definition, your example should cause two
windows to be created (both based on w, but one with the ignore-nulls flag
set). Instead it fails an assert, as one window definition can't have
two sets of frame options. It might take me a day or two to solve this -
let me know if this approach (making the parser create two window objects)
seems wrong.
#2 0x0000000100cdb68b in ExceptionalCondition (conditionName=Could not
find the frame base for "ExceptionalCondition".
) at /Users/xxx/postgresql/src/backend/utils/error/assert.c:54
#3 0x00000001009a3c03 in transformWindowFuncCall (pstate=0x7f88228362c8,
wfunc=0x7f8822948ec0, windef=0x7f88228353a8) at
/Users/xxx/postgresql/src/backend/parser/parse_agg.c:573
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c009d8..740e713 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12275,6 +12275,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12289,7 +12290,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12302,6 +12305,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12316,7 +12320,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12410,11 +12416,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..70e84d1 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -26,9 +26,6 @@
#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)
#define BITNUM(x) ((x) % BITS_PER_BITMAPWORD)
-#define BITMAPSET_SIZE(nwords) \
- (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
-
/*----------
* This is a well-known cute trick for isolating the rightmost one-bit
* in a word. It assumes two's complement arithmetic. Consider any
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c41f1b5..917e233 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -545,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -575,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11782,16 +11783,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12770,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12858,6 +12869,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index ae7d195..1eeeb97 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -482,6 +482,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
NameListToString(funcname)),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("RESPECT NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..12cab3c 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -25,6 +26,13 @@ typedef struct rank_context
} rank_context;
/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+ #define SET_WITHOUT_RESIZING(b, i) b->words[(i) / BITS_PER_BITMAPWORD] |= (bitmapword) 1 << (i) % BITS_PER_BITMAPWORD
+
+/*
* ntile process information
*/
typedef struct
@@ -280,7 +288,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +299,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,12 +324,136 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls)
+ {
+ int64 bits_needed, scanning, words_needed, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ */
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+ words_needed = (bits_needed / BITS_PER_BITMAPWORD) + 1;
+
+ null_values = (Bitmapset *) WinGetPartitionLocalMemory(
+ winobj,
+ BITMAPSET_SIZE(words_needed));
+ null_values->nwords = (int) words_needed;
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /* update our bitmap with this result */
+ SET_WITHOUT_RESIZING(null_values, HAVESCANNED_INDEX(scanning));
+ if (isnull)
+ {
+ SET_WITHOUT_RESIZING(null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * ----------
+ * A | lag(A, 1)
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * ----------
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..710000f 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -34,7 +34,8 @@ typedef struct Bitmapset
int nwords; /* number of words in array */
bitmapword words[1]; /* really [nwords] */
} Bitmapset; /* VARIABLE LENGTH STRUCT */
-
+#define BITMAPSET_SIZE(nwords) \
+ (offsetof(Bitmapset, words) + (nwords) * sizeof(bitmapword))
/* result of bms_subset_compare */
typedef enum
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..71b44d5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index b3d72a9..dd7396e 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ecc1c2c..3e67cc0 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1022,151 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w,
+ lead(val, 2) RESPECT NULLS OVER w
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..6018502 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,47 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w,
+ lead(val, 2) RESPECT NULLS OVER w
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+
I've attached another iteration of the patch that fixes the multiple-window
bug and adds (& uses) a function to create a Bitmapset using a custom
allocator. I don't think there are any outstanding problems with it now.
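To make the idea concrete, here is a minimal standalone sketch of a per-row cache built through a caller-supplied allocator, keeping two bits per row (one for "is null", one for "already scanned") as the patch's ISNULL_INDEX and HAVESCANNED_INDEX macros do. It is not the patch's bms_initialize: RowCache, row_cache_create, row_cache_record, get_bit, set_bit and calloc_allocator are hypothetical names invented for this illustration, and the allocator is assumed to return zeroed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Two bits per row, interleaved, as in the patch's helper macros. */
#define ISNULL_BIT(i)      (2 * (i))
#define HAVESCANNED_BIT(i) (2 * (i) + 1)

/* Hypothetical allocator signature: must return zeroed memory. */
typedef void *(*zero_allocator) (void *arg, size_t size);

/* Hypothetical cache type; bits[] holds 2 bits per row. */
typedef struct RowCache
{
    int64_t nrows;
    uint8_t bits[];             /* flexible array member */
} RowCache;

/* Allocate the whole cache in one chunk via the caller's allocator. */
static RowCache *
row_cache_create(zero_allocator alloc, void *alloc_arg, int64_t nrows)
{
    size_t    nbytes = (size_t) ((2 * nrows + 7) / 8);
    RowCache *cache = alloc(alloc_arg, sizeof(RowCache) + nbytes);

    assert(cache != NULL);
    cache->nrows = nrows;
    return cache;
}

static void
set_bit(RowCache *c, int64_t bit)
{
    c->bits[bit / 8] |= (uint8_t) (1u << (bit % 8));
}

static bool
get_bit(const RowCache *c, int64_t bit)
{
    return ((c->bits[bit / 8] >> (bit % 8)) & 1u) != 0;
}

/* Remember that a row has been fetched, and whether it was null. */
static void
row_cache_record(RowCache *c, int64_t row, bool isnull)
{
    set_bit(c, HAVESCANNED_BIT(row));
    if (isnull)
        set_bit(c, ISNULL_BIT(row));
}

/* Example allocator for the sketch: ignores its argument. */
static void *
calloc_allocator(void *arg, size_t size)
{
    (void) arg;
    return calloc(1, size);
}
```

Because the cache is sized up front from the row count, recording a row never needs to reallocate, which is the same property the patch asserts after each bms_add_member call.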
Alternatively, it might be trivial to make all aggregate functions work
with ignore nulls in a window context
This is a good idea, but I'd like to keep the scope of this patch limited
for the time being - I'll look at doing this (along with the first / last /
nth value window functions) for a later release.
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c009d8..740e713 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12275,6 +12275,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12289,7 +12290,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12302,6 +12305,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12316,7 +12320,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12410,11 +12416,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..e1a1020 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2000,6 +2000,16 @@ WinGetCurrentPosition(WindowObject winobj)
Assert(WindowObjectIsValid(winobj));
return winobj->winstate->currentpos;
}
+/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
/*
* WinGetPartitionRowCount
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..1650382 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -866,3 +866,35 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+
+/*
+ * bms_initialize - initialize a Bitmapset using a custom memory allocator
+ *
+ * allocator
+ * A function pointer that will be called once to initialize the
+ * required amount of (zeroed-out) memory
+ * allocator_arg
+ * An argument that will be passed unmodified to the allocator
+ * function. Use this to pass any state the allocator requires.
+ * nbits
+ * The maximum capacity of the Bitmapset. An int64 as a Bitmapset with
+ * INT_MAX words can store more than INT_MAX bits.
+ */
+Bitmapset *
+bms_initialize(
+ void *(*allocator) (void *arg, Size sz),
+ void *allocator_arg,
+ int64 nbits)
+{
+ int nwords;
+ Bitmapset * b;
+
+ nwords = (nbits / BITS_PER_BITMAPWORD) + 1;
+ b = (Bitmapset *) allocator(allocator_arg, BITMAPSET_SIZE(nwords));
+
+ /* set up the Bitmapset's state */
+ b->nwords = nwords;
+
+ return b;
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c41f1b5..917e233 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -545,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -575,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11782,16 +11783,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12770,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12858,6 +12869,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7380618..7d8890a 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -572,8 +572,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
@@ -582,7 +581,38 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
winref++;
if (refwin->name && strcmp(refwin->name, windef->name) == 0)
{
- wfunc->winref = winref;
+ /*
+ * This is the window we want - but we may have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS
+ * frame option). If so, clone it (but reset the name so
+ * subsequent calls continue to use the original definition)
+ */
+ if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ wfunc->winref = winref;
+ else
+ {
+ WindowDef *clone;
+ char *name;
+
+ /*
+ * We want to avoid cloning the window name when we copy
+ * the object as we're not going to use it.
+ */
+ name = refwin->name;
+ refwin->name = NULL;
+ clone = (WindowDef *) copyObject(refwin);
+ refwin->name = name;
+
+ /* Override the clone's fields with the ones we want */
+ clone->frameOptions = windef->frameOptions;
+
+ /*
+ * Add this new definition to the list. Note that there's
+ * a chance a window with this definition already exists!
+ */
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
break;
}
}
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index ae7d195..1eeeb97 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -482,6 +482,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
NameListToString(funcname)),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("RESPECT NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..0699e18 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -25,6 +26,12 @@ typedef struct rank_context
} rank_context;
/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
+/*
* ntile process information
*/
typedef struct
@@ -280,7 +287,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +298,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,12 +323,144 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls)
+ {
+ int64 bits_needed, scanning, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ */
+
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ /*
+ * This code is a bit messy - we want to initialize the Bitmapset in the
+ * partition's local memory.
+ */
+ null_values = bms_initialize(WinGetPartitionLocalMemory, winobj, bits_needed);
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ Bitmapset *b;
+
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /*
+ * Update our bitmap with this result. Note the bitmap should have
+ * been sized correctly so bms_add_member should never need to
+ * re-allocate a larger chunk of memory.
+ */
+ b = bms_add_member(null_values, HAVESCANNED_INDEX(scanning));
+ Assert(b == null_values);
+ if (isnull)
+ {
+ b = bms_add_member(null_values, ISNULL_INDEX(scanning));
+ Assert(b == null_values);
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * ----------
+ * A | lag(A, 1)
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * ----------
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..4700c00 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -93,4 +93,10 @@ extern int bms_first_member(Bitmapset *a);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+/* initialize a Bitmapset using a custom memory allocator */
+extern Bitmapset *bms_initialize(
+ void *(*allocator) (void *arg, Size sz), /* function pointer to the allocator */
+ void *arg, /* passed through to the first argument to the allocator */
+ int64 nbits); /* the maximum capacity of the Bitmapset */
+
#endif /* BITMAPSET_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..71b44d5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -435,6 +435,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index b3d72a9..dd7396e 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ecc1c2c..3b4ce54 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -1020,5 +1022,165 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+ val | ignore | respect
+-----+--------+---------
+ 1 | 3 | 3
+ 2 | 4 | 4
+ 3 | 5 |
+ 4 | 6 |
+ | 6 |
+ | 6 | 5
+ | 6 | 6
+ 5 | 7 | 7
+ 6 | |
+ 7 | |
+(10 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..665abcb 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -264,5 +266,47 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+
On 1 July 2013 03:07, Nicholas White <n.j.white@gmail.com> wrote:
Alternatively, it might be trivial to make all aggregate functions work
with ignore nulls in a window context
This is a good idea, but I'd like to keep the scope of this patch limited
for the time being
Agreed.
- I'll look at doing this (along with the first / last /
nth value window functions) for a later release.
On the other hand, perhaps this is not worth doing for aggregates,
since in that case IGNORE NULLS is just a special case of FILTER
(WHERE ...). Making IGNORE NULLS work for the other window functions
is probably more useful, as you say, in a future patch.
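A rough standalone sketch of why lead/lag differ from the aggregate case: with IGNORE NULLS every input row still produces an output row, and only the rows counted towards the offset skip nulls, so a plain filter on the input does not express it. The code below is an illustration only, not the patch's leadlag_common: NullableInt and lead_ignore_nulls are hypothetical names, and only offsets of 1 or more are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nullable integer; is_null stands in for SQL NULL. */
typedef struct
{
    bool is_null;
    int  value;
} NullableInt;

/*
 * lead_ignore_nulls (hypothetical helper)
 *
 * Returns the offset-th non-null value strictly after position
 * "current", or a null result when fewer than "offset" non-null rows
 * follow. The row at "current" may itself be null; it still gets a
 * result. Offsets below 1 are not handled and yield null.
 */
static NullableInt
lead_ignore_nulls(const NullableInt *rows, size_t nrows,
                  size_t current, int offset)
{
    NullableInt result = {true, 0};
    size_t      i;

    assert(rows != NULL || nrows == 0);

    if (offset < 1)
        return result;

    for (i = current + 1; i < nrows; i++)
    {
        if (rows[i].is_null)
            continue;           /* null rows do not count towards offset */
        if (--offset == 0)
            return rows[i];
    }
    return result;              /* ran off the end of the partition */
}
```

Run over the values 1,2,3,4,NULL,NULL,NULL,5,6,7 with an offset of 2, this gives 3,4,5,6,6,6,6,7,NULL,NULL, which agrees with the lead(val, 2) IGNORE NULLS output in the patch's regression tests.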
Regards,
Dean
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Jun 30, 2013 at 10:07 PM, Nicholas White <n.j.white@gmail.com> wrote:
I've attached another iteration of the patch that fixes the multiple-window
bug and adds (& uses) a function to create a Bitmapset using a custom
allocator. I don't think there are any outstanding problems with it now.
I think the right way to do this is to temporarily set the current
memory context to winobj->winstate->partcontext while creating or
manipulating the Bitmapset and restore it afterwards. Maybe someone
will say that's a modularity violation, but surely this is worse...
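The save/switch/restore pattern being suggested can be sketched in standalone C as below. This is a mock, not PostgreSQL code: Context, context_switch_to, context_alloc0 and make_partition_bitmap are hypothetical stand-ins (the real backend would use its own memory-context API and the partition context mentioned above), and the "contexts" here only count allocations so the effect is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memory context: it only counts allocations. */
typedef struct Context
{
    const char *name;
    size_t      nallocs;
} Context;

static Context  per_tuple_context = {"per-tuple", 0};
static Context  partition_context = {"partition", 0};
static Context *current_context = &per_tuple_context;

/* Install a new current context and hand back the previous one. */
static Context *
context_switch_to(Context *ctx)
{
    Context *old = current_context;

    assert(ctx != NULL);
    current_context = ctx;
    return old;
}

/* Allocate zeroed memory, charged to whichever context is current. */
static void *
context_alloc0(size_t size)
{
    current_context->nallocs++;
    return calloc(1, size);
}

/*
 * Build a bitmap that should live as long as the partition: switch to
 * the partition context, allocate, then restore the caller's context.
 */
static unsigned char *
make_partition_bitmap(size_t nbits)
{
    Context       *old = context_switch_to(&partition_context);
    unsigned char *bits = context_alloc0((nbits + 7) / 8);

    context_switch_to(old);     /* always restore before returning */
    return bits;
}
```

The appeal of this shape is that the bitmap code needs no allocator parameter at all; the cost is that the caller has to reach the partition's context, which is the modularity concern raised above.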
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 1 July 2013 03:07, Nicholas White <n.j.white@gmail.com> wrote:
I've attached another iteration of the patch that fixes the multiple-window
bug and adds (& uses) a function to create a Bitmapset using a custom
allocator. I don't think there's any outstanding problems with it now.
I just realised there is another issue (sorry). pg_get_viewdef() needs
to be updated so that dumping and restoring a view that uses
ignore/respect nulls works properly.
Regards,
Dean
pg_get_viewdef() needs to be updated
Ah, good catch - I've fixed this in the attached. I also discovered that
there's a parent-child hierarchy of WindowDefs (using refname->name), so
instead of cloning the WindowDef (in parse_agg.c) if the frameOptions are
different (e.g. by adding the ignore-nulls flag) I create a child of the
WindowDef and override the frameOptions. This has the useful side-effect of
making pg_get_viewdef work as expected (the previous iteration of the patch
produced a copy of the window definition, not the window name, as it was
using a nameless clone), although the output has parentheses around the
window name:
lag(i.i, 2) IGNORE NULLS OVER (w) AS lagged_by_2
I've updated the test cases accordingly. Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c009d8..740e713 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12275,6 +12275,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12289,7 +12290,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12302,6 +12305,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12316,7 +12320,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12410,11 +12416,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..76244ac 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2002,6 +2002,17 @@ WinGetCurrentPosition(WindowObject winobj)
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
+
+/*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..4713574 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -866,3 +866,34 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_initialize - initialize a Bitmapset using a custom memory allocator
+ *
+ * allocator
+ * A function pointer that will be called once to initialize the
+ * required amount of (zeroed-out) memory
+ * allocator_arg
+ * An argument that will be passed unmodified to the allocator
+ * function. Use this to pass any state the allocator requires.
+ * nbits
+ * The maximum capacity of the Bitmapset. An int64 as a Bitmapset with
+ * INT_MAX words can store more than INT_MAX bits.
+ */
+Bitmapset *
+bms_initialize(
+ void *(*allocator) (void *arg, Size sz),
+ void *allocator_arg,
+ int64 nbits)
+{
+ int nwords;
+ Bitmapset * b;
+
+ nwords = (nbits / BITS_PER_BITMAPWORD) + 1;
+ b = (Bitmapset *) allocator(allocator_arg, BITMAPSET_SIZE(nwords));
+
+ /* set up the Bitmapset's state */
+ b->nwords = nwords;
+
+ return b;
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c41f1b5..917e233 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -545,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -575,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11782,16 +11783,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12770,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12858,6 +12869,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7380618..592899f 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -572,8 +572,7 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
@@ -582,7 +581,32 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
winref++;
if (refwin->name && strcmp(refwin->name, windef->name) == 0)
{
- wfunc->winref = winref;
+ /*
+ * This is the window we want - but we may have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS
+ * frame option). If so, create a 'child' (using refname
+ * to inherit everything from the parent) that just
+ * overrides the frame options.
+ */
+ if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ wfunc->winref = winref;
+ else
+ {
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ /*
+ * Add this new definition to the list. Note that there's
+ * a chance a window with this definition already exists!
+ */
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
break;
}
}
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index ae7d195..1eeeb97 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -482,6 +482,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
NameListToString(funcname)),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("RESPECT NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index a1ed781..5dcd4d6 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -7461,7 +7461,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
appendStringInfoChar(buf, '*');
else
get_rule_expr((Node *) wfunc->args, context, true);
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
@@ -7469,6 +7469,10 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..0699e18 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -25,6 +26,12 @@ typedef struct rank_context
} rank_context;
/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
+/*
* ntile process information
*/
typedef struct
@@ -280,7 +287,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +298,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,12 +323,144 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls)
+ {
+ int64 bits_needed, scanning, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ */
+
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ /*
+ * This code is a bit messy - we want to initialize the Bitmapset in the
+ * partition's local memory.
+ */
+ null_values = bms_initialize(WinGetPartitionLocalMemory, winobj, bits_needed);
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ Bitmapset *b;
+
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /*
+ * Update our bitmap with this result. Note the bitmap should have
+ * been sized correctly so bms_add_member should never need to
+ * re-allocate a larger chunk of memory.
+ */
+ b = bms_add_member(null_values, HAVESCANNED_INDEX(scanning));
+ Assert(b == null_values);
+ if (isnull)
+ {
+ b = bms_add_member(null_values, ISNULL_INDEX(scanning));
+ Assert(b == null_values);
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * ----------
+ * A | lag(A, 1)
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * ----------
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..4700c00 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -93,4 +93,10 @@ extern int bms_first_member(Bitmapset *a);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+/* initialize a Bitmapset using a custom memory allocator */
+extern Bitmapset *bms_initialize(
+ void *(*allocator) (void *arg, Size sz), /* function pointer to the allocator */
+ void *arg, /* passed through to the first argument to the allocator */
+ int64 nbits); /* the maximum capacity of the Bitmapset */
+
#endif /* BITMAPSET_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..b50701f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -398,19 +398,33 @@ typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
- * implies overriding the window frame clause.
+ * implies overriding the window frame clause. In this case, see the per-field
+ * comments to determine what the semantics are:
+ * VIRTUAL:
+ * If NULL, then the parent's (refname) value is used.
+ * MANDATORY:
+ * Never inherited from the parent, so must be specified -
+ * but can be NULL.
*/
typedef struct WindowDef
{
NodeTag type;
- char *name; /* window's own name */
- char *refname; /* referenced window name, if any */
- List *partitionClause; /* PARTITION BY expression list */
- List *orderClause; /* ORDER BY (list of SortBy) */
- int frameOptions; /* frame_clause options, see below */
- Node *startOffset; /* expression for starting bound, if any */
- Node *endOffset; /* expression for ending bound, if any */
- int location; /* parse location, or -1 if none/unknown */
+ /* window's own name [MANDATORY value of NULL] */
+ char *name;
+ /* referenced window name, if any [MANDATORY] */
+ char *refname;
+ /* PARTITION BY expression list [VIRTUAL] */
+ List *partitionClause;
+ /* ORDER BY (list of SortBy) [VIRTUAL] */
+ List *orderClause;
+ /* frame_clause options, see below [MANDATORY] */
+ int frameOptions;
+ /* expression for starting bound, if any [MANDATORY] */
+ Node *startOffset;
+ /* expression for ending bound, if any [MANDATORY] */
+ Node *endOffset;
+ /* parse location, or -1 if none/unknown [MANDATORY] */
+ int location;
} WindowDef;
/*
@@ -435,6 +449,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index b3d72a9..dd7396e 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ecc1c2c..05d36c5 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -931,30 +933,39 @@ FROM tenk1 WHERE unique1 < 10;
17 | 9
(10 rows)
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
SELECT * FROM v_window;
- i | sum_rows
-----+----------
- 1 | 3
- 2 | 6
- 3 | 9
- 4 | 12
- 5 | 15
- 6 | 18
- 7 | 21
- 8 | 24
- 9 | 27
- 10 | 19
+ i | sum_rows | lagged_by_1 | lagged_by_2
+----+----------+-------------+-------------
+ 10 | 19 | | 8
+ 9 | 27 | 10 | 7
+ 8 | 24 | 9 | 6
+ 7 | 21 | 8 | 5
+ 6 | 18 | 7 | 4
+ 5 | 15 | 6 | 3
+ 4 | 12 | 5 | 2
+ 3 | 9 | 4 | 1
+ 2 | 6 | 3 |
+ 1 | 3 | 2 |
(10 rows)
SELECT pg_get_viewdef('v_window');
- pg_get_viewdef
----------------------------------------------------------------------------------------
- SELECT i.i, +
- sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
- FROM generate_series(1, 10) i(i);
+ pg_get_viewdef
+-----------------------------------------------------------------------------------------
+ SELECT i.i, +
+ sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows, +
+ lag(i.i, 1) IGNORE NULLS OVER (ORDER BY i.i DESC) AS lagged_by_1, +
+ lag(i.i, 2) IGNORE NULLS OVER (w) AS lagged_by_2 +
+ FROM generate_series(1, 10) i(i) +
+ WINDOW w AS (ORDER BY i.i);
(1 row)
-- with UNION
@@ -1020,5 +1031,165 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+ val | ignore | respect
+-----+--------+---------
+ 1 | 3 | 3
+ 2 | 4 | 4
+ 3 | 5 |
+ 4 | 6 |
+ | 6 |
+ | 6 | 5
+ | 6 | 6
+ 5 | 7 | 7
+ 6 | |
+ 7 | |
+(10 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..ef2842d 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -222,9 +224,16 @@ SELECT sum(unique1) over
unique1
FROM tenk1 WHERE unique1 < 10;
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
+
SELECT * FROM v_window;
@@ -264,5 +273,47 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+
On Mon, 2013-07-01 at 07:40 -0400, Robert Haas wrote:
On Sun, Jun 30, 2013 at 10:07 PM, Nicholas White <n.j.white@gmail.com> wrote:
I've attached another iteration of the patch that fixes the multiple-window
bug and adds (& uses) a function to create a Bitmapset using a custom
allocator. I don't think there's any outstanding problems with it now.
I think the right way to do this is to temporarily set the current
memory context to winobj->winstate->partcontext while creating or
manipulating the Bitmapset and restore it afterwards. Maybe someone
will say that's a modularity violation, but surely this is worse...
I think we should get rid of the bitmapset entirely. For one thing, we
want to be able to support large frames, and the size of the allocation
for the bitmap is dependent on the size of the frame. It would take a
very large frame for that to matter, but conceptually, it doesn't seem
right to me.
Instead of the bitmapset, we can keep track of two offsets, and the
number of rows in between which are non-NULL. That only works with a
constant offset; but I'm not inclined to optimize for the special case
involving large frames, variable offset which always happens to be
large, and IGNORE NULLS.
Regards,
Jeff Davis
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
I haven't really reviewed the windowing-related code in depth; I
thought Jeff might jump back in for that part of it. Jeff, is that
something you're planning to do?
Yes, getting back into this patch now after a bit of delay.
Regards,
Jeff Davis
On Mon, 2013-07-01 at 18:20 -0400, Nicholas White wrote:
pg_get_viewdef() needs to be updated
Ah, good catch - I've fixed this in the attached. I also discovered
that there's a parent-child hierarchy of WindowDefs (using
relname->name), so instead of cloning the WindowDef (in parse_agg.c)
if the frameOptions are different (e.g. by adding the ignore-nulls
flag) I create a child of the WindowDef and override the frameOptions.
This has the useful side-effect of making pg_get_viewdef work as
expected (the previous iteration of the patch produced a copy of the
window definition, not the window name, as it was using a nameless
clone), although the output has parentheses around the window name:
A couple comments:
* We shouldn't create an arbitrary number of duplicate windows when
many aggregates are specified with IGNORE NULLS.
* It's bad form to modify a list while iterating through it. This is
just a style issue because there's a break afterward, anyway.
Also, I'm concerned that we're changing a reference of the form:
OVER w
into:
OVER (w)
in a user-visible way. Is there a problem with having two windowdefs in
the p_windowdefs list with the same name and different frameOptions?
I think you could just change the matching criteria to be a matching
name and matching frameOptions. In the loop, if you find a matching name
but frameOptions doesn't match, keep a pointer to the windowdef and
create a new one at the end of the loop with the same name.
You'll have to be a little careful that any other code knows that names
can be duplicated in the list though.
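The matching rule suggested above could be sketched roughly as follows. This is only an illustration on a plain array: `MiniWindowDef`, `MiniWindowList` and `find_or_add_window` are hypothetical stand-ins, not the real `WindowDef` / `p_windowdefs` structures or any existing parser function.

```c
#include <assert.h>
#include <string.h>

#define MAX_DEFS 16

/* Hypothetical, simplified stand-in for WindowDef: only the fields used here. */
typedef struct MiniWindowDef
{
    const char *name;
    int         frame_options;
} MiniWindowDef;

/* Hypothetical stand-in for the p_windowdefs list. */
typedef struct MiniWindowList
{
    MiniWindowDef defs[MAX_DEFS];
    int           ndefs;
} MiniWindowList;

/*
 * Return the 1-based index of the definition whose name AND frame options
 * both match. If only the name matches, remember that entry and append a
 * same-named copy with the requested options once the loop has finished
 * (so the list is never modified while it is being scanned).
 * Returns 0 if no window has that name, -1 if the array is full.
 */
int
find_or_add_window(MiniWindowList *list, const char *name, int frame_options)
{
    const MiniWindowDef *same_name = NULL;
    int i;

    for (i = 0; i < list->ndefs; i++)
    {
        const MiniWindowDef *def = &list->defs[i];

        if (strcmp(def->name, name) != 0)
            continue;
        if (def->frame_options == frame_options)
            return i + 1;           /* exact match: reuse it */
        if (same_name == NULL)
            same_name = def;        /* name matches, options differ */
    }

    if (same_name == NULL)
        return 0;                   /* window name does not exist */
    if (list->ndefs >= MAX_DEFS)
        return -1;

    list->defs[list->ndefs].name = same_name->name;
    list->defs[list->ndefs].frame_options = frame_options;
    list->ndefs++;
    return list->ndefs;
}
```

With this shape a second function using the same name and the same non-default options reuses the appended entry instead of creating another duplicate.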
Regards,
Jeff Davis
I've attached a revised version that fixes the issues above:
changing a reference of the form:
OVER w
into:
OVER (w)
Fixed (and I've updated the tests).
It's bad form to modify a list while iterating through it.
Fixed
We shouldn't create an arbitrary number of duplicate windows
Fixed
Is there a problem with having two windowdefs in
the p_windowdefs list with the same name
...
You'll have to be a little careful that any other code knows that names
can be duplicated in the list though.
I'm not sure I really can verify this - as I'm not sure how much
contrib / other third-party code has access to this data structure.
I'd prefer to be cautious and just create a child window if needed.
I think we should get rid of the bitmapset entirely
...
Instead of the bitmapset, we can keep track of two offsets
I've modified leadlag_common so it uses your suggested algorithm for
constant offsets (although it turns out you only need to keep a single
int64 index in the context). This algorithm calls
WinGetFuncArgInPartition at least twice per row: once to check whether
the current row is null (and so whether we have to move the leading /
lagged index forward), and then either once to fetch the leading /
lagged value or several times to push the index forward to the
next non-null value.
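To make the single-index idea concrete, here is a minimal standalone sketch for the LAG case with a constant offset k >= 1. It works on plain arrays rather than through the window API; `LagState`, `lag_ignore_nulls_row` and `lag_ignore_nulls_column` are hypothetical names, and array lookups stand in for the WinGetFuncArgInPartition calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-partition state: the only thing remembered between rows. */
typedef struct LagState
{
    long next;  /* index of the row to return; negative means "none yet" */
} LagState;

/*
 * Compute lag(val, k) IGNORE NULLS for one row, for a constant k >= 1.
 * Rows must be visited in order 0, 1, 2, ...  Returns true and sets *out
 * when a k-th previous non-null value exists, false otherwise.
 */
bool
lag_ignore_nulls_row(const int *vals, const bool *is_null, long nrows,
                     long row, long k, LagState *st, int *out)
{
    if (row == 0)
        st->next = -k;              /* nothing precedes the first row */
    else if (!is_null[row - 1])
    {
        /*
         * The previous row was non-null, so the lagged index moves on by
         * one non-null row.  While the index is still negative a single
         * step is enough; otherwise skip over any nulls.
         */
        do
            st->next++;
        while (st->next >= 0 && st->next < nrows && is_null[st->next]);
    }

    if (st->next < 0 || st->next >= nrows)
        return false;               /* out of the partition */
    *out = vals[st->next];
    return true;
}

/* Convenience wrapper: fill out[] / found[] for every row of a partition. */
void
lag_ignore_nulls_column(const int *vals, const bool *is_null, long nrows,
                        long k, int *out, bool *found)
{
    LagState st = { 0 };
    long row;

    for (row = 0; row < nrows; row++)
        found[row] = lag_ignore_nulls_row(vals, is_null, nrows, row, k,
                                          &st, &out[row]);
}
```

The index only ever moves forward, so the total work over a partition is linear in its size regardless of how many nulls are skipped.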
I've kept the bitmap solution for the non-constant offset case (i.e.
the random partition access case) as I believe it changes the cost of
calculating the lead / lagged values for every row in the partition to
O(partition size) - whereas a non-caching scan-the-partition solution
would be O(partition size * partition size). Is that OK?
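As a rough illustration of the caching idea, the sketch below keeps two bits per row ("is null" and "have scanned"), laid out the same way as the patch's ISNULL_INDEX / HAVESCANNED_INDEX macros. It uses a fixed-size byte array instead of a Bitmapset, and `NullCache` and `cached_is_null` are hypothetical names; the `is_null` array stands in for the expensive tuple fetch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ROWS 64

/* Same bit layout as the patch's macros: two adjacent bits per row. */
#define ISNULL_INDEX(i)      (2 * (i))
#define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)

/* Hypothetical cache; must be zeroed before first use. */
typedef struct NullCache
{
    uint8_t bits[(2 * MAX_ROWS + 7) / 8];
    long    fetches;    /* how often the "expensive" lookup was needed */
} NullCache;

static bool
bit_get(const uint8_t *bits, long i)
{
    return (bits[i / 8] >> (i % 8)) & 1;
}

static void
bit_set(uint8_t *bits, long i)
{
    bits[i / 8] |= (uint8_t) (1u << (i % 8));
}

/*
 * Report whether a row (0 <= row < MAX_ROWS) is null, consulting the
 * underlying data at most once per row.
 */
bool
cached_is_null(NullCache *cache, const bool *is_null, long row)
{
    if (!bit_get(cache->bits, HAVESCANNED_INDEX(row)))
    {
        cache->fetches++;                       /* first visit: fetch */
        bit_set(cache->bits, HAVESCANNED_INDEX(row));
        if (is_null[row])
            bit_set(cache->bits, ISNULL_INDEX(row));
    }
    return bit_get(cache->bits, ISNULL_INDEX(row));
}
```

Note that a cache like this bounds the number of fetches to one per row; a scan over a long run of nulls still tests one cached bit per row it passes.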
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7c009d8..740e713 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12275,6 +12275,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12289,7 +12290,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12302,6 +12305,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12316,7 +12320,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12410,11 +12416,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index d9f0e79..76244ac 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2002,6 +2002,17 @@ WinGetCurrentPosition(WindowObject winobj)
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
+
+/*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..4713574 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -866,3 +866,34 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_initialize - initialize a Bitmapset using a custom memory allocator
+ *
+ * allocator
+ * A function pointer that will be called once to initialize the
+ * required amount of (zeroed-out) memory
+ * allocator_arg
+ * An argument that will be passed unmodified to the allocator
+ * function. Use this to pass any state the allocator requires.
+ * nbits
+ * The maximum capacity of the Bitmapset. An int64 as a Bitmapset with
+ * INT_MAX words can store more than INT_MAX bits.
+ */
+Bitmapset *
+bms_initialize(
+ void *(*allocator) (void *arg, Size sz),
+ void *allocator_arg,
+ int64 nbits)
+{
+ int nwords;
+ Bitmapset * b;
+
+ nwords = (nbits / BITS_PER_BITMAPWORD) + 1;
+ b = (Bitmapset *) allocator(allocator_arg, BITMAPSET_SIZE(nwords));
+
+ /* set up the Bitmapset's state */
+ b->nwords = nwords;
+
+ return b;
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c41f1b5..917e233 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -545,7 +546,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -575,7 +576,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11782,16 +11783,25 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
n->location = @2;
@@ -12770,6 +12780,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12858,6 +12869,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7380618..4c6fb46 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -569,28 +569,76 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
- WindowDef *refwin = (WindowDef *) lfirst(lc);
-
+ refwin = (WindowDef *) lfirst(lc);
winref++;
- if (refwin->name && strcmp(refwin->name, windef->name) == 0)
- {
- wfunc->winref = winref;
+ if (refwin->name && strcmp(refwin->name, windef->name) == 0)
break;
- }
}
+
if (lc == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ wfunc->winref = winref;
+ else
+ {
+ /*
+ * This is the window we want - but we have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS
+ * frame option) as we're not using the default (i.e. parent)
+ * frame options.
+ *
+ * We'll create a 'child' (using refname to inherit everything
+ * from the parent) that just overrides the frame options
+ * (assuming it doesn't already exist):
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ /*
+ * Add this new definition to the list. Note that there's
+ * a chance a window with this definition already exists!
+ */
+ winref = 0;
+ foreach(lc, pstate->p_windowdefs)
+ {
+ refwin = (WindowDef *) lfirst(lc);
+
+ winref++;
+ if (refwin->refname &&
+ strcmp(refwin->refname, clone->refname) == 0 &&
+ equal(refwin->partitionClause, clone->partitionClause) &&
+ equal(refwin->orderClause, clone->orderClause) &&
+ refwin->frameOptions == clone->frameOptions &&
+ equal(refwin->startOffset, clone->startOffset) &&
+ equal(refwin->endOffset, clone->endOffset))
+ {
+ /* found a duplicate window specification */
+ wfunc->winref = winref;
+ break;
+ }
+ }
+ if (lc == NULL) /* didn't find it? */
+ {
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
+ }
}
else
{
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index ae7d195..1eeeb97 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -482,6 +482,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
NameListToString(funcname)),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("IGNORE NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index a1ed781..8d0509a 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4762,11 +4762,15 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
- appendStringInfoString(buf, quote_identifier(wc->refname));
+ const char *quoted_refname = quote_identifier(wc->refname);
+ refname_len = strlen(quoted_refname);
+ appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
@@ -4848,7 +4852,20 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
- appendStringInfoChar(buf, ')');
+
+ /*
+ * We'll tidy up the output slightly; if we've got a refname, but haven't
+ * overridden the partition-by, order-by or any of the frame flags relevant
+ * inside the window def's ()s, then we'll be left with "(<refname>)".
+ * We'll trim off the brackets in this case:
+ */
+ if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
+ {
+ memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
+ buf->len -= 1; /* the leading "(" */
+ }
+ else
+ appendStringInfoChar(buf, ')');
}
/* ----------
@@ -7461,7 +7478,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
appendStringInfoChar(buf, '*');
else
get_rule_expr((Node *) wfunc->args, context, true);
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
@@ -7469,6 +7486,10 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..c03b831 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -24,6 +25,18 @@ typedef struct rank_context
int64 rank; /* current rank */
} rank_context;
+
+typedef struct leadlag_const_context
+{
+ int64 next; /* the index of the lead / lagged value */
+} leadlag_const_context;
+
+/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
/*
* ntile process information
*/
@@ -280,7 +293,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +304,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,21 +329,235 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls && !const_offset)
+ {
+ int64 bits_needed, scanning, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ *
+ * As the offset isn't constant we need efficient random access to the
+ * partition, as we'll check up to O(partition size) tuples for each row
+ * we're calculating the window function value for.
+ */
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ /*
+ * This code is a bit messy - we want to initialize the Bitmapset in the
+ * partition's local memory.
+ */
+ null_values = bms_initialize(WinGetPartitionLocalMemory, winobj, bits_needed);
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ Bitmapset *b;
+
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /*
+ * Update our bitmap with this result. Note the bitmap should have
+ * been sized correctly so bms_add_member should never need to
+ * re-allocate a larger chunk of memory.
+ */
+ b = bms_add_member(null_values, HAVESCANNED_INDEX(scanning));
+ Assert(b == null_values);
+ if (isnull)
+ {
+ b = bms_add_member(null_values, ISNULL_INDEX(scanning));
+ Assert(b == null_values);
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * ----------
+ * A | lag(A, 1)
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * ----------
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else if (ignore_nulls /* && const_offset */)
+ {
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ leadlag_const_context *context = WinGetPartitionLocalMemory(
+ winobj,
+ sizeof(leadlag_const_context));
+ int count_forward = 0;
+
+ /*
+ * Set the forward flag based on the direction of traversal - remember
+ * we can have a LEAD or LAG of -1, and that should be equivalent to
+ * a LAG or LEAD of 1 respectively.
+ */
+ forward = offset == 0 ? forward : (offset > 0);
+
+ if (WinGetCurrentPosition(winobj) == 0)
+ if (forward)
+ count_forward = offset;
+ else
+ context->next = offset; /* LAG, so offset is negative */
+ else
+ {
+ /*
+ * LEADs and LAGs are actually pretty similar - the decision of
+ * whether or not to push our offset value forwards depends on
+ * the current row (for LEADs) or the previous row (for LAGs) is
+ * NULL - hence the (forward ? 0 : -1) below.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ forward ? 0 : -1,
+ WINDOW_SEEK_CURRENT,
+ forward,
+ &isnull, &isout);
+ if (!isnull)
+ count_forward = 1;
+ }
+
+ /*
+ * Count forward through the rows, skipping nulls and terminating if
+ * we run off the end of the window.
+ */
+ for (; count_forward > 0 && !isout; --count_forward)
+ {
+ do
+ {
+ /*
+ * Conveniently, calling WinGetFuncArgInPartition with an
+ * absolute index less than zero (correctly) sets isout
+ * and isnull to true
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ ++(context->next),
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ while (isnull && !isout);
+ }
+
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->next,
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
/*
- * target row is out of the partition; supply default value if
- * provided. otherwise it'll stay NULL
+ * Target row is out of the partition; supply default value if
+ * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * Don't return whatever's lying around in result, force the output
+ * to null if there's no default.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..4700c00 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -93,4 +93,10 @@ extern int bms_first_member(Bitmapset *a);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+/* initialize a Bitmapset using a custom memory allocator */
+extern Bitmapset *bms_initialize(
+ void *(*allocator) (void *arg, Size sz), /* function pointer to the allocator */
+ void *arg, /* passed through to the first argument to the allocator */
+ int64 nbits); /* the maximum capacity of the Bitmapset */
+
#endif /* BITMAPSET_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 6723647..2ada2f6 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -398,19 +398,35 @@ typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
- * implies overriding the window frame clause.
+ * implies overriding the window frame clause. In this case, see the per-field
+ * comments to determine what the semantics are:
+ * VIRTUAL:
+ * If NULL, then the parent's (refname) value is used.
+ * MANDATORY:
+ * Never inherited from the parent, so must be specified -
+ * but can be NULL.
+ * SUPER:
+ * Always inherited from parent, any local version ignored.
*/
typedef struct WindowDef
{
NodeTag type;
- char *name; /* window's own name */
- char *refname; /* referenced window name, if any */
- List *partitionClause; /* PARTITION BY expression list */
- List *orderClause; /* ORDER BY (list of SortBy) */
- int frameOptions; /* frame_clause options, see below */
- Node *startOffset; /* expression for starting bound, if any */
- Node *endOffset; /* expression for ending bound, if any */
- int location; /* parse location, or -1 if none/unknown */
+ /* window's own name [MANDATORY value of NULL] */
+ char *name;
+ /* referenced window name, if any [MANDATORY] */
+ char *refname;
+ /* PARTITION BY expression list [SUPER] */
+ List *partitionClause;
+ /* ORDER BY (list of SortBy) [VIRTUAL] */
+ List *orderClause;
+ /* frame_clause options, see below [MANDATORY] */
+ int frameOptions;
+ /* expression for starting bound, if any [MANDATORY] */
+ Node *startOffset;
+ /* expression for ending bound, if any [MANDATORY] */
+ Node *endOffset;
+ /* parse location, or -1 if none/unknown [MANDATORY] */
+ int location;
} WindowDef;
/*
@@ -435,6 +451,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index b3d72a9..dd7396e 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -179,6 +179,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -312,6 +313,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index ecc1c2c..38768e3 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -931,30 +933,39 @@ FROM tenk1 WHERE unique1 < 10;
17 | 9
(10 rows)
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
SELECT * FROM v_window;
- i | sum_rows
-----+----------
- 1 | 3
- 2 | 6
- 3 | 9
- 4 | 12
- 5 | 15
- 6 | 18
- 7 | 21
- 8 | 24
- 9 | 27
- 10 | 19
+ i | sum_rows | lagged_by_1 | lagged_by_2
+----+----------+-------------+-------------
+ 10 | 19 | | 8
+ 9 | 27 | 10 | 7
+ 8 | 24 | 9 | 6
+ 7 | 21 | 8 | 5
+ 6 | 18 | 7 | 4
+ 5 | 15 | 6 | 3
+ 4 | 12 | 5 | 2
+ 3 | 9 | 4 | 1
+ 2 | 6 | 3 |
+ 1 | 3 | 2 |
(10 rows)
SELECT pg_get_viewdef('v_window');
- pg_get_viewdef
----------------------------------------------------------------------------------------
- SELECT i.i, +
- sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
- FROM generate_series(1, 10) i(i);
+ pg_get_viewdef
+-----------------------------------------------------------------------------------------
+ SELECT i.i, +
+ sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows, +
+ lag(i.i, 1) IGNORE NULLS OVER (ORDER BY i.i DESC) AS lagged_by_1, +
+ lag(i.i, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM generate_series(1, 10) i(i) +
+ WINDOW w AS (ORDER BY i.i);
(1 row)
-- with UNION
@@ -1020,5 +1031,165 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of ntile must be greater than zero
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
ERROR: argument of nth_value must be greater than zero
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+ val | ignore | respect
+-----+--------+---------
+ 1 | 3 | 3
+ 2 | 4 | 4
+ 3 | 5 |
+ 4 | 6 |
+ | 6 |
+ | 6 | 5
+ | 6 | 6
+ 5 | 7 | 7
+ 6 | |
+ 7 | |
+(10 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 769be0f..ef2842d 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -222,9 +224,16 @@ SELECT sum(unique1) over
unique1
FROM tenk1 WHERE unique1 < 10;
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
+
SELECT * FROM v_window;
@@ -264,5 +273,47 @@ SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;
SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+
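
The IGNORE NULLS results expected by the regression tests above can be reproduced with a small standalone model. The sketch below is not code from the patch: the Row type and the leadlag_ignore_nulls function are hypothetical names, it handles only offsets of 1 or more, and it assumes the semantics the tests show (walk away from the current row, count only non-null rows, and give up at the partition edge).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One partition row: a value plus a null flag (hypothetical type). */
typedef struct
{
    int  value;
    bool isnull;
} Row;

/*
 * Model of lead()/lag() ... IGNORE NULLS for a positive offset.
 *
 * Walks away from row `cur` (forward for lead, backward for lag),
 * counts only non-null rows, and stores the offset-th one in *out.
 * Returns false when the walk leaves the partition, which is the case
 * where SQL would return NULL or the default argument.
 */
bool
leadlag_ignore_nulls(const Row *rows, size_t nrows, size_t cur,
                     int offset, bool forward, int *out)
{
    long step = forward ? 1 : -1;
    long pos;

    assert(rows != NULL && out != NULL);

    if (offset < 1)
        return false;           /* zero and negative offsets are not modelled */

    for (pos = (long) cur + step; pos >= 0 && pos < (long) nrows; pos += step)
    {
        if (rows[pos].isnull)
            continue;           /* pretend this row does not exist */
        if (--offset == 0)
        {
            *out = rows[pos].value;
            return true;
        }
    }
    return false;               /* ran off the end of the partition */
}
```

With the values 1,2,3,4,NULL,NULL,NULL,5,6,7 and a forward offset of 2, this model yields 3,4,5,6,6,6,6,7 and then no value for the last two rows, which matches the lead(val, 2) IGNORE NULLS expected output in the test above.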
On Thu, 2013-07-11 at 10:51 -0400, Nicholas White wrote:
> I've attached a revised version that fixes the issues above:
I'll get to this soon, sorry for the delay.
Regards,
Jeff Davis
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
np, optimising for quality not speed :)
On 07/15/2013 10:19 AM, Jeff Davis wrote:
> On Thu, 2013-07-11 at 10:51 -0400, Nicholas White wrote:
>> I've attached a revised version that fixes the issues above:
> I'll get to this soon, sorry for the delay.
> Regards,
> Jeff Davis
So ... are you doing a final review of this for the CF, Jeff? We need
to either commit it or bounce it to the next CF.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Fri, 2013-07-19 at 09:39 -0700, Josh Berkus wrote:
> So ... are you doing a final review of this for the CF, Jeff? We need
> to either commit it or bounce it to the next CF.
I am going on vacation tomorrow, and I just didn't quite find time to
take this to commit. Sorry about that, Nicholas. The patch improved a
lot this CF though, so we'll get it in quickly and I don't foresee any
problem with it making it in 9.4.
(For that matter, am I not supposed to commit between 'fests? Or is it
still an option for me to finish up with this after I get back even if
we close the CF?)
Regards,
Jeff Davis
On 21.07.2013 08:41, Jeff Davis wrote:
> (For that matter, am I not supposed to commit between 'fests? Or is it
> still an option for me to finish up with this after I get back even if
> we close the CF?)
It's totally OK to commit stuff between 'fests.
- Heikki
> (For that matter, am I not supposed to commit between 'fests? Or is it
> still an option for me to finish up with this after I get back even if
> we close the CF?)
The idea of the CommitFests is to give committers some *time off*
between them. If a committer wants to commit stuff when it's not a CF,
that's totally up to them.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Thu, 2013-07-11 at 10:51 -0400, Nicholas White wrote:
> I've attached a revised version that fixes the issues above:
This patch is in the 2013-09 commitfest but needs a rebase.
> but needs a rebase.
See attached - thanks!
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 425544a..a3babed 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12295,6 +12295,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12309,7 +12310,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12322,6 +12325,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12336,7 +12340,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12430,11 +12436,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index bbc5336..3b69ac1 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2016,6 +2016,17 @@ WinGetCurrentPosition(WindowObject winobj)
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
+
+/*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..4713574 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -866,3 +866,34 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_initialize - initialize a Bitmapset using a custom memory allocator
+ *
+ * allocator
+ * A function pointer that will be called once to initialize the
+ * required amount of (zeroed-out) memory
+ * allocator_arg
+ * An argument that will be passed unmodified to the allocator
+ * function. Use this to pass any state the allocator requires.
+ * nbits
+ * The maximum capacity of the Bitmapset. An int64 as a Bitmapset with
+ * INT_MAX words can store more than INT_MAX bits.
+ */
+Bitmapset *
+bms_initialize(
+ void *(*allocator) (void *arg, Size sz),
+ void *allocator_arg,
+ int64 nbits)
+{
+ int nwords;
+ Bitmapset * b;
+
+ nwords = (nbits / BITS_PER_BITMAPWORD) + 1;
+ b = (Bitmapset *) allocator(allocator_arg, BITMAPSET_SIZE(nwords));
+
+ /* set up the Bitmapset's state */
+ b->nwords = nwords;
+
+ return b;
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 22e82ba..9073f90 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -546,7 +547,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -576,7 +577,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11553,19 +11554,28 @@ filter_clause:
| /*EMPTY*/ { $$ = NULL; }
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
- n->location = @2;
+ n->location = @3;
$$ = n;
}
| /*EMPTY*/
@@ -12542,6 +12552,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12631,6 +12642,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4e4e1cd..51fc6ca 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -579,28 +579,76 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
- WindowDef *refwin = (WindowDef *) lfirst(lc);
-
+ refwin = (WindowDef *) lfirst(lc);
winref++;
- if (refwin->name && strcmp(refwin->name, windef->name) == 0)
- {
- wfunc->winref = winref;
+ if (refwin->name && strcmp(refwin->name, windef->name) == 0)
break;
- }
}
+
if (lc == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ wfunc->winref = winref;
+ else
+ {
+ /*
+ * This is the window we want - but we have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS
+ * frame option) as we're not using the default (i.e. parent)
+ * frame options.
+ *
+ * We'll create a 'child' (using refname to inherit everything
+ * from the parent) that just overrides the frame options
+ * (assuming it doesn't already exist):
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ /*
+ * Add this new definition to the list. Note that there's
+ * a chance a window with this definition already exists!
+ */
+ winref = 0;
+ foreach(lc, pstate->p_windowdefs)
+ {
+ refwin = (WindowDef *) lfirst(lc);
+
+ winref++;
+ if (refwin->refname &&
+ strcmp(refwin->refname, clone->refname) == 0 &&
+ equal(refwin->partitionClause, clone->partitionClause) &&
+ equal(refwin->orderClause, clone->orderClause) &&
+ refwin->frameOptions == clone->frameOptions &&
+ equal(refwin->startOffset, clone->startOffset) &&
+ equal(refwin->endOffset, clone->endOffset))
+ {
+ /* found a duplicate window specification */
+ wfunc->winref = winref;
+ break;
+ }
+ }
+ if (lc == NULL) /* didn't find it? */
+ {
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
+ }
}
else
{
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 1f02c9a..8a7f867 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -518,6 +518,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
errmsg("FILTER is not implemented in non-aggregate window functions"),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("RESPECT NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 2b005d6..6c222a3 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4778,11 +4778,15 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
- appendStringInfoString(buf, quote_identifier(wc->refname));
+ const char *quoted_refname = quote_identifier(wc->refname);
+ refname_len = strlen(quoted_refname);
+ appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
@@ -4864,7 +4868,20 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
- appendStringInfoChar(buf, ')');
+
+ /*
+ * We'll tidy up the output slightly; if we've got a refname, but haven't
+ * overridden the partition-by, order-by or any of the frame flags relevant
+ * inside the window def's ()s, then we'll be left with "(<refname>)".
+ * We'll trim off the brackets in this case:
+ */
+ if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
+ {
+ memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
+ buf->len -= 1; /* the trailing ")" */
+ }
+ else
+ appendStringInfoChar(buf, ')');
}
/* ----------
@@ -7493,7 +7510,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
@@ -7501,6 +7518,10 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..c03b831 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -24,6 +25,18 @@ typedef struct rank_context
int64 rank; /* current rank */
} rank_context;
+
+typedef struct leadlag_const_context
+{
+ int64 next; /* the index of the lead / lagged value */
+} leadlag_const_context;
+
+/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
/*
* ntile process information
*/
@@ -280,7 +293,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +304,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,21 +329,235 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls && !const_offset)
+ {
+ int64 bits_needed, scanning, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ *
+ * As the offset isn't constant we need efficient random access to the
+ * partition, as we'll check up to O(partition size) tuples for each row
+ * we're calculating the window function value for.
+ */
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ /*
+ * This code is a bit messy - we want to initialize the Bitmapset in the
+ * partition's local memory.
+ */
+ null_values = bms_initialize(WinGetPartitionLocalMemory, winobj, bits_needed);
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ Bitmapset *b;
+
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
&isnull, &isout);
+ if (isout)
+ break;
+
+ /*
+ * Update our bitmap with this result. Note the bitmap should have
+ * been sized correctly so bms_add_member should never need to
+ * re-allocate a larger chunk of memory.
+ */
+ b = bms_add_member(null_values, HAVESCANNED_INDEX(scanning));
+ Assert(b == null_values);
+ if (isnull)
+ {
+ b = bms_add_member(null_values, ISNULL_INDEX(scanning));
+ Assert(b == null_values);
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * ----------
+ * A | lag(A, 1)
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * ----------
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else if (ignore_nulls /* && const_offset */)
+ {
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ leadlag_const_context *context = WinGetPartitionLocalMemory(
+ winobj,
+ sizeof(leadlag_const_context));
+ int count_forward = 0;
+
+ /*
+ * Set the forward flag based on the direction of traversal - remember
+ * we can have a LEAD or LAG of -1, and that should be equivalent to
+ * a LAG or LEAD of 1 respectively.
+ */
+ forward = offset == 0 ? forward : (offset > 0);
+
+ if (WinGetCurrentPosition(winobj) == 0)
+ if (forward)
+ count_forward = offset;
+ else
+ context->next = offset; /* LAG, so offset is negative */
+ else
+ {
+ /*
+ * LEADs and LAGs are actually pretty similar - the decision of
+ * whether or not to push our offset value forwards depends on
+ * whether the current row (for LEADs) or the previous row (for
+ * LAGs) is NULL - hence the (forward ? 0 : -1) below.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ forward ? 0 : -1,
+ WINDOW_SEEK_CURRENT,
+ forward,
+ &isnull, &isout);
+ if (!isnull)
+ count_forward = 1;
+ }
+
+ /*
+ * Count forward through the rows, skipping nulls and terminating if
+ * we run off the end of the window.
+ */
+ for (; count_forward > 0 && !isout; --count_forward)
+ {
+ do
+ {
+ /*
+ * Conveniently, calling WinGetFuncArgInPartition with an
+ * absolute index less than zero (correctly) sets isout
+ * and isnull to true
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ ++(context->next),
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ while (isnull && !isout);
+ }
+
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->next,
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
/*
- * target row is out of the partition; supply default value if
- * provided. otherwise it'll stay NULL
+ * Target row is out of the partition; supply default value if
+ * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * Don't return whatever's lying around in result, force the output
+ * to null if there's no default.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..4700c00 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -93,4 +93,10 @@ extern int bms_first_member(Bitmapset *a);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+/* initialize a Bitmapset using a custom memory allocator */
+extern Bitmapset *bms_initialize(
+ void *(*allocator) (void *arg, Size sz), /* function pointer to the allocator */
+ void *arg, /* passed through to the first argument to the allocator */
+ int64 nbits); /* the maximum capacity of the Bitmapset */
+
#endif /* BITMAPSET_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 51fef68..d27ca5f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -406,19 +406,35 @@ typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
- * implies overriding the window frame clause.
+ * implies overriding the window frame clause. In this case, see the
+ * per-field comments to determine what the semantics are:
+ * VIRTUAL:
+ * If NULL, then the parent's (refname) value is used.
+ * MANDATORY:
+ * Never inherited from the parent, so must be specified -
+ * but can be NULL.
+ * SUPER:
+ * Always inherited from parent, any local version ignored.
*/
typedef struct WindowDef
{
NodeTag type;
- char *name; /* window's own name */
- char *refname; /* referenced window name, if any */
- List *partitionClause; /* PARTITION BY expression list */
- List *orderClause; /* ORDER BY (list of SortBy) */
- int frameOptions; /* frame_clause options, see below */
- Node *startOffset; /* expression for starting bound, if any */
- Node *endOffset; /* expression for ending bound, if any */
- int location; /* parse location, or -1 if none/unknown */
+ /* window's own name [MANDATORY value of NULL] */
+ char *name;
+ /* referenced window name, if any [MANDATORY] */
+ char *refname;
+ /* PARTITION BY expression list [VIRTUAL] */
+ List *partitionClause;
+ /* ORDER BY (list of SortBy) [SUPER] */
+ List *orderClause;
+ /* frame_clause options, see below [MANDATORY] */
+ int frameOptions;
+ /* expression for starting bound, if any [MANDATORY] */
+ Node *startOffset;
+ /* expression for ending bound, if any [MANDATORY] */
+ Node *endOffset;
+ /* parse location, or -1 if none/unknown [MANDATORY] */
+ int location;
} WindowDef;
/*
@@ -443,6 +459,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 8bd34d6..9196b41 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -180,6 +180,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -314,6 +315,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 7b31d13..5926a72 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -931,30 +933,39 @@ FROM tenk1 WHERE unique1 < 10;
17 | 9
(10 rows)
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
SELECT * FROM v_window;
- i | sum_rows
-----+----------
- 1 | 3
- 2 | 6
- 3 | 9
- 4 | 12
- 5 | 15
- 6 | 18
- 7 | 21
- 8 | 24
- 9 | 27
- 10 | 19
+ i | sum_rows | lagged_by_1 | lagged_by_2
+----+----------+-------------+-------------
+ 10 | 19 | | 8
+ 9 | 27 | 10 | 7
+ 8 | 24 | 9 | 6
+ 7 | 21 | 8 | 5
+ 6 | 18 | 7 | 4
+ 5 | 15 | 6 | 3
+ 4 | 12 | 5 | 2
+ 3 | 9 | 4 | 1
+ 2 | 6 | 3 |
+ 1 | 3 | 2 |
(10 rows)
SELECT pg_get_viewdef('v_window');
- pg_get_viewdef
----------------------------------------------------------------------------------------
- SELECT i.i, +
- sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
- FROM generate_series(1, 10) i(i);
+ pg_get_viewdef
+-----------------------------------------------------------------------------------------
+ SELECT i.i, +
+ sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows, +
+ lag(i.i, 1) IGNORE NULLS OVER (ORDER BY i.i DESC) AS lagged_by_1, +
+ lag(i.i, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM generate_series(1, 10) i(i) +
+ WINDOW w AS (ORDER BY i.i);
(1 row)
-- with UNION
@@ -1033,5 +1044,165 @@ FROM empsalary GROUP BY depname;
25100 | 1 | 22600 | develop
(3 rows)
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+ val | ignore | respect
+-----+--------+---------
+ 1 | 3 | 3
+ 2 | 4 | 4
+ 3 | 5 |
+ 4 | 6 |
+ | 6 |
+ | 6 | 5
+ | 6 | 6
+ 5 | 7 | 7
+ 6 | |
+ 7 | |
+(10 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6ee3696..cda112f 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -222,9 +224,16 @@ SELECT sum(unique1) over
unique1
FROM tenk1 WHERE unique1 < 10;
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
+
SELECT * FROM v_window;
@@ -272,5 +281,48 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
depname
FROM empsalary GROUP BY depname;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+
+
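For readers who want the intended IGNORE NULLS semantics without reading the executor code, the behaviour exercised by the regression tests above ("pretend the rows with nulls don't exist") can be modelled with a plain array scan. This is only a simplified sketch and not the patch's implementation: the function name `ignore_nulls_target` and its parameters are hypothetical, it assumes an offset of at least 1, and it ignores the bitmap cache and the constant-offset fast path.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of lead/lag ... IGNORE NULLS over one partition.
 *
 * isnull[i] says whether row i's argument is NULL. Starting from row
 * `current`, walk in direction `step` (+1 for lead, -1 for lag) and return
 * the index of the offset-th non-null row, or -1 if the scan runs off the
 * partition (the case where the default value, if any, would be used).
 */
int
ignore_nulls_target(const bool *isnull, int nrows,
                    int current, int offset, int step)
{
    int i;

    /* this sketch only covers strictly positive offsets */
    assert(offset >= 1 && (step == 1 || step == -1));

    for (i = current + step; i >= 0 && i < nrows; i += step)
    {
        /* null rows are skipped without consuming any of the offset */
        if (!isnull[i] && --offset == 0)
            return i;
    }
    return -1;                  /* out of partition */
}
```

With the test data 1,2,3,4,NULL,NULL,NULL,5,6,7 and lead(val, 2), the row holding the first NULL (index 4) resolves to index 8, i.e. the value 6, which agrees with the expected output in window.out.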
On Wed, 2013-08-21 at 22:34 -0400, Nicholas White wrote:
> > but needs a rebase.
> See attached - thanks!
Please fix these compiler warnings:
windowfuncs.c: In function ‘leadlag_common’:
windowfuncs.c:366:3: warning: passing argument 1 of ‘bms_initialize’ from incompatible pointer type [enabled by default]
In file included from windowfuncs.c:16:0:
../../../../src/include/nodes/bitmapset.h:97:19: note: expected ‘void * (*)(void *, Size)’ but argument is of type ‘void * (*)(struct WindowObjectData *, Size)’
windowfuncs.c:306:8: warning: ‘result’ may be used uninitialized in this function [-Wmaybe-uninitialized]
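The first warning comes from passing a function that takes a `struct WindowObjectData *` where a `void *(*)(void *, Size)` is expected. Casting the function pointer silences the compiler, but calling a function through a pointer of an incompatible type is undefined behaviour in ISO C, so a small adapter with exactly the expected signature is arguably the safer fix. A minimal sketch under stated assumptions: the types below are stand-ins, and `mock_WinGetPartitionLocalMemory`, `winobj_allocator` and `call_allocator` are hypothetical names that do not appear in the patch or in PostgreSQL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef size_t Size;            /* stand-in for PostgreSQL's Size */

/* mock of the opaque window object; only holds the local memory chunk */
typedef struct WindowObjectData
{
    void *localmem;
} WindowObjectData;
typedef WindowObjectData *WindowObject;

/* mock: hand out one zeroed chunk per object, then keep returning it */
void *
mock_WinGetPartitionLocalMemory(WindowObject winobj, Size sz)
{
    if (winobj->localmem == NULL)
        winobj->localmem = calloc(1, sz);
    return winobj->localmem;
}

/* adapter: matches void *(*)(void *, Size) exactly, so no cast is needed */
void *
winobj_allocator(void *arg, Size sz)
{
    assert(arg != NULL);
    return mock_WinGetPartitionLocalMemory((WindowObject) arg, sz);
}

/* generic consumer, shaped like the allocator parameter of bms_initialize */
void *
call_allocator(void *(*allocator) (void *arg, Size sz), void *arg, Size sz)
{
    return allocator(arg, sz);
}
```

The caller would then pass `winobj_allocator` and the window object itself, with the only conversion being the ordinary object-pointer-to-`void *` one.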
> Please fix these compiler warnings
Fixed - see attached. Thanks -
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 425544a..a3babed 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12295,6 +12295,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12309,7 +12310,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12322,6 +12325,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12336,7 +12340,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12430,11 +12436,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index bbc5336..3b69ac1 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2016,6 +2016,17 @@ WinGetCurrentPosition(WindowObject winobj)
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
+
+/*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..4713574 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -866,3 +866,34 @@ bms_hash_value(const Bitmapset *a)
return DatumGetUInt32(hash_any((const unsigned char *) a->words,
(lastword + 1) * sizeof(bitmapword)));
}
+
+/*
+ * bms_initialize - initialize a Bitmapset using a custom memory allocator
+ *
+ * allocator
+ * A function pointer that will be called once to initialize the
+ * required amount of (zeroed-out) memory
+ * allocator_arg
+ * An argument that will be passed unmodified to the allocator
+ * function. Use this to pass any state the allocator requires.
+ * nbits
+ * The maximum capacity of the Bitmapset. An int64 as a Bitmapset with
+ * INT_MAX words can store more than INT_MAX bits.
+ */
+Bitmapset *
+bms_initialize(
+ void *(*allocator) (void *arg, Size sz),
+ void *allocator_arg,
+ int64 nbits)
+{
+ int nwords;
+ Bitmapset * b;
+
+ nwords = (nbits / BITS_PER_BITMAPWORD) + 1;
+ b = (Bitmapset *) allocator(allocator_arg, BITMAPSET_SIZE(nwords));
+
+ /* set up the Bitmapset's state */
+ b->nwords = nwords;
+
+ return b;
+}
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 22e82ba..9073f90 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -546,7 +547,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -576,7 +577,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11553,19 +11554,28 @@ filter_clause:
| /*EMPTY*/ { $$ = NULL; }
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
- n->location = @2;
+ n->location = @3;
$$ = n;
}
| /*EMPTY*/
@@ -12542,6 +12552,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12631,6 +12642,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 4e4e1cd..51fc6ca 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -579,28 +579,76 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
- WindowDef *refwin = (WindowDef *) lfirst(lc);
-
+ refwin = (WindowDef *) lfirst(lc);
winref++;
if (refwin->name && strcmp(refwin->name, windef->name) == 0)
- {
- wfunc->winref = winref;
break;
}
- }
+
if (lc == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ wfunc->winref = winref;
+ else
+ {
+ /*
+ * This is the window we want - but we have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS
+ * frame option) as we're not using the default (i.e. parent)
+ * frame options.
+ *
+ * We'll create a 'child' (using refname to inherit everything
+ * from the parent) that just overrides the frame options
+ * (assuming it doesn't already exist):
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ /*
+ * Add this new definition to the list. Note that there's
+ * a chance a window with this definition already exists!
+ */
+ winref = 0;
+ foreach(lc, pstate->p_windowdefs)
+ {
+ refwin = (WindowDef *) lfirst(lc);
+
+ winref++;
+ if (refwin->refname &&
+ strcmp(refwin->refname, clone->refname) == 0 &&
+ equal(refwin->partitionClause, clone->partitionClause) &&
+ equal(refwin->orderClause, clone->orderClause) &&
+ refwin->frameOptions == clone->frameOptions &&
+ equal(refwin->startOffset, clone->startOffset) &&
+ equal(refwin->endOffset, clone->endOffset))
+ {
+ /* found a duplicate window specification */
+ wfunc->winref = winref;
+ break;
+ }
+ }
+ if (lc == NULL) /* didn't find it? */
+ {
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
+ }
}
else
{
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 1f02c9a..8a7f867 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -518,6 +518,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
errmsg("FILTER is not implemented in non-aggregate window functions"),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("RESPECT NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 2b005d6..6c222a3 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4778,11 +4778,15 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
- appendStringInfoString(buf, quote_identifier(wc->refname));
+ const char *quoted_refname = quote_identifier(wc->refname);
+ refname_len = strlen(quoted_refname);
+ appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
@@ -4864,6 +4868,19 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
+
+ /*
+ * We'll tidy up the output slightly; if we've got a refname, but haven't
+ * overridden the partition-by, order-by or any of the frame flags relevant
+ * inside the window def's ()s, then we'll be left with "(<refname>)".
+ * We'll trim off the brackets in this case:
+ */
+ if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
+ {
+ memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
+ buf->len -= 1; /* the trailing ")" */
+ }
+ else
appendStringInfoChar(buf, ')');
}
@@ -7493,7 +7510,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
@@ -7501,6 +7518,10 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..187ed94 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -24,6 +25,18 @@ typedef struct rank_context
int64 rank; /* current rank */
} rank_context;
+
+typedef struct leadlag_const_context
+{
+ int64 next; /* the index of the lead / lagged value */
+} leadlag_const_context;
+
+/*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
/*
* ntile process information
*/
@@ -280,7 +293,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +304,18 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+ Bitmapset* null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,21 +329,239 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if(!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls && !const_offset)
+ {
+ int64 bits_needed, scanning, current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't exist".
+ * This means that we'll scan from the current row an 'offset' number of
+ * non-null rows, and then return that one.
+ *
+ * As the offset isn't constant we need efficient random access to the
+ * partition, as we'll check up to O(partition size) tuples for each row
+ * we're calculating the window function value for.
+ */
+
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones we've
+ * accessed (more specifically, if they're null or not). We'll need one
+ * bit for whether the value is null and one bit for whether we've checked
+ * that tuple or not. We'll keep these two bits together (as opposed to
+ * having two separate bitmaps) to improve cache locality.
+ */
+ bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ /*
+ * This code is a bit messy - we want to initialize the Bitmapset in the
+ * partition's local memory.
+ */
+ null_values = bms_initialize(
+ (void *(*) (void *arg, Size sz)) WinGetPartitionLocalMemory,
+ winobj,
+ bits_needed);
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be in the
+ * opposite direction to the way we're scanning. We'll then force offset to
+ * be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the default
+ * value, but not whatever's left in result. We'll use the isnull
+ * flag to say "ignore it"!
+ */
+ isnull = true;
+ result = (Datum) 0;
+
+ break;
+ }
+
+ /* look in the bitmap cache - do we know if this index is null? */
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), null_values);
+ }
+ else
+ {
+ Bitmapset *b;
+
+ /* first time we've accessed this index; let's see if it's null: */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ if (isout)
+ break;
+
+ /*
+ * Update our bitmap with this result. Note the bitmap should have
+ * been sized correctly so bms_add_member should never need to
+ * re-allocate a larger chunk of memory.
+ */
+ b = bms_add_member(null_values, HAVESCANNED_INDEX(scanning));
+ Assert(b == null_values);
+ if (isnull)
+ {
+ b = bms_add_member(null_values, ISNULL_INDEX(scanning));
+ Assert(b == null_values);
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a chance
+ * that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*
+ * A slight edge case. Consider:
+ *
+ * ----------
+ * A | lag(A, 1)
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * ----------
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ */
+ --offset;
+ }
+ }
+ }
+ else if (ignore_nulls /* && const_offset */)
+ {
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ leadlag_const_context *context = WinGetPartitionLocalMemory(
+ winobj,
+ sizeof(leadlag_const_context));
+ int count_forward = 0;
+
+ /*
+ * Set the forward flag based on the direction of traversal - remember
+ * we can have a LEAD or LAG of -1, and that should be equivalent to
+ * a LAG or LEAD of 1 respectively.
+ */
+ forward = offset == 0 ? forward : (offset > 0);
+
+ if (WinGetCurrentPosition(winobj) == 0)
+ if (forward)
+ count_forward = offset;
+ else
+ context->next = offset; /* LAG, so offset is negative */
+ else
+ {
+ /*
+ * LEADs and LAGs are actually pretty similar - the decision of
+ * whether or not to push our offset value forwards depends on
+ * the current row (for LEADs) or the previous row (for LAGs) is
+ * NULL - hence the (forward ? 0 : -1) below.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ forward ? 0 : -1,
+ WINDOW_SEEK_CURRENT,
+ forward,
+ &isnull, &isout);
+ if (!isnull)
+ count_forward = 1;
+ }
+
+ /*
+ * Count forward through the rows, skipping nulls and terminating if
+ * we run off the end of the window.
+ */
+ for (; count_forward > 0 && !isout; --count_forward)
+ {
+ do
+ {
+ /*
+ * Conveniently, calling WinGetFuncArgInPartition with an
+ * absolute index less than zero (correctly) sets isout
+ * and isnull to true
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ ++(context->next),
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ while (isnull && !isout);
+ }
result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
+ context->next,
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
WINDOW_SEEK_CURRENT,
const_offset,
&isnull, &isout);
+ }
if (isout)
{
/*
- * target row is out of the partition; supply default value if
- * provided. otherwise it'll stay NULL
+ * Target row is out of the partition; supply default value if
+ * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * Don't return whatever's lying around in result, force the output
+ * to null if there's no default.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
diff --git a/src/include/nodes/bitmapset.h b/src/include/nodes/bitmapset.h
index 2a4b41d..4700c00 100644
--- a/src/include/nodes/bitmapset.h
+++ b/src/include/nodes/bitmapset.h
@@ -93,4 +93,10 @@ extern int bms_first_member(Bitmapset *a);
/* support for hashtables using Bitmapsets as keys: */
extern uint32 bms_hash_value(const Bitmapset *a);
+/* initialize a Bitmapset using a custom memory allocator */
+extern Bitmapset *bms_initialize(
+ void *(*allocator) (void *arg, Size sz), /* function pointer to the allocator */
+ void *arg, /* passed through to the first argument to the allocator */
+ int64 nbits); /* the maximum capacity of the Bitmapset */
+
#endif /* BITMAPSET_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 51fef68..d27ca5f 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -406,19 +406,35 @@ typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
- * implies overriding the window frame clause.
+ * implies overriding the window frame clause. In this case, see the per-field
+ * comments to determine what the semantics are:
+ * VIRTUAL:
+ * If NULL, then the parent's (refname) value is used.
+ * MANDATORY:
+ * Never inherited from the parent, so must be specified -
+ * but can be NULL.
+ * SUPER:
+ * Always inherited from parent, any local version ignored.
*/
typedef struct WindowDef
{
NodeTag type;
- char *name; /* window's own name */
- char *refname; /* referenced window name, if any */
- List *partitionClause; /* PARTITION BY expression list */
- List *orderClause; /* ORDER BY (list of SortBy) */
- int frameOptions; /* frame_clause options, see below */
- Node *startOffset; /* expression for starting bound, if any */
- Node *endOffset; /* expression for ending bound, if any */
- int location; /* parse location, or -1 if none/unknown */
+ /* window's own name [MANDATORY value of NULL] */
+ char *name;
+ /* referenced window name, if any [MANDATORY] */
+ char *refname;
+ /* PARTITION BY expression list [VIRTUAL] */
+ List *partitionClause;
+ /* ORDER BY (list of SortBy) [SUPER] */
+ List *orderClause;
+ /* frame_clause options, see below [MANDATORY] */
+ int frameOptions;
+ /* expression for starting bound, if any [MANDATORY] */
+ Node *startOffset;
+ /* expression for ending bound, if any [MANDATORY] */
+ Node *endOffset;
+ /* parse location, or -1 if none/unknown [MANDATORY] */
+ int location;
} WindowDef;
/*
@@ -443,6 +459,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 8bd34d6..9196b41 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -180,6 +180,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -314,6 +315,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..81f5ba0 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 7b31d13..5926a72 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -931,30 +933,39 @@ FROM tenk1 WHERE unique1 < 10;
17 | 9
(10 rows)
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
SELECT * FROM v_window;
- i | sum_rows
-----+----------
- 1 | 3
- 2 | 6
- 3 | 9
- 4 | 12
- 5 | 15
- 6 | 18
- 7 | 21
- 8 | 24
- 9 | 27
- 10 | 19
+ i | sum_rows | lagged_by_1 | lagged_by_2
+----+----------+-------------+-------------
+ 10 | 19 | | 8
+ 9 | 27 | 10 | 7
+ 8 | 24 | 9 | 6
+ 7 | 21 | 8 | 5
+ 6 | 18 | 7 | 4
+ 5 | 15 | 6 | 3
+ 4 | 12 | 5 | 2
+ 3 | 9 | 4 | 1
+ 2 | 6 | 3 |
+ 1 | 3 | 2 |
(10 rows)
SELECT pg_get_viewdef('v_window');
pg_get_viewdef
----------------------------------------------------------------------------------------
+-----------------------------------------------------------------------------------------
SELECT i.i, +
- sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
- FROM generate_series(1, 10) i(i);
+ sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows, +
+ lag(i.i, 1) IGNORE NULLS OVER (ORDER BY i.i DESC) AS lagged_by_1, +
+ lag(i.i, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM generate_series(1, 10) i(i) +
+ WINDOW w AS (ORDER BY i.i);
(1 row)
-- with UNION
@@ -1033,5 +1044,165 @@ FROM empsalary GROUP BY depname;
25100 | 1 | 22600 | develop
(3 rows)
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+ val | ignore | respect
+-----+--------+---------
+ 1 | 3 | 3
+ 2 | 4 | 4
+ 3 | 5 |
+ 4 | 6 |
+ | 6 |
+ | 6 | 5
+ | 6 | 6
+ 5 | 7 | 7
+ 6 | |
+ 7 | |
+(10 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6ee3696..cda112f 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -222,9 +224,16 @@ SELECT sum(unique1) over
unique1
FROM tenk1 WHERE unique1 < 10;
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
+
SELECT * FROM v_window;
@@ -272,5 +281,48 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
depname
FROM empsalary GROUP BY depname;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+
+
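For readers skimming the regression output above, the IGNORE NULLS behaviour it encodes can be sketched outside PostgreSQL roughly as follows. This is a minimal stand-alone illustration, not the patch's code: `leadlag_ignore_nulls` and its parameters are hypothetical names, it assumes an offset of at least 1, and it does a plain linear scan with none of the caching or mark-position handling the patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: find the offset-th non-null row strictly after
 * (forward) or strictly before (!forward) row `current`.
 *
 * Returns true and stores the row index in *found_at, or returns false
 * if the scan runs off the partition first. Assumes offset >= 1.
 */
static bool
leadlag_ignore_nulls(const bool *isnull, long nrows, long current,
                     long offset, bool forward, long *found_at)
{
    long step = forward ? 1 : -1;
    long pos;

    assert(offset >= 1);

    for (pos = current + step; pos >= 0 && pos < nrows; pos += step)
    {
        if (isnull[pos])
            continue;           /* pretend null rows do not exist */
        if (--offset == 0)
        {
            *found_at = pos;    /* this is the row lead/lag should return */
            return true;
        }
    }
    return false;               /* out of partition: NULL or the default */
}
```

Run against the `unnest(ARRAY[1,2,3,4,NULL,NULL,NULL,5,6,7])` data with forward = true and offset = 2, this picks the same rows as the `lead(val, 2) IGNORE NULLS` column in the expected output.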
I gave this a quick look. It took me a while to figure out how to apply
it -- turns out you used the "ignore whitespace" option to diff, so the
only way to apply it was with patch -p1 --ignore-whitespace. Please
don't do that.
I ran pgindent over the patched code and there were a number of changes.
I suggest you run that over your local copy before your next submission,
to keep the next official run from mangling your stuff in unforeseen ways.
For instance, the comment starting with "A slight edge case" would be
mangled; I suggest enclosing that in /*---- to avoid the problem.
(TBH that's the only interesting thing, but avoiding that kind of
breakage is worth it IMHO.)
First thing I noticed was the funky bms_initialize thingy. There was
some controversy upthread about the use of bitmapsets, and it seems you
opted for not using them for the constant case as suggested by Jeff; but
apparently the other comment by Robert about the custom bms initializer
went largely ignored. I agree with him that this is grotty. However,
the current API to get partition-local memory is too limited as there's
no way to change to the partition's context; instead you only get the
option to allocate a certain amount of memory and return that. I think
the easiest way to get around this problem is to create a new
windowapi.h function which returns the MemoryContext for the partition.
Then you can just allocate the BMS in that context.
But how do we ensure that the BMS is allocated in a context? You'd have
to switch contexts each time you call bms_add_member. I don't have a
good answer to this. I used this code in another project:
    /*
     * grow the "visited" bitmapset to the index' current size, to avoid
     * repeated repalloc's
     */
    {
        BlockNumber lastblock;

        lastblock = RelationGetNumberOfBlocks(rel);
        visited = bms_add_member(visited, lastblock);
        visited = bms_del_member(visited, lastblock);
    }
This way, I know the bitmapset already has enough space for all the bits
I need and there will be no further allocation. But this is also
grotty. Maybe we should have a new entry point in bitmapset.h, say
"bms_grow" that ensures you have enough space for that many bits. Or
perhaps add a MemoryContext member to struct Bitmapset, so that all
allocations occur therein.
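To make the "bms_grow" idea concrete, here is a rough stand-alone sketch of an ensure-capacity entry point. It is only an illustration under stated assumptions: `ToyBitset`, `toy_grow`, `toy_add` and `toy_is_member` are hypothetical names, the storage uses plain `realloc` rather than PostgreSQL's allocator, and nothing here is the real bitmapset.c API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_WORD 64

/* Hypothetical toy bitset; not PostgreSQL's Bitmapset. */
typedef struct ToyBitset
{
    size_t    nwords;           /* number of allocated words */
    uint64_t *words;            /* bit storage, NULL while empty */
} ToyBitset;

/* Ensure bits 0 .. nbits-1 fit, so later additions never reallocate. */
static int
toy_grow(ToyBitset *bs, size_t nbits)
{
    size_t    needed = (nbits + BITS_PER_WORD - 1) / BITS_PER_WORD;
    uint64_t *p;

    if (needed <= bs->nwords)
        return 0;               /* already large enough */
    p = realloc(bs->words, needed * sizeof(uint64_t));
    if (p == NULL)
        return -1;              /* the old storage is still valid */
    /* realloc leaves the new tail uninitialized, so clear it explicitly */
    memset(p + bs->nwords, 0, (needed - bs->nwords) * sizeof(uint64_t));
    bs->words = p;
    bs->nwords = needed;
    return 0;
}

/* Set bit x; the caller must have grown the set far enough beforehand. */
static void
toy_add(ToyBitset *bs, size_t x)
{
    assert(x / BITS_PER_WORD < bs->nwords);
    bs->words[x / BITS_PER_WORD] |= (uint64_t) 1 << (x % BITS_PER_WORD);
}

/* Test bit x; anything beyond the allocated words is simply absent. */
static int
toy_is_member(const ToyBitset *bs, size_t x)
{
    if (x / BITS_PER_WORD >= bs->nwords)
        return 0;
    return (int) ((bs->words[x / BITS_PER_WORD] >> (x % BITS_PER_WORD)) & 1);
}
```

With something of this shape, a caller would grow once to the partition's size and then add members freely, without the add-then-delete trick shown above.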
I'm not too sure I follow the parse_agg.c changes, but don't you need to
free the clone in the branch that finds a duplicate window spec? Is
this parent/child relationship thingy a preexisting concept, or are you
just coming up with it? It seems a bit unfamiliar to me.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Sep 26, 2013 at 4:20 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
But how do we ensure that the BMS is allocated in a context? You'd have
to switch contexts each time you call bms_add_member. I don't have a
good answer to this.
The coding of bms_add_member is pretty funky. Why doesn't it just
repalloc() the input argument if it's not big enough? If it did that,
the new allocation would be in the same memory context as the original
one, and we'd not need to care about the memory context when adding an
element to an already non-empty set.
But even if we did decide to switch memory contexts on every call, it
would still be much cleaner than this. It's only three lines of code
(and about as many instructions) to save and restore
CurrentMemoryContext.
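The save-and-restore pattern referred to here looks roughly like the sketch below. It is a toy model so that it compiles on its own: `ToyContext`, `toy_switch_to` and `alloc_in` are hypothetical stand-ins, with `toy_switch_to` mirroring the shape of PostgreSQL's `MemoryContextSwitchTo()` (make the argument current, return the previous context). No real allocation happens; the code only records which context would own it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory context; only its identity matters here. */
typedef struct ToyContext
{
    const char *name;
} ToyContext;

/* Plays the role of CurrentMemoryContext in this sketch. */
static ToyContext *toy_current_context = NULL;

/* Make ctx current and hand back whichever context was current before. */
static ToyContext *
toy_switch_to(ToyContext *ctx)
{
    ToyContext *old = toy_current_context;

    toy_current_context = ctx;
    return old;
}

/* Report the context that an allocation made right now would belong to. */
static ToyContext *
toy_alloc_owner(void)
{
    return toy_current_context;
}

/* "Allocate" in target, leaving the caller's current context untouched. */
static ToyContext *
alloc_in(ToyContext *target)
{
    ToyContext *old;
    ToyContext *owner;

    assert(target != NULL);
    old = toy_switch_to(target);    /* 1: save the old context and switch */
    owner = toy_alloc_owner();      /* 2: do the allocation */
    toy_switch_to(old);             /* 3: restore the saved context */
    return owner;
}
```

In backend code the middle step would be the `bms_add_member()` call, wrapped by the two switches.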
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 27.09.2013 14:12, Robert Haas wrote:
On Thu, Sep 26, 2013 at 4:20 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
But how do we ensure that the BMS is allocated in a context? You'd have
to switch contexts each time you call bms_add_member. I don't have a
good answer to this.
The coding of bms_add_member is pretty funky. Why doesn't it just
repalloc() the input argument if it's not big enough? If it did that,
the new allocation would be in the same memory context as the original
one, and we'd not need to care about the memory context when adding an
element to an already non-empty set.
But even if we did decide to switch memory contexts on every call, it
would still be much cleaner than this. It's only three lines of code
(and about as many instructions) to save and restore
CurrentMemoryContext.
[looks] Huh, yeah, as it stands, bms_add_member() is an accident waiting
to happen. It's totally non-obvious that it might cause the BMS to jump
from another memory context to CurrentMemoryContext. Same with
bms_add_members(). And it would be good to Assert in bms_join that both
arguments are in the same MemoryContext.
Let's fix that...
- Heikki
bms_add_member() is an accident waiting to happen
I've attached a patch that makes it use repalloc as suggested - is it
OK to commit separately? I'll address the lead-lag patch comments in
the next couple of days. Thanks -
Attachments:
repalloc.patch (application/octet-stream)
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..7b4d4db 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -634,17 +634,8 @@ bms_add_member(Bitmapset *a, int x)
bitnum = BITNUM(x);
if (wordnum >= a->nwords)
{
- /* Slow path: make a larger set and union the input set into it */
- Bitmapset *result;
- int nwords;
- int i;
-
- result = bms_make_singleton(x);
- nwords = a->nwords;
- for (i = 0; i < nwords; i++)
- result->words[i] |= a->words[i];
- pfree(a);
- return result;
+ a = (Bitmapset *) repalloc(a, BITMAPSET_SIZE(wordnum + 1));
+ a->nwords = wordnum + 1;
}
/* Fast path: x fits in existing set */
a->words[wordnum] |= ((bitmapword) 1 << bitnum);
On 29.09.2013 23:32, Nicholas White wrote:
bms_add_member() is an accident waiting to happen
I've attached a patch that makes it use repalloc as suggested
You'll have to zero out the extended portion.
I tried to demonstrate that by setting RANDOMIZE_ALLOCATED_MEMORY, but
surprisingly regression tests still passed. I guess the regression suite
doesn't use wide enough bitmapsets to exercise that. But this causes an
assertion failure, with RANDOMIZE_ALLOCATED_MEMORY:
create table t (i int4);
select * from t as t1, t as t2, t as t3, t as t4, t as t5, t as t6, t as
t7, t as t8, t as t9, t as t10, t as t11, t as t12, t as t13, t as t14,
t as t15, t as t16, t as t17, t as t18, t as t19, t as t20, t as t21, t
as t22, t as t23, t as t24, t as t25, t as t26, t as t27, t as t28, t as
t29, t as t30, t as t31, t as t32, t as t33, t as t34, t as t35, t as
t36, t as t37, t as t38, t as t39, t as t40;
- is it OK to commit separately? I'll address the lead-lag patch
comments in the next couple of days. Thanks
Yep, thanks. I committed the attached.
After thinking about this some more, I realized that bms_add_member() is
still sensitive to CurrentMemoryContext, if the 'a' argument is NULL.
That's probably OK for the lag&lead patch - I didn't check - but if
we're going to start relying on the fact that bms_add_member() no longer
allocates a new bms in CurrentMemoryContext, that needs to be
documented. bitmapset.c currently doesn't say a word about memory contexts.
So what needs to be done next is to document how the functions in
bitmapset.c work wrt. memory contexts. Then double-check that the
behavior of all the other "recycling" bms functions is consistent. (At
least bms_add_members() needs a similar change).
- Heikki
Attachments:
0001-In-bms_add_member-use-repalloc-if-the-bms-needs-to-b.patch (text/x-diff)
From ee01d848f39400c8524c66944ada6fde47894978 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon, 30 Sep 2013 16:37:00 +0300
Subject: [PATCH 1/1] In bms_add_member(), use repalloc() if the bms needs to
be enlarged.
Previously bms_add_member() would palloc a whole-new copy of the existing
set, copy the words, and pfree the old one. repalloc() is potentially much
faster, and more importantly, this is less surprising if CurrentMemoryContext
is not the same as the context the old set is in. bms_add_member() still
allocates a new bitmapset in CurrentMemoryContext if NULL is passed as
argument, but that is a lot less likely to induce bugs.
Nicholas White.
---
src/backend/nodes/bitmapset.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/src/backend/nodes/bitmapset.c b/src/backend/nodes/bitmapset.c
index b18b7a5..540db16 100644
--- a/src/backend/nodes/bitmapset.c
+++ b/src/backend/nodes/bitmapset.c
@@ -632,21 +632,20 @@ bms_add_member(Bitmapset *a, int x)
return bms_make_singleton(x);
wordnum = WORDNUM(x);
bitnum = BITNUM(x);
+
+ /* enlarge the set if necessary */
if (wordnum >= a->nwords)
{
- /* Slow path: make a larger set and union the input set into it */
- Bitmapset *result;
- int nwords;
+ int oldnwords = a->nwords;
int i;
- result = bms_make_singleton(x);
- nwords = a->nwords;
- for (i = 0; i < nwords; i++)
- result->words[i] |= a->words[i];
- pfree(a);
- return result;
+ a = (Bitmapset *) repalloc(a, BITMAPSET_SIZE(wordnum + 1));
+ a->nwords = wordnum + 1;
+ /* zero out the enlarged portion */
+ for (i = oldnwords; i < a->nwords; i++)
+ a->words[i] = 0;
}
- /* Fast path: x fits in existing set */
+
a->words[wordnum] |= ((bitmapword) 1 << bitnum);
return a;
}
--
1.8.4.rc3
I've attached another iteration of the lead-lag patch.
I suggest you run that over your local copy before your next submission
I ran pgindent before generating my patch (without -w this time), and
I've got a few more whitespace differences in the files that I
touched. I hope that hasn't added too much noise.
I suggest enclosing that in /*---- to avoid the problem.
Done
create a new windowapi.h function which returns the MemoryContext for the partition
...
But even if we did decide to switch memory contexts on every call, it would still be much cleaner than this.
I've removed all the bms_initalize code from the patch and am using
this solution. As the partition memory is zero-initialised I just
store a Bitmapset pointer in the WinGetPartitionLocalMemory. The
bms_add_member and bms_is_member functions behave sensibly for
null-pointer inputs (they return a bms_make_singleton in the current
memory context and false respectively). I've surrounded the calls to
bms_make_singleton with a memory context switch (to the partition's
context) so the Bitmapset stays in the partition's context.
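To illustrate the null-pointer behaviour described above, here is a minimal standalone sketch. It is not the PostgreSQL code: ToySet, toy_is_member, toy_add_member and PartitionSlot are hypothetical stand-ins, plain calloc/realloc replace palloc, and there is no memory context switch. It only shows why a zero-initialised slot holding a set pointer needs no separate initialisation step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define TOY_BITS ((int) (sizeof(unsigned) * 8))

/* Hypothetical stand-in for a Bitmapset; not the PostgreSQL struct. */
typedef struct ToySet
{
	int			nwords;
	unsigned   *words;
} ToySet;

/* Zero-initialised per-partition slot that holds only a set pointer. */
typedef struct PartitionSlot
{
	ToySet	   *seen;
} PartitionSlot;

/* A NULL set is treated as empty, so membership is simply false. */
static bool
toy_is_member(const ToySet *s, int x)
{
	assert(x >= 0);
	if (s == NULL || x / TOY_BITS >= s->nwords)
		return false;
	return ((s->words[x / TOY_BITS] >> (x % TOY_BITS)) & 1u) != 0;
}

/* Adding to a NULL set allocates one; callers must keep the return value. */
static ToySet *
toy_add_member(ToySet *s, int x)
{
	int			need;

	assert(x >= 0);
	need = x / TOY_BITS + 1;
	if (s == NULL)
	{
		s = calloc(1, sizeof(ToySet));
		if (s == NULL)
			abort();
	}
	if (need > s->nwords)
	{
		unsigned   *w = realloc(s->words, (size_t) need * sizeof(unsigned));

		if (w == NULL)
			abort();
		/* zero out the enlarged portion */
		memset(w + s->nwords, 0, (size_t) (need - s->nwords) * sizeof(unsigned));
		s->words = w;
		s->nwords = need;
	}
	s->words[x / TOY_BITS] |= 1u << (x % TOY_BITS);
	return s;
}
```

The first toy_add_member call on the zeroed slot creates the set, and lookups before that call just return false. In the real patch the allocation is additionally wrapped in a switch to the partition's memory context.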
Maybe we should have a new entry point in bitmapset.h, say "bms_grow" that ensures you have enough space for that many bits
This would be useful, as currently n additions require O(n) repallocs,
especially as I'm iterating through the indices in ascending order.
However, I'd rather "cheat" as I know the number of bits I'll need up
front; I can just set the (n+1)-th bit to force a single repalloc to
the final size. It's worth noting that other Bitmap implementations
(e.g. Java's java.util.BitSet) try to minimise re-allocations by
increasing the size to (e.g.) Max(2 * current size, n) if a re-size is
needed.
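As a rough illustration of the trade-off above, the following standalone sketch counts resizes for three approaches: growing to exactly the needed size, touching the highest bit first, and doubling. ToyBitmap, toy_set_bit and count_resizes are hypothetical names, the word size is assumed to be 32 bits, and realloc stands in for repalloc - it's not the bitmapset.c implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef enum
{
	GROW_EXACT,					/* grow to exactly the words needed */
	GROW_DOUBLING				/* grow to Max(2 * current size, needed) */
} GrowPolicy;

typedef struct ToyBitmap
{
	uint32_t   *words;
	int			nwords;
	int			resizes;		/* number of realloc calls so far */
	GrowPolicy	policy;
} ToyBitmap;

static void
toy_set_bit(ToyBitmap *bm, int bit)
{
	int			need;

	assert(bit >= 0);
	need = bit / 32 + 1;
	if (need > bm->nwords)
	{
		int			newsize = need;
		uint32_t   *w;

		if (bm->policy == GROW_DOUBLING && 2 * bm->nwords > need)
			newsize = 2 * bm->nwords;
		w = realloc(bm->words, (size_t) newsize * sizeof(uint32_t));
		if (w == NULL)
			abort();
		/* zero out the enlarged portion */
		memset(w + bm->nwords, 0,
			   (size_t) (newsize - bm->nwords) * sizeof(uint32_t));
		bm->words = w;
		bm->nwords = newsize;
		bm->resizes++;
	}
	bm->words[bit / 32] |= (uint32_t) 1 << (bit % 32);
}

/* Set bits 0 .. nbits-1 in ascending order and report the resize count. */
static int
count_resizes(int nbits, GrowPolicy policy, int presize)
{
	ToyBitmap	bm = {NULL, 0, 0, policy};
	int			i;
	int			resizes;

	if (presize)
		toy_set_bit(&bm, nbits);	/* the "cheat": touch the top bit first */
	for (i = 0; i < nbits; i++)
		toy_set_bit(&bm, i);
	resizes = bm.resizes;
	free(bm.words);
	return resizes;
}
```

For 1024 bits in ascending order this gives one resize per 32-bit word with exact growth, a single resize when the top bit is set first, and a logarithmic number with doubling.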
but don't you need to free the clone in the branch that finds a duplicate window spec?
Good catch - I've fixed that
Is this parent/child relationship thingy a preexisting concept
Yes, although it's not very well documented. I've added a lot of
documentation to the WindowDef struct in
src/include/nodes/parsenodes.h to explain which of the struct's
members use this mechanism. The WindowDef is very much like an object
in a higher-level language, where some of the members are 'virtual',
so use the parent's version if they don't have a value, and some
members are 'final', so values in these members in child WindowDefs are
ignored (i.e. the parent WindowDef's value is always used). I don't
think this degree of complexity is necessary for the lead-lag patch
alone, but since it was there I decided to take advantage of it.
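The 'virtual' / 'final' distinction can be sketched with a toy struct. This is only an analogy: ToyWindow, resolve_virtual and resolve_final are hypothetical names, the fields are plain strings, and the real parser resolves WindowDefs in its own way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a window definition with an optional parent. */
typedef struct ToyWindow
{
	const struct ToyWindow *parent; /* plays the role of refname */
	const char *virtual_val;		/* 'virtual': inherited only if unset */
	const char *final_val;			/* 'final': parent's value always wins */
} ToyWindow;

/* Use the local value if there is one, otherwise ask the parent. */
static const char *
resolve_virtual(const ToyWindow *w)
{
	if (w->virtual_val != NULL)
		return w->virtual_val;
	return w->parent ? resolve_virtual(w->parent) : NULL;
}

/* Any local value is ignored whenever a parent exists. */
static const char *
resolve_final(const ToyWindow *w)
{
	if (w->parent != NULL)
		return resolve_final(w->parent);
	return w->final_val;
}
```

A child that leaves its virtual member unset picks up the parent's value, while a child's final member is never consulted as long as it has a parent.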
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7dd1ef2..54ed0a4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12328,6 +12328,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12342,7 +12343,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12355,6 +12358,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12369,7 +12373,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12463,11 +12469,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 544ba98..fc7908b 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1981,6 +1981,18 @@ window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
* API exposed to window functions
***********************************************************************/
+/*
+ * WinGetPartitionMemoryContext
+ * Returns the memory context for the current partition. All
+ * allocations done by WinGetPartitionLocalMemory are put in
+ * this context.
+ */
+MemoryContext
+WinGetPartitionMemoryContext(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->partcontext;
+}
/*
* WinGetPartitionLocalMemory
@@ -2017,6 +2029,17 @@ WinGetCurrentPosition(WindowObject winobj)
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
+
+/*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a9812af..c59ab63 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -547,7 +548,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -577,7 +578,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11572,19 +11573,28 @@ filter_clause:
| /*EMPTY*/ { $$ = NULL; }
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
- n->location = @2;
+ n->location = @3;
$$ = n;
}
| /*EMPTY*/
@@ -12561,6 +12571,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12650,6 +12661,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 98cb58a..2bf0b00 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -579,28 +579,80 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
- WindowDef *refwin = (WindowDef *) lfirst(lc);
-
+ refwin = (WindowDef *) lfirst(lc);
winref++;
if (refwin->name && strcmp(refwin->name, windef->name) == 0)
- {
- wfunc->winref = winref;
break;
- }
}
+
if (lc == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ wfunc->winref = winref;
+ else
+ {
+ /*
+ * This is the window we want - but we have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS frame
+ * option) as we're not using the default (i.e. parent) frame
+ * options.
+ *
+ * We'll create a 'child' (using refname to inherit everything
+ * from the parent) that just overrides the frame options
+ * (assuming it doesn't already exist):
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ /*
+ * Add this new definition to the list. Note that there's a chance
+ * a window with this definition already exists!
+ */
+ winref = 0;
+ foreach(lc, pstate->p_windowdefs)
+ {
+ refwin = (WindowDef *) lfirst(lc);
+
+ winref++;
+ if (refwin->refname &&
+ strcmp(refwin->refname, clone->refname) == 0 &&
+ equal(refwin->partitionClause, clone->partitionClause) &&
+ equal(refwin->orderClause, clone->orderClause) &&
+ refwin->frameOptions == clone->frameOptions &&
+ equal(refwin->startOffset, clone->startOffset) &&
+ equal(refwin->endOffset, clone->endOffset))
+ {
+ /*
+ * found a duplicate window specification, don't need
+ * clone
+ */
+ wfunc->winref = winref;
+ pfree(clone);
+ break;
+ }
+ }
+ if (lc == NULL) /* didn't find it? */
+ {
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
+ }
}
else
{
@@ -819,8 +871,8 @@ check_ungrouped_columns_walker(Node *node,
* If we find an aggregate call of the original level, do not recurse into
* its arguments or filter; ungrouped vars there are not an error. We can
* also skip looking at aggregates of higher levels, since they could not
- * possibly contain Vars of concern to us (see transformAggregateCall).
- * We do need to look at aggregates of lower levels, however.
+ * possibly contain Vars of concern to us (see transformAggregateCall). We
+ * do need to look at aggregates of lower levels, however.
*/
if (IsA(node, Aggref) &&
(int) ((Aggref *) node)->agglevelsup >= context->sublevels_up)
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 2bd24c8..7066046 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -391,13 +391,13 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
*/
if (nargs > 0 && vatype == ANYOID && func_variadic)
{
- Oid va_arr_typid = actual_arg_types[nargs - 1];
+ Oid va_arr_typid = actual_arg_types[nargs - 1];
if (!OidIsValid(get_element_type(va_arr_typid)))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("VARIADIC argument must be an array"),
- parser_errposition(pstate, exprLocation((Node *) llast(fargs)))));
+ parser_errposition(pstate, exprLocation((Node *) llast(fargs)))));
}
/* build the appropriate output structure */
@@ -522,6 +522,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
errmsg("FILTER is not implemented in non-aggregate window functions"),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("IGNORE NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 9a1d12e..78b0738 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4764,11 +4764,16 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
- appendStringInfoString(buf, quote_identifier(wc->refname));
+ const char *quoted_refname = quote_identifier(wc->refname);
+
+ refname_len = strlen(quoted_refname);
+ appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
@@ -4850,7 +4855,20 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
- appendStringInfoChar(buf, ')');
+
+ /*
+ * We'll tidy up the output slightly; if we've got a refname, but haven't
+ * overridden the partition-by, order-by or any of the frame flags
+ * relevant inside the window def's ()s, then we'll be left with
+ * "(<refname>)". We'll trim off the brackets in this case:
+ */
+ if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
+ {
+ memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
+ buf->len -= 1; /* the trailing ")" */
+ }
+ else
+ appendStringInfoChar(buf, ')');
}
/* ----------
@@ -7493,7 +7511,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
@@ -7501,6 +7519,10 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..0150886 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,6 +13,7 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
#include "windowapi.h"
@@ -24,6 +25,18 @@ typedef struct rank_context
int64 rank; /* current rank */
} rank_context;
+
+typedef struct leadlag_const_context
+{
+ int64 next; /* the index of the lead / lagged value */
+} leadlag_const_context;
+
+/*
+ * lead-lag process helpers
+ */
+#define ISNULL_INDEX(i) (2 * (i))
+#define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
/*
* ntile process information
*/
@@ -280,7 +293,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +304,24 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+
+ /**
+ * A ** pointer as we keep a Bitmapset * in the partition context, and
+ * WinGetPartitionLocalMemory returns a pointer to whatever's in the
+ * context.
+ */
+ Bitmapset **null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't mark
+ * at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,21 +335,245 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if (!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls && !const_offset)
+ {
+ int64 scanning,
+ current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't
+ * exist". This means that we'll scan from the current row an 'offset'
+ * number of non-null rows, and then return that one.
+ *
+ * As the offset isn't constant we need efficient random access to the
+ * partition, as we'll check up to O(partition size) tuples for each
+ * row we're calculating the window function value for.
+ */
+
+ null_values = (Bitmapset **) WinGetPartitionLocalMemory(winobj, sizeof(Bitmapset *));
+
+ if (*null_values == NULL)
+ {
+ MemoryContext oldcxt;
+
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones
+ * we've accessed (more specifically, if they're null or not).
+ * We'll need one bit for whether the value is null and one bit
+ * for whether we've checked that tuple or not. We'll keep these
+ * two bits together (as opposed to having two separate bitmaps)
+ * to improve cache locality.
+ *
+ * However, we'd lose the efficiency gains if we keep having to
+ * resize the Bitmapset (by setting higher and higher bits). We
+ * know the maximum number of bits we'll ever need, so we'll use
+ * bms_make_singleton to force our Bitmapset up to the required
+ * size.
+ */
+ int64 bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ oldcxt = MemoryContextSwitchTo(WinGetPartitionMemoryContext(winobj));
+ *null_values = bms_make_singleton(bits_needed + 1);
+ MemoryContextSwitchTo(oldcxt);
+ }
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be
+ * in the opposite direction to the way we're scanning. We'll then
+ * force offset to be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the
+ * default value, but not whatever's left in result. We'll use
+ * the isnull flag to say "ignore it"!
+ */
+ isnull = true;
+ result = (Datum) 0;
+
+ break;
+ }
+
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), *null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), *null_values);
+ }
+ else
+ {
+ /*
+ * first time we've accessed this index; let's see if it's
+ * null:
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ if (isout)
+ break;
+
+ *null_values = bms_add_member(*null_values, HAVESCANNED_INDEX(scanning));
+ if (isnull)
+ {
+ *null_values = bms_add_member(*null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a
+ * chance that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to
+ * the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*--------
+ * A slight edge case. Consider:
+ *
+ * =================
+ * A | lag(A, 1)
+ * =================
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * =================
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ *--------
+ */
+ --offset;
+ }
+ }
+ }
+ else if (ignore_nulls /* && const_offset */ )
+ {
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ leadlag_const_context *context = WinGetPartitionLocalMemory(
+ winobj,
+ sizeof(leadlag_const_context));
+ int count_forward = 0;
+
+ /*
+ * Set the forward flag based on the direction of traversal - remember
+ * we can have a LEAD or LAG of -1, and that should be equivalent to a
+ * LAG or LEAD of 1 respectively.
+ */
+ forward = offset == 0 ? forward : (offset > 0);
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
- &isnull, &isout);
+ if (WinGetCurrentPosition(winobj) == 0)
+ if (forward)
+ count_forward = offset;
+ else
+ context->next = offset; /* LAG, so offset is negative */
+ else
+ {
+ /*
+ * LEADs and LAGs are actually pretty similar - the decision of
+ * whether or not to push our offset value forwards depends on whether
+ * the current row (for LEADs) or the previous row (for LAGs) is NULL
+ * - hence the (forward ? 0 : -1) below.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ forward ? 0 : -1,
+ WINDOW_SEEK_CURRENT,
+ forward,
+ &isnull, &isout);
+ if (!isnull)
+ count_forward = 1;
+ }
+
+ /*
+ * Count forward through the rows, skipping nulls and terminating if
+ * we run off the end of the window.
+ */
+ for (; count_forward > 0 && !isout; --count_forward)
+ {
+ do
+ {
+ /*
+ * Conveniently, calling WinGetFuncArgInPartition with an
+ * absolute index less than zero (correctly) sets isout and
+ * isnull to true
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ ++(context->next),
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ while (isnull && !isout);
+ }
+
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->next,
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
/*
- * target row is out of the partition; supply default value if
- * provided. otherwise it'll stay NULL
+ * Target row is out of the partition; supply default value if
+ * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * Don't return whatever's lying around in result, force the
+ * output to null if there's no default.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 51fef68..a26fc01 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -128,7 +128,7 @@ typedef struct Query
List *targetList; /* target list (of TargetEntry) */
- List *withCheckOptions; /* a list of WithCheckOption's */
+ List *withCheckOptions; /* a list of WithCheckOption's */
List *returningList; /* return-values list (of TargetEntry) */
@@ -406,19 +406,35 @@ typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
- * implies overriding the window frame clause.
+ * implies overriding the window frame clause. In this case, see the
+ * per-field comments to determine what the semantics are:
+ * VIRTUAL:
+ * If NULL, then the parent's (refname) value is used.
+ * MANDATORY:
+ * Never inherited from the parent, so must be specified -
+ * but can be NULL.
+ * SUPER:
+ * Always inherited from parent, any local version ignored.
*/
typedef struct WindowDef
{
NodeTag type;
- char *name; /* window's own name */
- char *refname; /* referenced window name, if any */
- List *partitionClause; /* PARTITION BY expression list */
- List *orderClause; /* ORDER BY (list of SortBy) */
- int frameOptions; /* frame_clause options, see below */
- Node *startOffset; /* expression for starting bound, if any */
- Node *endOffset; /* expression for ending bound, if any */
- int location; /* parse location, or -1 if none/unknown */
+ /* window's own name [MANDATORY value of NULL] */
+ char *name;
+ /* referenced window name, if any [MANDATORY] */
+ char *refname;
+ /* PARTITION BY expression list [SUPER] */
+ List *partitionClause;
+ /* ORDER BY (list of SortBy) [VIRTUAL] */
+ List *orderClause;
+ /* frame_clause options, see below [MANDATORY] */
+ int frameOptions;
+ /* expression for starting bound, if any [MANDATORY] */
+ Node *startOffset;
+ /* expression for ending bound, if any [MANDATORY] */
+ Node *endOffset;
+ /* parse location, or -1 if none/unknown [MANDATORY] */
+ int location;
} WindowDef;
/*
@@ -443,6 +459,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
@@ -655,10 +672,10 @@ typedef struct XmlSerialize
*
* The same comments apply to FUNCTION RTEs when the function's return type
* is a named composite type. In addition, for all return types, FUNCTION
- * RTEs with ORDINALITY must always have the last colname entry being the
- * one for the ordinal column; this is enforced when constructing the RTE.
- * Thus when ORDINALITY is used, there will be exactly one more colname
- * than would have been present otherwise.
+ * RTEs with ORDINALITY must always have the last colname entry being the
+ * one for the ordinal column; this is enforced when constructing the RTE.
+ * Thus when ORDINALITY is used, there will be exactly one more colname
+ * than would have been present otherwise.
*
* In JOIN RTEs, the colnames in both alias and eref are one-to-one with
* joinaliasvars entries. A JOIN RTE will omit columns of its inputs when
@@ -760,18 +777,18 @@ typedef struct RangeTblEntry
* If the function returns an otherwise-unspecified RECORD, funccoltypes
* lists the column types declared in the RTE's column type specification,
* funccoltypmods lists their declared typmods, funccolcollations their
- * collations. Note that in this case, ORDINALITY is not permitted, so
+ * collations. Note that in this case, ORDINALITY is not permitted, so
* there is no extra ordinal column to be allowed for.
*
- * Otherwise, those fields are NIL, and the result column types must be
+ * Otherwise, those fields are NIL, and the result column types must be
* derived from the funcexpr while treating the ordinal column, if
- * present, as a special case. (see get_rte_attribute_*)
+ * present, as a special case. (see get_rte_attribute_*)
*/
Node *funcexpr; /* expression tree for func call */
List *funccoltypes; /* OID list of column type OIDs */
List *funccoltypmods; /* integer list of column typmods */
List *funccolcollations; /* OID list of column collation OIDs */
- bool funcordinality; /* is this called WITH ORDINALITY? */
+ bool funcordinality; /* is this called WITH ORDINALITY? */
/*
* Fields valid for a values RTE (else NIL):
@@ -811,10 +828,10 @@ typedef struct RangeTblEntry
typedef struct WithCheckOption
{
NodeTag type;
- char *viewname; /* name of view that specified the WCO */
- Node *qual; /* constraint qual to check */
- bool cascaded; /* true = WITH CASCADED CHECK OPTION */
-} WithCheckOption;
+ char *viewname; /* name of view that specified the WCO */
+ Node *qual; /* constraint qual to check */
+ bool cascaded; /* true = WITH CASCADED CHECK OPTION */
+} WithCheckOption;
/*
* SortGroupClause -
@@ -2371,7 +2388,7 @@ typedef enum ViewCheckOption
NO_CHECK_OPTION,
LOCAL_CHECK_OPTION,
CASCADED_CHECK_OPTION
-} ViewCheckOption;
+} ViewCheckOption;
typedef struct ViewStmt
{
@@ -2381,7 +2398,7 @@ typedef struct ViewStmt
Node *query; /* the SELECT query */
bool replace; /* replace an existing view? */
List *options; /* options from WITH clause */
- ViewCheckOption withCheckOption; /* WITH CHECK OPTION */
+ ViewCheckOption withCheckOption; /* WITH CHECK OPTION */
} ViewStmt;
/* ----------------------
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 8bd34d6..9196b41 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -180,6 +180,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -314,6 +315,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..bcd1719 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -41,11 +41,14 @@ typedef struct WindowObjectData *WindowObject;
#define WindowObjectIsValid(winobj) \
((winobj) != NULL && IsA(winobj, WindowObjectData))
+extern MemoryContext WinGetPartitionMemoryContext(WindowObject winobj);
extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 7b31d13..e5a248e 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -931,30 +933,39 @@ FROM tenk1 WHERE unique1 < 10;
17 | 9
(10 rows)
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
SELECT * FROM v_window;
- i | sum_rows
-----+----------
- 1 | 3
- 2 | 6
- 3 | 9
- 4 | 12
- 5 | 15
- 6 | 18
- 7 | 21
- 8 | 24
- 9 | 27
- 10 | 19
+ i | sum_rows | lagged_by_1 | lagged_by_2
+----+----------+-------------+-------------
+ 10 | 19 | | 8
+ 9 | 27 | 10 | 7
+ 8 | 24 | 9 | 6
+ 7 | 21 | 8 | 5
+ 6 | 18 | 7 | 4
+ 5 | 15 | 6 | 3
+ 4 | 12 | 5 | 2
+ 3 | 9 | 4 | 1
+ 2 | 6 | 3 |
+ 1 | 3 | 2 |
(10 rows)
SELECT pg_get_viewdef('v_window');
- pg_get_viewdef
----------------------------------------------------------------------------------------
- SELECT i.i, +
- sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
- FROM generate_series(1, 10) i(i);
+ pg_get_viewdef
+-----------------------------------------------------------------------------------------
+ SELECT i.i, +
+ sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows, +
+ lag(i.i, 1) IGNORE NULLS OVER (ORDER BY i.i DESC) AS lagged_by_1, +
+ lag(i.i, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM generate_series(1, 10) i(i) +
+ WINDOW w AS (ORDER BY i.i);
(1 row)
-- with UNION
@@ -1033,5 +1044,197 @@ FROM empsalary GROUP BY depname;
25100 | 1 | 22600 | develop
(3 rows)
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
+-- test non-deterministic lag (i.e. the lag-by value depends on the data)
+ALTER TABLE empsalary ADD an_offset INTEGER NOT NULL DEFAULT 1;
+SELECT term_date, lead(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+ val | ignore | respect
+-----+--------+---------
+ 1 | 3 | 3
+ 2 | 4 | 4
+ 3 | 5 |
+ 4 | 6 |
+ | 6 |
+ | 6 | 5
+ | 6 | 6
+ 5 | 7 | 7
+ 6 | |
+ 7 | |
+(10 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6ee3696..2aac686 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -222,9 +224,16 @@ SELECT sum(unique1) over
unique1
FROM tenk1 WHERE unique1 < 10;
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
+
SELECT * FROM v_window;
@@ -272,5 +281,53 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
depname
FROM empsalary GROUP BY depname;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- test non-deterministic lag (i.e. the lag-by value depends on the data)
+ALTER TABLE empsalary ADD an_offset INTEGER NOT NULL DEFAULT 1;
+
+SELECT term_date, lead(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
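
For anyone reading the regression tests above, here is a minimal standalone C model of what lead(val, 2) IGNORE NULLS is expected to return. It only sketches the semantics ("evaluate as if the rows containing nulls didn't exist") and is not the patch's executor code. The Cell type and the lead_ignore_nulls function are hypothetical names invented for this illustration, and the model assumes a positive offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One input row: a value plus a null flag (hypothetical type for this sketch). */
typedef struct
{
    int  value;
    bool isnull;
} Cell;

/*
 * Model of lead(val, offset) IGNORE NULLS for the row at index `current`.
 * Walks forward from the next row, counting only non-null rows, and stops
 * at the offset-th one. Returns true and stores the value in *out when such
 * a row exists; returns false (i.e. SQL NULL) when the scan runs off the end
 * of the partition.
 */
bool
lead_ignore_nulls(const Cell *rows, size_t nrows, size_t current,
                  int offset, int *out)
{
    assert(rows != NULL && out != NULL && current < nrows && offset > 0);

    for (size_t i = current + 1; i < nrows; i++)
    {
        if (rows[i].isnull)
            continue;           /* pretend this row does not exist */
        if (--offset == 0)
        {
            *out = rows[i].value;
            return true;
        }
    }
    return false;               /* ran out of non-null rows */
}
```

Fed the same ten values as test_table (1, 2, 3, 4, NULL, NULL, NULL, 5, 6, 7) with an offset of 2, this model yields 3, 4, 5, 6, 6, 6, 6, 7 and then no value for the last two rows, which matches the expected "lead" column above.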
Nicholas White wrote:
But even if we did decide to switch memory contexts on every call, it would still be much cleaner than this.
I've removed all the bms_initalize code from the patch and am using
this solution. As the partition memory is zero-initialised I just
store a Bitmapset pointer in the WinGetPartitionLocalMemory. The
bms_add_member and bms_is_member functions behave sensibly for
null-pointer inputs (they return a bms_make_singleton in the current
memory context and false respectively). I've surrounded the calls to
bms_make_singleton with a memory context switch (to the partition's
context) so the Bitmapset stays in the partition's context.
Now that I look again, would GetMemoryChunkContext() be useful here?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
GetMemoryChunkContext
Indeed - you can call that with the value WinGetPartitionLocalMemory
returns to get the MemoryContext for the partition - so I've removed
the function from windowapi.h that I added to get that. See attached -
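
The idea discussed in this thread (a bitmap pointer that starts out NULL in zero-initialised partition memory and is allocated on first use) can be sketched roughly as below. This is a standalone model, not the patch itself: it uses calloc and a plain byte array instead of PostgreSQL's Bitmapset and memory contexts, and NullCache, null_cache_record and null_cache_lookup are hypothetical names. Only the two-bits-per-row layout (ISNULL_INDEX / HAVESCANNED_INDEX) is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Same bit layout as the patch: two adjacent bits per row. */
#define ISNULL_INDEX(i)      (2 * (i))
#define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)

/* Hypothetical stand-in for the zero-initialised partition-local memory. */
typedef struct
{
    uint8_t *bits;              /* NULL until the first row is recorded */
    size_t   nrows;             /* number of rows in the partition */
} NullCache;

static void
set_bit(uint8_t *bits, size_t n)
{
    bits[n / 8] |= (uint8_t) (1u << (n % 8));
}

static bool
get_bit(const uint8_t *bits, size_t n)
{
    return ((bits[n / 8] >> (n % 8)) & 1u) != 0;
}

/*
 * Remember whether row i is null. The bitmap is allocated at full size on
 * first use, so it never has to grow. Returns false if allocation fails.
 */
bool
null_cache_record(NullCache *cache, size_t i, bool isnull)
{
    assert(cache != NULL && i < cache->nrows);

    if (cache->bits == NULL)
    {
        cache->bits = calloc((2 * cache->nrows + 7) / 8, 1);
        if (cache->bits == NULL)
            return false;
    }
    set_bit(cache->bits, HAVESCANNED_INDEX(i));
    if (isnull)
        set_bit(cache->bits, ISNULL_INDEX(i));
    return true;
}

/*
 * Returns true if row i has already been scanned, storing its null status
 * in *isnull; returns false if the caller still has to fetch the tuple.
 */
bool
null_cache_lookup(const NullCache *cache, size_t i, bool *isnull)
{
    assert(cache != NULL && isnull != NULL && i < cache->nrows);

    if (cache->bits == NULL || !get_bit(cache->bits, HAVESCANNED_INDEX(i)))
        return false;
    *isnull = get_bit(cache->bits, ISNULL_INDEX(i));
    return true;
}
```

A caller would start from a zeroed NullCache with nrows set, try null_cache_lookup first, and only fetch the tuple and call null_cache_record on a miss. In the real patch the allocation would have to happen in the partition's memory context rather than through calloc.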
Attachments:
lead-lag-ignore-nulls.patch (application/octet-stream)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7dd1ef2..54ed0a4 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12328,6 +12328,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12342,7 +12343,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12355,6 +12358,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -12369,7 +12373,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -12463,11 +12469,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 544ba98..23cd972 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1981,7 +1981,6 @@ window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
* API exposed to window functions
***********************************************************************/
-
/*
* WinGetPartitionLocalMemory
* Get working memory that lives till end of partition processing
@@ -2017,6 +2016,17 @@ WinGetCurrentPosition(WindowObject winobj)
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
+
+/*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index a9812af..c59ab63 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -288,6 +288,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -547,7 +548,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -577,7 +578,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11572,19 +11573,28 @@ filter_clause:
| /*EMPTY*/ { $$ = NULL; }
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
- n->location = @2;
+ n->location = @3;
$$ = n;
}
| /*EMPTY*/
@@ -12561,6 +12571,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12650,6 +12661,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 98cb58a..2bf0b00 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -579,28 +579,80 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
- WindowDef *refwin = (WindowDef *) lfirst(lc);
-
+ refwin = (WindowDef *) lfirst(lc);
winref++;
if (refwin->name && strcmp(refwin->name, windef->name) == 0)
- {
- wfunc->winref = winref;
break;
- }
}
+
if (lc == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ wfunc->winref = winref;
+ else
+ {
+ /*
+ * This is the window we want - but we have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS frame
+ * option) as we're not using the default (i.e. parent) frame
+ * options.
+ *
+ * We'll create a 'child' (using refname to inherit everything
+ * from the parent) that just overrides the frame options
+ * (assuming it doesn't already exist):
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ /*
+ * Add this new definition to the list. Note that there's a chance
+ * a window with this definition already exists!
+ */
+ winref = 0;
+ foreach(lc, pstate->p_windowdefs)
+ {
+ refwin = (WindowDef *) lfirst(lc);
+
+ winref++;
+ if (refwin->refname &&
+ strcmp(refwin->refname, clone->refname) == 0 &&
+ equal(refwin->partitionClause, clone->partitionClause) &&
+ equal(refwin->orderClause, clone->orderClause) &&
+ refwin->frameOptions == clone->frameOptions &&
+ equal(refwin->startOffset, clone->startOffset) &&
+ equal(refwin->endOffset, clone->endOffset))
+ {
+ /*
+ * found a duplicate window specification, don't need
+ * clone
+ */
+ wfunc->winref = winref;
+ pfree(clone);
+ break;
+ }
+ }
+ if (lc == NULL) /* didn't find it? */
+ {
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
+ }
}
else
{
@@ -819,8 +871,8 @@ check_ungrouped_columns_walker(Node *node,
* If we find an aggregate call of the original level, do not recurse into
* its arguments or filter; ungrouped vars there are not an error. We can
* also skip looking at aggregates of higher levels, since they could not
- * possibly contain Vars of concern to us (see transformAggregateCall).
- * We do need to look at aggregates of lower levels, however.
+ * possibly contain Vars of concern to us (see transformAggregateCall). We
+ * do need to look at aggregates of lower levels, however.
*/
if (IsA(node, Aggref) &&
(int) ((Aggref *) node)->agglevelsup >= context->sublevels_up)
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 2bd24c8..7066046 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -391,13 +391,13 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
*/
if (nargs > 0 && vatype == ANYOID && func_variadic)
{
- Oid va_arr_typid = actual_arg_types[nargs - 1];
+ Oid va_arr_typid = actual_arg_types[nargs - 1];
if (!OidIsValid(get_element_type(va_arr_typid)))
ereport(ERROR,
(errcode(ERRCODE_DATATYPE_MISMATCH),
errmsg("VARIADIC argument must be an array"),
- parser_errposition(pstate, exprLocation((Node *) llast(fargs)))));
+ parser_errposition(pstate, exprLocation((Node *) llast(fargs)))));
}
/* build the appropriate output structure */
@@ -522,6 +522,23 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
errmsg("FILTER is not implemented in non-aggregate window functions"),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("RESPECT NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location)));
+ }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 9a1d12e..78b0738 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4764,11 +4764,16 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
- appendStringInfoString(buf, quote_identifier(wc->refname));
+ const char *quoted_refname = quote_identifier(wc->refname);
+
+ refname_len = strlen(quoted_refname);
+ appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
@@ -4850,7 +4855,20 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
- appendStringInfoChar(buf, ')');
+
+ /*
+ * We'll tidy up the output slightly; if we've got a refname, but haven't
+ * overridden the partition-by, order-by or any of the frame flags
+ * relevant inside the window def's ()s, then we'll be left with
+ * "(<refname>)". We'll trim off the brackets in this case:
+ */
+ if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
+ {
+ memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
+ buf->len -= 1; /* the trailing ")" */
+ }
+ else
+ appendStringInfoChar(buf, ')');
}
/* ----------
@@ -7493,7 +7511,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
@@ -7501,6 +7519,10 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b7c42d3..280b96d 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,7 +13,9 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
+#include "utils/memutils.h"
#include "windowapi.h"
/*
@@ -24,6 +26,18 @@ typedef struct rank_context
int64 rank; /* current rank */
} rank_context;
+
+typedef struct leadlag_const_context
+{
+ int64 next; /* the index of the lead / lagged value */
+} leadlag_const_context;
+
+/*
+ * lead-lag process helpers
+ */
+#define ISNULL_INDEX(i) (2 * (i))
+#define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
/*
* ntile process information
*/
@@ -280,7 +294,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +305,24 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+
+ /**
+ * A ** pointer as we keep a Bitmapset * in the partition context, and
+ * WinGetPartitionLocalMemory returns a pointer to whatever's in the
+ * context.
+ */
+ Bitmapset **null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't mark
+ * at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,21 +336,245 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if (!forward)
+ {
+ offset = -offset;
+ }
+
+ if (ignore_nulls && !const_offset)
+ {
+ int64 scanning,
+ current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't
+ * exist". This means that we'll scan from the current row an 'offset'
+ * number of non-null rows, and then return that one.
+ *
+ * As the offset isn't constant we need efficient random access to the
+ * partition, as we'll check up to O(partition size) tuples for each
+ * row we're calculating the window function value for.
+ */
+
+ null_values = (Bitmapset **) WinGetPartitionLocalMemory(winobj, sizeof(Bitmapset *));
+
+ if (*null_values == NULL)
+ {
+ MemoryContext oldcxt;
+
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones
+ * we've accessed (more specifically, if they're null or not).
+ * We'll need one bit for whether the value is null and one bit
+ * for whether we've checked that tuple or not. We'll keep these
+ * two bits together (as opposed to having two separate bitmaps)
+ * to improve cache locality.
+ *
+ * However, we'd lose the efficiency gains if we keep having to
+ * resize the Bitmapset (by setting higher and higher bits). We
+ * know the maximum number of bits we'll ever need, so we'll use
+ * bms_make_singleton to force our Bitmapset up to the required
+ * size.
+ */
+ int64 bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ oldcxt = MemoryContextSwitchTo(GetMemoryChunkContext(null_values));
+ *null_values = bms_make_singleton(bits_needed + 1);
+ MemoryContextSwitchTo(oldcxt);
+ }
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be
+ * in the opposite direction to the way we're scanning. We'll then
+ * force offset to be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the
+ * default value, but not whatever's left in result. We'll use
+ * the isnull flag to say "ignore it"!
+ */
+ isnull = true;
+ result = (Datum) 0;
+
+ break;
+ }
+
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), *null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), *null_values);
+ }
+ else
+ {
+ /*
+ * first time we've accessed this index; let's see if it's
+ * null:
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ if (isout)
+ break;
+
+ bms_add_member(*null_values, HAVESCANNED_INDEX(scanning));
+ if (isnull)
+ {
+ bms_add_member(*null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a
+ * chance that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to
+ * the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*--------
+ * A slight edge case. Consider:
+ *
+ * =================
+ * A | lag(A, 1)
+ * =================
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * =================
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ *--------
+ */
+ --offset;
+ }
+ }
+ }
+ else if (ignore_nulls /* && const_offset */ )
+ {
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ leadlag_const_context *context = WinGetPartitionLocalMemory(
+ winobj,
+ sizeof(leadlag_const_context));
+ int count_forward = 0;
+
+ /*
+ * Set the forward flag based on the direction of traversal - remember
+ * we can have a LEAD or LAG of -1, and that should be equivalent to a
+ * LAG or LEAD of 1 respectively.
+ */
+ forward = offset == 0 ? forward : (offset > 0);
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
- &isnull, &isout);
+ if (WinGetCurrentPosition(winobj) == 0)
+ if (forward)
+ count_forward = offset;
+ else
+ context->next = offset; /* LAG, so offset is negative */
+ else
+ {
+ /*
+ * LEADs and LAGs are actually pretty similar - the decision of
+ * whether or not to push our offset value forwards depends on the
+ * current row (for LEADs) or the previous row (for LAGs) is NULL
+ * - hence the (forward ? 0 : -1) below.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ forward ? 0 : -1,
+ WINDOW_SEEK_CURRENT,
+ forward,
+ &isnull, &isout);
+ if (!isnull)
+ count_forward = 1;
+ }
+
+ /*
+ * Count forward through the rows, skipping nulls and terminating if
+ * we run off the end of the window.
+ */
+ for (; count_forward > 0 && !isout; --count_forward)
+ {
+ do
+ {
+ /*
+ * Conveniently, calling WinGetFuncArgInPartition with an
+ * absolute index less than zero (correctly) sets isout and
+ * isnull to true
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ ++(context->next),
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ while (isnull && !isout);
+ }
+
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->next,
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
/*
- * target row is out of the partition; supply default value if
- * provided. otherwise it'll stay NULL
+ * Target row is out of the partition; supply default value if
+ * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * Don't return whatever's lying around in result, force the
+ * output to null if there's no default.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 51fef68..a26fc01 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -128,7 +128,7 @@ typedef struct Query
List *targetList; /* target list (of TargetEntry) */
- List *withCheckOptions; /* a list of WithCheckOption's */
+ List *withCheckOptions; /* a list of WithCheckOption's */
List *returningList; /* return-values list (of TargetEntry) */
@@ -406,19 +406,35 @@ typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
- * implies overriding the window frame clause.
+ * implies overriding the window frame clause. In this case, the per-field
+ * comments to determine what the semantics are:
+ * VIRTUAL:
+ * If NULL, then the parent's (refname) value is used.
+ * MANDATORY:
+ * Never inherited from the parent, so must be specified -
+ * but can be NULL.
+ * SUPER:
+ * Always inherited from parent, any local version ignored.
*/
typedef struct WindowDef
{
NodeTag type;
- char *name; /* window's own name */
- char *refname; /* referenced window name, if any */
- List *partitionClause; /* PARTITION BY expression list */
- List *orderClause; /* ORDER BY (list of SortBy) */
- int frameOptions; /* frame_clause options, see below */
- Node *startOffset; /* expression for starting bound, if any */
- Node *endOffset; /* expression for ending bound, if any */
- int location; /* parse location, or -1 if none/unknown */
+ /* window's own name [MANDATORY value of NULL] */
+ char *name;
+ /* referenced window name, if any [MANDATORY] */
+ char *refname;
+ /* PARTITION BY expression list [VIRTUAL] */
+ List *partitionClause;
+ /* ORDER BY (list of SortBy) [SUPER] */
+ List *orderClause;
+ /* frame_clause options, see below [MANDATORY] */
+ int frameOptions;
+ /* expression for starting bound, if any [MANDATORY] */
+ Node *startOffset;
+ /* expression for ending bound, if any [MANDATORY] */
+ Node *endOffset;
+ /* parse location, or -1 if none/unknown [MANDATORY] */
+ int location;
} WindowDef;
/*
@@ -443,6 +459,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
@@ -655,10 +672,10 @@ typedef struct XmlSerialize
*
* The same comments apply to FUNCTION RTEs when the function's return type
* is a named composite type. In addition, for all return types, FUNCTION
- * RTEs with ORDINALITY must always have the last colname entry being the
- * one for the ordinal column; this is enforced when constructing the RTE.
- * Thus when ORDINALITY is used, there will be exactly one more colname
- * than would have been present otherwise.
+ * RTEs with ORDINALITY must always have the last colname entry being the
+ * one for the ordinal column; this is enforced when constructing the RTE.
+ * Thus when ORDINALITY is used, there will be exactly one more colname
+ * than would have been present otherwise.
*
* In JOIN RTEs, the colnames in both alias and eref are one-to-one with
* joinaliasvars entries. A JOIN RTE will omit columns of its inputs when
@@ -760,18 +777,18 @@ typedef struct RangeTblEntry
* If the function returns an otherwise-unspecified RECORD, funccoltypes
* lists the column types declared in the RTE's column type specification,
* funccoltypmods lists their declared typmods, funccolcollations their
- * collations. Note that in this case, ORDINALITY is not permitted, so
+ * collations. Note that in this case, ORDINALITY is not permitted, so
* there is no extra ordinal column to be allowed for.
*
- * Otherwise, those fields are NIL, and the result column types must be
+ * Otherwise, those fields are NIL, and the result column types must be
* derived from the funcexpr while treating the ordinal column, if
- * present, as a special case. (see get_rte_attribute_*)
+ * present, as a special case. (see get_rte_attribute_*)
*/
Node *funcexpr; /* expression tree for func call */
List *funccoltypes; /* OID list of column type OIDs */
List *funccoltypmods; /* integer list of column typmods */
List *funccolcollations; /* OID list of column collation OIDs */
- bool funcordinality; /* is this called WITH ORDINALITY? */
+ bool funcordinality; /* is this called WITH ORDINALITY? */
/*
* Fields valid for a values RTE (else NIL):
@@ -811,10 +828,10 @@ typedef struct RangeTblEntry
typedef struct WithCheckOption
{
NodeTag type;
- char *viewname; /* name of view that specified the WCO */
- Node *qual; /* constraint qual to check */
- bool cascaded; /* true = WITH CASCADED CHECK OPTION */
-} WithCheckOption;
+ char *viewname; /* name of view that specified the WCO */
+ Node *qual; /* constraint qual to check */
+ bool cascaded; /* true = WITH CASCADED CHECK OPTION */
+} WithCheckOption;
/*
* SortGroupClause -
@@ -2371,7 +2388,7 @@ typedef enum ViewCheckOption
NO_CHECK_OPTION,
LOCAL_CHECK_OPTION,
CASCADED_CHECK_OPTION
-} ViewCheckOption;
+} ViewCheckOption;
typedef struct ViewStmt
{
@@ -2381,7 +2398,7 @@ typedef struct ViewStmt
Node *query; /* the SELECT query */
bool replace; /* replace an existing view? */
List *options; /* options from WITH clause */
- ViewCheckOption withCheckOption; /* WITH CHECK OPTION */
+ ViewCheckOption withCheckOption; /* WITH CHECK OPTION */
} ViewStmt;
/* ----------------------
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 8bd34d6..9196b41 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -180,6 +180,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -314,6 +315,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 5bbf1fa..c6b179c 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index 7b31d13..e5a248e 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -5,19 +5,21 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
depname | empno | salary | sum
-----------+-------+--------+-------
@@ -931,30 +933,39 @@ FROM tenk1 WHERE unique1 < 10;
17 | 9
(10 rows)
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
SELECT * FROM v_window;
- i | sum_rows
-----+----------
- 1 | 3
- 2 | 6
- 3 | 9
- 4 | 12
- 5 | 15
- 6 | 18
- 7 | 21
- 8 | 24
- 9 | 27
- 10 | 19
+ i | sum_rows | lagged_by_1 | lagged_by_2
+----+----------+-------------+-------------
+ 10 | 19 | | 8
+ 9 | 27 | 10 | 7
+ 8 | 24 | 9 | 6
+ 7 | 21 | 8 | 5
+ 6 | 18 | 7 | 4
+ 5 | 15 | 6 | 3
+ 4 | 12 | 5 | 2
+ 3 | 9 | 4 | 1
+ 2 | 6 | 3 |
+ 1 | 3 | 2 |
(10 rows)
SELECT pg_get_viewdef('v_window');
- pg_get_viewdef
----------------------------------------------------------------------------------------
- SELECT i.i, +
- sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows+
- FROM generate_series(1, 10) i(i);
+ pg_get_viewdef
+-----------------------------------------------------------------------------------------
+ SELECT i.i, +
+ sum(i.i) OVER (ORDER BY i.i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows, +
+ lag(i.i, 1) IGNORE NULLS OVER (ORDER BY i.i DESC) AS lagged_by_1, +
+ lag(i.i, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM generate_series(1, 10) i(i) +
+ WINDOW w AS (ORDER BY i.i);
(1 row)
-- with UNION
@@ -1033,5 +1044,197 @@ FROM empsalary GROUP BY depname;
25100 | 1 | 22600 | develop
(3 rows)
+-- test null behaviour: (1) lags
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ |
+ 11-17-2009 |
+ | 11-17-2009
+ |
+ |
+(10 rows)
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ respect | lag
+---------+---------
+ frog |
+ | frog
+ | frog
+ chicken | frog
+ | chicken
+ | chicken
+ tiger | chicken
+ gorilla | tiger
+ | gorilla
+ | gorilla
+(10 rows)
+
+-- (2) leads
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ |
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 |
+ |
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, first_value(term_date) IGNORE NULLS OVER (...
+ ^
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY...
+ ^
+-- test non-deterministic lag (i.e. the lag-by value depends on the data)
+ALTER TABLE empsalary ADD an_offset INTEGER NOT NULL DEFAULT 1;
+SELECT term_date, lead(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lead
+------------+------------
+ | 03-05-2009
+ | 03-05-2009
+ 03-05-2009 | 09-22-2010
+ 09-22-2010 | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+ 11-17-2009 |
+ |
+ |
+ |
+(10 rows)
+
+SELECT term_date, lag(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+ term_date | lag
+------------+------------
+ |
+ |
+ 03-05-2009 |
+ 09-22-2010 | 03-05-2009
+ | 09-22-2010
+ | 09-22-2010
+ 11-17-2009 | 09-22-2010
+ | 11-17-2009
+ | 11-17-2009
+ | 11-17-2009
+(10 rows)
+
-- cleanup
DROP TABLE empsalary;
+-- some more test cases:
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+ val | lead
+-----+------
+ 1 | 3
+ 2 | 4
+ 3 | 5
+ 4 | 6
+ | 6
+ | 6
+ | 6
+ 5 | 7
+ 6 |
+ 7 |
+(10 rows)
+
+DROP TABLE test_table;
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
+ val | ignore | respect
+-----+--------+---------
+ 1 | 3 | 3
+ 2 | 4 | 4
+ 3 | 5 |
+ 4 | 6 |
+ | 6 |
+ | 6 | 5
+ | 6 | 6
+ 5 | 7 | 7
+ 6 | |
+ 7 | |
+(10 rows)
+
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 6ee3696..2aac686 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -6,20 +6,22 @@ CREATE TEMPORARY TABLE empsalary (
depname varchar,
empno bigint,
salary int,
- enroll_date date
+ enroll_date date,
+ term_date date,
+ respect text
);
INSERT INTO empsalary VALUES
-('develop', 10, 5200, '2007-08-01'),
-('sales', 1, 5000, '2006-10-01'),
-('personnel', 5, 3500, '2007-12-10'),
-('sales', 4, 4800, '2007-08-08'),
-('personnel', 2, 3900, '2006-12-23'),
-('develop', 7, 4200, '2008-01-01'),
-('develop', 9, 4500, '2008-01-01'),
-('sales', 3, 4800, '2007-08-01'),
-('develop', 8, 6000, '2006-10-01'),
-('develop', 11, 5200, '2007-08-15');
+('develop', 10, 5200, '2007-08-01', null, null),
+('sales', 1, 5000, '2006-10-01', null, 'frog'),
+('personnel', 5, 3500, '2007-12-10', null, null),
+('sales', 4, 4800, '2007-08-08', '2010-09-22', 'chicken'),
+('personnel', 2, 3900, '2006-12-23', null, null),
+('develop', 7, 4200, '2008-01-01', null, null),
+('develop', 9, 4500, '2008-01-01', null, 'gorilla'),
+('sales', 3, 4800, '2007-08-01', '2009-03-05', null),
+('develop', 8, 6000, '2006-10-01', '2009-11-17', 'tiger'),
+('develop', 11, 5200, '2007-08-15', null, null);
SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;
@@ -222,9 +224,16 @@ SELECT sum(unique1) over
unique1
FROM tenk1 WHERE unique1 < 10;
+-- test view definitions are preserved
CREATE TEMP VIEW v_window AS
- SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
- FROM generate_series(1, 10) i;
+ SELECT
+ i,
+ sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows,
+ lag(i, 1) IGNORE NULLS OVER (ORDER BY i DESC) AS lagged_by_1,
+ lag(i, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM generate_series(1, 10) i
+ WINDOW w as (ORDER BY i ASC);
+
SELECT * FROM v_window;
@@ -272,5 +281,53 @@ SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
depname
FROM empsalary GROUP BY depname;
+-- test null behaviour: (1) lags
+
+SELECT term_date, lag(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a numeric (date) column
+SELECT term_date, lag(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- a text column
+SELECT respect, lag(respect) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- (2) leads
+
+SELECT term_date, lead(term_date) OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) RESPECT NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lead(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT term_date, first_value(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+SELECT term_date, max(term_date) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+-- test non-deterministic lag (i.e. the lag-by value depends on the data)
+ALTER TABLE empsalary ADD an_offset INTEGER NOT NULL DEFAULT 1;
+
+SELECT term_date, lead(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
+SELECT term_date, lag(term_date, an_offset) IGNORE NULLS OVER (ORDER BY empno) FROM empsalary ORDER BY empno;
+
-- cleanup
DROP TABLE empsalary;
+
+-- some more test cases:
+
+-- (1) leading with an order-by
+CREATE TABLE test_table (
+ id serial,
+ val integer);
+INSERT INTO test_table (val) SELECT * FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]);
+SELECT val, lead(val, 2) IGNORE NULLS OVER (ORDER BY id) FROM test_table ORDER BY id;
+DROP TABLE test_table;
+
+-- (2) two functions in the same window
+SELECT val,
+ lead(val, 2) IGNORE NULLS OVER w AS ignore,
+ lead(val, 2) RESPECT NULLS OVER w AS respect
+FROM unnest(ARRAY[1,2,3,4,NULL, NULL, NULL, 5, 6, 7]) AS val
+WINDOW w as ();
We have this block:
+ {
+ /*
+ * This is the window we want - but we have to tweak the
+ * definition slightly (e.g. to support the IGNORE NULLS frame
+ * option) as we're not using the default (i.e. parent) frame
+ * options.
+ *
+ * We'll create a 'child' (using refname to inherit everything
+ * from the parent) that just overrides the frame options
+ * (assuming it doesn't already exist):
+ */
+ WindowDef *clone = makeNode(WindowDef);
... then it goes to populate the clone. When this is done, we use the
clone to walk the list of existing WindowDefs, and if there's a match we
free this one and use that one. Wouldn't it be better to walk the
existing array first looking for a match, and only create a clone if
none is found? This would avoid the memory leak problems; I originally
pointed out that you're leaking the bits created by this makeNode() call
above, but now that I look at it again, I think you're also leaking the
bits created by the two copyObject() calls to create the clone. It
appears to me that it's simpler to not allocate any memory in the first
place, unless necessary.
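To make the suggestion concrete, here is a minimal standalone sketch of the "look first, allocate only on a miss" pattern. It is not the patch's code: `WinDefSketch`, `find_child` and `find_or_create_child` are hypothetical names, the struct is a heavily simplified stand-in for `WindowDef`, and plain `calloc` is assumed in place of the backend's node allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for a window definition. */
typedef struct WinDefSketch
{
    const char *refname;        /* name of the parent window */
    int         frame_options;  /* bitmask of frame options */
    struct WinDefSketch *next;  /* next definition in the list */
} WinDefSketch;

/* Return an existing child of refname with these options, or NULL. */
WinDefSketch *
find_child(WinDefSketch *head, const char *refname, int frame_options)
{
    for (WinDefSketch *w = head; w != NULL; w = w->next)
    {
        if (w->frame_options == frame_options &&
            strcmp(w->refname, refname) == 0)
            return w;
    }
    return NULL;
}

/*
 * Search first; allocate a new child only when no match exists, so a
 * successful lookup leaves nothing behind to free.  Returns NULL if the
 * allocation fails.
 */
WinDefSketch *
find_or_create_child(WinDefSketch **head, const char *refname,
                     int frame_options)
{
    WinDefSketch *w;

    assert(head != NULL && refname != NULL);

    w = find_child(*head, refname, frame_options);
    if (w != NULL)
        return w;               /* reuse: no allocation happened */

    w = calloc(1, sizeof(WinDefSketch));
    if (w == NULL)
        return NULL;
    w->refname = refname;       /* borrowed pointer, not copied */
    w->frame_options = frame_options;
    w->next = *head;            /* push onto the front of the list */
    *head = w;
    return w;
}
```

Calling `find_or_create_child` twice with the same parent name and options returns the same node the second time, which is the property that removes the need to free a speculative clone.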
Also, in parsenodes.h, you had the [MANDATORY] and such tags. Three
things about that: 1) it looks a lot uglier than the original, so how
about the modified version below? 2) What does "MANDATORY value of
NULL" mean? Maybe you mean "MANDATORY value or NULL" instead? 3)
Exactly what case does the "in this case" phrase refer to? I think the
comment should be more explicit. Also, I think this should be its own
paragraph instead of being mixed with the "For entries in a" paragraph.
/*
* WindowDef - raw representation of WINDOW and OVER clauses
*
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
* implies overriding the window frame clause.
*
* In this case, the per-field indicators determine what the semantics
* are:
* [V]irtual
* If NULL, then the parent's (refname) value is used.
* [M]andatory
* Never inherited from the parent, so must be specified; may be NULL.
* [S]uper
* Always inherited from parent, any local version ignored.
*/
typedef struct WindowDef
{
NodeTag type;
char *name; /* [M] window's own name */
char *refname; /* [M] referenced window name, if any */
List *partitionClause; /* [V] PARTITION BY expression list */
List *orderClause; /* [M] ORDER BY (list of SortBy) */
int frameOptions; /* [M] frame_clause options, see below */
Node *startOffset; /* [M] expression for starting bound, if any */
Node *endOffset; /* [M] expression for ending bound, if any */
int location; /* parse location, or -1 if none/unknown */
} WindowDef;
In gram.y there are some spurious whitespaces at end-of-line. You
should be able to see them with git diff --check. (I don't think we
support running pgindent on .y files, which would have otherwise cleaned
this up.)
A style issue. You have this:
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ leadlag_const_context *context = WinGetPartitionLocalMemory(
+ winobj,
+ sizeof(leadlag_const_context));
+ int count_forward = 0;
I think it'd be better to put the declarations above the comment, and
assignment to "context" below the comment. This way, the indentation of
the assignment is not so odd. So it'd look like
+ leadlag_const_context *context;
+ int count_forward = 0;
+
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ context = WinGetPartitionLocalMemory(winobj,
+ sizeof(leadlag_const_context));
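As an aside, the constant-offset idea that comment describes can be sketched outside the backend. The following is a minimal standalone illustration for LEAD only, under stated assumptions: `NullableInt`, `next_non_null` and `lead_ignore_nulls` are hypothetical names, the partition is a plain array, and the offset is a constant of at least 1. It does not use the window API and is not the patch's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical nullable integer standing in for one partition row. */
typedef struct
{
    bool isnull;
    int  value;
} NullableInt;

/* First index >= from holding a non-null row, or n if there is none. */
size_t
next_non_null(const NullableInt *rows, size_t n, size_t from)
{
    while (from < n && rows[from].isnull)
        from++;
    return from;
}

/*
 * lead(val, offset) IGNORE NULLS over rows[0..n-1], constant offset >= 1.
 * "target" remembers the index of the offset-th non-null row after the
 * current row; it only moves forward, so the whole pass is linear.
 */
void
lead_ignore_nulls(const NullableInt *rows, size_t n, int offset,
                  NullableInt *out)
{
    size_t target = 0;

    assert(offset >= 1);

    for (size_t i = 0; i < n; i++)
    {
        /*
         * First row: skip over <offset> non-null rows.  Later rows: push
         * the stored index forward once, but only if the row we just
         * stepped onto is non-null (it no longer counts as "ahead").
         */
        int steps = (i == 0) ? offset : (rows[i].isnull ? 0 : 1);

        for (; steps > 0 && target < n; steps--)
            target = next_non_null(rows, n, target + 1);

        if (target < n)
            out[i] = rows[target];
        else
        {
            out[i].isnull = true;   /* ran off the end of the partition */
            out[i].value = 0;
        }
    }
}
```

Fed the values 1, 2, 3, 4, NULL, NULL, NULL, 5, 6, 7 with an offset of 2, this yields 3, 4, 5, 6, 6, 6, 6, 7, NULL, NULL, which matches the expected output of the `lead(val, 2) IGNORE NULLS` regression test in the patch.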
And a final style comment. You have a block like this:
if (ignore_nulls && !const_offset)
{
long block;
}
else if (ignore_nulls /* && const_offset */)
{
another long block;
}
else
{
more stuff;
}
I think this looks better like this, even if it causes an extra level of
indentation:
if (ignore_nulls)
{
if (const_offset)
{
some stuff;
}
else
{
more;
}
}
else
{
the third block;
}
Finally, I'm not really sure about the column you added to the
regression tests table. It looks way too artificial; I mean the column
name even states what test is going to use that data (respect=gorilla?
uhm). I'm not sure what's a better option; maybe if you just named the
column "favorite_pet" or something like that, it would appear less
random. Maybe it'd be better to just create your own table for this.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Jul 5, 2013 at 02:36:10PM -0700, Jeff Davis wrote:
On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
I haven't really reviewed the windowing-related code in depth; I
thought Jeff might jump back in for that part of it. Jeff, is that
something you're planning to do?
Yes, getting back into this patch now after a bit of delay.
Jeff, any status on this?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Fri, 2013-11-29 at 16:18 -0500, Bruce Momjian wrote:
On Fri, Jul 5, 2013 at 02:36:10PM -0700, Jeff Davis wrote:
On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
I haven't really reviewed the windowing-related code in depth; I
thought Jeff might jump back in for that part of it. Jeff, is that
something you're planning to do?
Yes, getting back into this patch now after a bit of delay.
Jeff, any status on this?
The last message was a review from Alvaro that hasn't been addressed
yet.
Right now I am looking at the extension templates patch. But this patch
is fairly close, so if Nicholas doesn't get to looking at it, I'll see
what I can do.
Regards,
Jeff Davis
On Fri, Nov 29, 2013 at 01:21:27PM -0800, Jeff Davis wrote:
On Fri, 2013-11-29 at 16:18 -0500, Bruce Momjian wrote:
On Fri, Jul 5, 2013 at 02:36:10PM -0700, Jeff Davis wrote:
On Fri, 2013-06-28 at 13:26 -0400, Robert Haas wrote:
I haven't really reviewed the windowing-related code in depth; I
thought Jeff might jump back in for that part of it. Jeff, is that
something you're planning to do?
Yes, getting back into this patch now after a bit of delay.
Jeff, any status on this?
The last message was a review from Alvaro that hasn't been addressed
yet.
Right now I am looking at the extension templates patch. But this patch
is fairly close, so if Nicholas doesn't get to looking at it, I'll see
what I can do.
Thank you. I see it is looking very active on the commit-fest: :-)
https://commitfest.postgresql.org/action/patch_view?id=1096
Thanks for all the work.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
Thanks for the detailed feedback; I'm sorry it took so long to
incorporate it. I've attached the latest version of the patch, fixing
in particular:
We have this block:
I've re-written this so it only does a single pass through the window
definitions (my patch originally added a second pass), and only does
the clone if required.
In gram.y there are some spurious whitespaces at end-of-line.
Fixed - I didn't know about diff --check, it's very useful!
Also, in parsenodes.h, you had the [MANDATORY] and such tags.
I've re-written the comments (without tags) to make them much easier to
understand. I agree they were ugly!
Exactly what case does the "in this case" phrase refer to?
Clarified in the comments.
A style issue. You have this:
Fixed
And a final style comment.
Fixed
Finally, I'm not really sure about the column you added to the regression tests table.
Indeed, it was a bit artificial. I've re-written the tests to use a
separate table as you suggest.
Thanks -
Nick
Attachments:
lead-lag-ignore-nulls.patch (text/x-patch; charset=UTF-8)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0809a6d..5da852e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -13185,6 +13185,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -13199,7 +13200,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -13212,6 +13215,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
@@ -13226,7 +13230,9 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
- <replaceable class="parameter">default</replaceable> to null
+ <replaceable class="parameter">default</replaceable> to null. If
+ <literal>IGNORE NULLS</> is specified then the function will be evaluated
+ as if the rows containing nulls didn't exist.
</entry>
</row>
@@ -13320,11 +13326,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
- <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
- <function>first_value</>, <function>last_value</>, and
- <function>nth_value</>. This is not implemented in
- <productname>PostgreSQL</productname>: the behavior is always the
- same as the standard's default, namely <literal>RESPECT NULLS</>.
+ <literal>IGNORE NULLS</> option for <function>first_value</>,
+ <function>last_value</>, and <function>nth_value</>. This is not
+ implemented in <productname>PostgreSQL</productname>: the behavior is
+ always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 2fcc630..5cea825 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2431,7 +2431,6 @@ window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
* API exposed to window functions
***********************************************************************/
-
/*
* WinGetPartitionLocalMemory
* Get working memory that lives till end of partition processing
@@ -2467,6 +2466,17 @@ WinGetCurrentPosition(WindowObject winobj)
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+int
+WinGetFrameOptions(WindowObject winobj)
+{
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+}
+
+/*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7b9895d..f11bc66 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -290,6 +290,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+%type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
@@ -552,7 +553,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
+ IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -582,7 +583,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
- RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
+ RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
@@ -11885,19 +11886,28 @@ window_definition:
}
;
-over_clause: OVER window_specification
- { $$ = $2; }
- | OVER ColId
+opt_ignore_nulls:
+ IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
+ | RESPECT NULLS_P { $$ = 0; }
+ | /* EMPTY */ { $$ = 0; }
+ ;
+
+over_clause: opt_ignore_nulls OVER window_specification
+ {
+ $3->frameOptions |= $1;
+ $$ = $3;
+ }
+ | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
- n->name = $2;
+ n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
- n->frameOptions = FRAMEOPTION_DEFAULTS;
+ n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
- n->location = @2;
+ n->location = @3;
$$ = n;
}
| /*EMPTY*/
@@ -12891,6 +12901,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
@@ -12980,6 +12991,7 @@ unreserved_keyword:
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 272d27f..6edfc4d 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -694,28 +694,82 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin = NULL;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
- windef->orderClause == NIL &&
- windef->frameOptions == FRAMEOPTION_DEFAULTS);
+ windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
- WindowDef *refwin = (WindowDef *) lfirst(lc);
-
+ WindowDef *thiswin = (WindowDef *) lfirst(lc);
winref++;
- if (refwin->name && strcmp(refwin->name, windef->name) == 0)
+
+ if (thiswin->name && strcmp(thiswin->name, windef->name) == 0)
+ {
+ /*
+ * "thiswin" is the window we want - but we have to tweak the
+ * definition slightly as some window options (e.g. IGNORE
+ * NULLS) can't be specified in a standalone window definition;
+ * they can only be specified when invoking a window function
+ * over a window definition. However, we don't want to modify
+ * the window def itself (as that'll affect other window
+ * functions that use it), so if we need to make changes to it
+ * we'll clone refwin, change the clone and add the clone to
+ * the list of window definitions in pstate.
+ *
+ * There's one catch: what if a statement has two (or more)
+ * window function calls that reference the same window
+ * definition, and both have IGNORE NULLS? We don't want to
+ * add two modified definitions to pstate, so we'll only break
+ * if thiswin is an exact match - if not we'll keep looking for
+ * a window definition with the same name *and* same frame
+ * options.
+ */
+ wfunc->winref = winref;
+ refwin = thiswin;
+ if(windef->frameOptions == FRAMEOPTION_DEFAULTS)
+ break; /* don't need to clone, so just use this one */
+ }
+
+ if (refwin && /* we need to have found the parent window */
+ thiswin->refname &&
+ strcmp(thiswin->refname, windef->name) == 0 && /* it references the right parent */
+ thiswin->frameOptions == windef->frameOptions)
{
+ /* found a clone window specification that we can re-use */
wfunc->winref = winref;
+ refwin = thiswin;
break;
}
}
- if (lc == NULL) /* didn't find it? */
+
+ if (refwin == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions != refwin->frameOptions /* we can't use the clone */
+ && windef->frameOptions != FRAMEOPTION_DEFAULTS) /* we can't use the parent */
+ {
+ /*
+ * This means we've found the parent, but no clones (if there were
+ * any) had the correct frame options. We'll clone the parent we
+ * found (refwin), set the frame options we want and add the new
+ * clone to pstate:
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->name = NULL;
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
}
else
{
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index cc46084..ce39955 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -726,6 +726,22 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
NameListToString(funcname)),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("RESPECT NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location))); }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index ea7b8c5..add4048 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4916,11 +4916,16 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
- appendStringInfoString(buf, quote_identifier(wc->refname));
+ const char *quoted_refname = quote_identifier(wc->refname);
+
+ refname_len = strlen(quoted_refname);
+ appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
@@ -5002,7 +5007,20 @@ get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
- appendStringInfoChar(buf, ')');
+
+ /*
+ * We'll tidy up the output slightly; if we've got a refname, but haven't
+ * overridden the partition-by, order-by or any of the frame flags
+ * relevant inside the window def's ()s, then we'll be left with
+ * "(<refname>)". We'll trim off the brackets in this case:
+ */
+ if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
+ {
+ memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
+ buf->len -= 1; /* the trailing ")" */
+ }
+ else
+ appendStringInfoChar(buf, ')');
}
/* ----------
@@ -7674,7 +7692,7 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
- appendStringInfoString(buf, ") OVER ");
+ appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
@@ -7682,6 +7700,10 @@ get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index 19f1fde..ccb53ce 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -13,7 +13,9 @@
*/
#include "postgres.h"
+#include "nodes/bitmapset.h"
#include "utils/builtins.h"
+#include "utils/memutils.h"
#include "windowapi.h"
/*
@@ -24,6 +26,18 @@ typedef struct rank_context
int64 rank; /* current rank */
} rank_context;
+
+typedef struct leadlag_const_context
+{
+ int64 next; /* the index of the lead / lagged value */
+} leadlag_const_context;
+
+/*
+ * lead-lag process helpers
+ */
+#define ISNULL_INDEX(i) (2 * (i))
+#define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
/*
* ntile process information
*/
@@ -280,7 +294,8 @@ window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
- * withdefault indicates we have a default third argument.
+ * withdefault indicates we have a default third argument. We'll only
+ * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
@@ -290,8 +305,24 @@ leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
- bool isnull;
- bool isout;
+ bool isnull = false;
+ bool isout = false;
+ bool ignore_nulls;
+
+ /**
+ * A ** pointer as we keep a Bitmapset * in the partition context, and
+ * WinGetPartitionLocalMemory returns a pointer to whatever's in the
+ * context.
+ */
+ Bitmapset **null_values;
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't mark
+ * at all).
+ */
+ ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
@@ -305,21 +336,247 @@ leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
+ if (!forward)
+ {
+ offset = -offset;
+ }
- result = WinGetFuncArgInPartition(winobj, 0,
- (forward ? offset : -offset),
- WINDOW_SEEK_CURRENT,
- const_offset,
- &isnull, &isout);
+ if (ignore_nulls)
+ {
+ if(const_offset)
+ { int count_forward = 0;
+ leadlag_const_context *context;
+
+ /*
+ * We can process a constant offset much more efficiently; initially
+ * we'll scan through the first <offset> non-null rows, and store that
+ * index. On subsequent rows we'll decide whether to push that index
+ * forwards to the next non-null value, or just return it again.
+ */
+ context = WinGetPartitionLocalMemory(winobj, sizeof(leadlag_const_context));
+
+ /*
+ * Set the forward flag based on the direction of traversal - remember
+ * we can have a LEAD or LAG of -1, and that should be equivalent to a
+ * LAG or LEAD of 1 respectively.
+ */
+ forward = offset == 0 ? forward : (offset > 0);
+
+ if (WinGetCurrentPosition(winobj) == 0)
+ if (forward)
+ count_forward = offset;
+ else
+ context->next = offset; /* LAG, so offset is negative */
+ else
+ {
+ /*
+ * LEADs and LAGs are actually pretty similar - the decision of
+ * whether or not to push our offset value forwards depends on the
+ * current row (for LEADs) or the previous row (for LAGs) is NULL
+ * - hence the (forward ? 0 : -1) below.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ forward ? 0 : -1,
+ WINDOW_SEEK_CURRENT,
+ forward,
+ &isnull, &isout);
+ if (!isnull)
+ count_forward = 1;
+ }
+
+ /*
+ * Count forward through the rows, skipping nulls and terminating if
+ * we run off the end of the window.
+ */
+ for (; count_forward > 0 && !isout; --count_forward)
+ {
+ do
+ {
+ /*
+ * Conveniently, calling WinGetFuncArgInPartition with an
+ * absolute index less than zero (correctly) sets isout and
+ * isnull to true
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ ++(context->next),
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ while (isnull && !isout);
+ }
+
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->next,
+ WINDOW_SEEK_HEAD,
+ !forward,
+ &isnull, &isout);
+ }
+ else
+ {
+ int64 scanning,
+ current = WinGetCurrentPosition(winobj);
+ bool scanForward;
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't
+ * exist". This means that we'll scan from the current row an 'offset'
+ * number of non-null rows, and then return that one.
+ *
+ * As the offset isn't constant we need efficient random access to the
+ * partition, as we'll check up to O(partition size) tuples for each
+ * row we're calculating the window function value for.
+ */
+
+ null_values = (Bitmapset **) WinGetPartitionLocalMemory(winobj, sizeof(Bitmapset *));
+
+ if (*null_values == NULL)
+ {
+ MemoryContext oldcxt;
+
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones
+ * we've accessed (more specifically, if they're null or not).
+ * We'll need one bit for whether the value is null and one bit
+ * for whether we've checked that tuple or not. We'll keep these
+ * two bits together (as opposed to having two separate bitmaps)
+ * to improve cache locality.
+ *
+ * However, we'd lose the efficiency gains if we keep having to
+ * resize the Bitmapset (by setting higher and higher bits). We
+ * know the maximum number of bits we'll ever need, so we'll use
+ * bms_make_singleton to force our Bitmapset up to the required
+ * size.
+ */
+ int64 bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ oldcxt = MemoryContextSwitchTo(GetMemoryChunkContext(null_values));
+ *null_values = bms_make_singleton(bits_needed + 1);
+ MemoryContextSwitchTo(oldcxt);
+ }
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be
+ * in the opposite direction to the way we're scanning. We'll then
+ * force offset to be positive to make counting down the rows easier.
+ */
+ scanForward = offset == 0 ? forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; scanForward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the
+ * default value, but not whatever's left in result. We'll use
+ * the isnull flag to say "ignore it"!
+ */
+ isnull = true;
+ result = (Datum) 0;
+
+ break;
+ }
+
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), *null_values))
+ {
+ isnull = bms_is_member(ISNULL_INDEX(scanning), *null_values);
+ }
+ else
+ {
+ /*
+ * first time we've accessed this index; let's see if it's
+ * null:
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ if (isout)
+ break;
+
+ bms_add_member(*null_values, HAVESCANNED_INDEX(scanning));
+ if (isnull)
+ {
+ bms_add_member(*null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a
+ * chance that we may stop iterating here:
+ */
+ if (!isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &isnull, &isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to
+ * the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*--------
+ * A slight edge case. Consider:
+ *
+ * =================
+ * A | lag(A, 1)
+ * =================
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * =================
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ *--------
+ */
+ --offset;
+ }
+ }
+ }
+ }
+ else
+ {
+ /*
+ * We don't care about nulls; just get the row at the required offset.
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ &isnull, &isout);
+ }
if (isout)
{
/*
- * target row is out of the partition; supply default value if
- * provided. otherwise it'll stay NULL
+ * Target row is out of the partition; supply default value if
+ * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * Don't return whatever's lying around in result, force the
+ * output to null if there's no default.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 18d4991..8b18db4 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -406,19 +406,38 @@ typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
- * implies overriding the window frame clause.
+ * implies overriding the window frame clause. The semantics of each override
+ * depends on the field.
*/
typedef struct WindowDef
{
NodeTag type;
- char *name; /* window's own name */
- char *refname; /* referenced window name, if any */
- List *partitionClause; /* PARTITION BY expression list */
- List *orderClause; /* ORDER BY (list of SortBy) */
- int frameOptions; /* frame_clause options, see below */
- Node *startOffset; /* expression for starting bound, if any */
- Node *endOffset; /* expression for ending bound, if any */
- int location; /* parse location, or -1 if none/unknown */
+ /* Window's own name. This must be NULL for overrides. */
+ char *name;
+ /* Referenced window name, if any. This must be present on overrides. */
+ char *refname;
+ /*
+ * PARTITION BY expression list. If an override leaves this NULL, the
+ * parent's partitionClause will be used.
+ */
+ List *partitionClause;
+ /*
+ * ORDER BY (list of SortBy). This field is ignored in overrides - the
+ * parent's value will always be used.
+ */
+ List *orderClause;
+ /*
+ * The remaining fields in this struct must be specified on overrides,
+ * even if the override's value is the same as the parent's.
+ */
+ /* frame_clause options, see below */
+ int frameOptions;
+ /* Expression for starting bound, if any */
+ Node *startOffset;
+ /* expression for ending bound, if any */
+ Node *endOffset;
+ /* parse location, or -1 if none/unknown */
+ int location;
} WindowDef;
/*
@@ -443,6 +462,7 @@ typedef struct WindowDef
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+#define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 61fae22..c11c65a 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -180,6 +180,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
@@ -314,6 +315,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index 8557464..1d676e8 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -46,6 +46,8 @@ extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
diff --git a/src/test/regress/expected/window.out b/src/test/regress/expected/window.out
index c2cc742..6b84dc1 100644
--- a/src/test/regress/expected/window.out
+++ b/src/test/regress/expected/window.out
@@ -1781,3 +1781,216 @@ SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
5 | t | t | t
(5 rows)
+-- check we haven't reserved words that might break backwards-compatibility:
+CREATE TABLE reserved (
+ ignore text,
+ respect text,
+ nulls text
+);
+DROP TABLE reserved;
+-- testing ignore nulls functionality
+CREATE TEMPORARY TABLE dogs (
+ name text,
+ breed text,
+ age smallint
+);
+INSERT INTO dogs VALUES
+('K-9', 'robot', NULL),
+('alfred', NULL, 8),
+('bones', 'shar pei', NULL),
+('churchill', 'bulldog', NULL),
+('lassie', NULL, 4),
+('mickey', 'poodle', 7),
+('molly', 'poodle', NULL),
+('rover', 'shar pei', 3);
+-- test view definitions are preserved
+CREATE TEMP VIEW v_dogs AS
+ SELECT
+ name,
+ sum(age) OVER (order by age rows between 1 preceding and 1 following) as sum_rows,
+ lag(age, 1) IGNORE NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(age, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM dogs
+ WINDOW w as (ORDER BY name ASC);
+SELECT pg_get_viewdef('v_dogs');
+ pg_get_viewdef
+--------------------------------------------------------------------------------------------------
+ SELECT dogs.name, +
+ sum(dogs.age) OVER (ORDER BY dogs.age ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows,+
+ lag(dogs.age, 1) IGNORE NULLS OVER (ORDER BY dogs.name DESC) AS lagged_by_1, +
+ lag(dogs.age, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM dogs +
+ WINDOW w AS (ORDER BY dogs.name);
+(1 row)
+
+-- (1) lags by constant
+SELECT name, lag(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+-----------+-----
+ K-9 |
+ alfred |
+ bones | 8
+ churchill |
+ lassie |
+ mickey | 4
+ molly | 7
+ rover |
+(8 rows)
+
+SELECT name, lag(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+-----------+-----
+ K-9 |
+ alfred |
+ bones | 8
+ churchill |
+ lassie |
+ mickey | 4
+ molly | 7
+ rover |
+(8 rows)
+
+SELECT name, lag(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+-----------+-----
+ K-9 |
+ alfred |
+ bones | 8
+ churchill | 8
+ lassie | 8
+ mickey | 4
+ molly | 7
+ rover | 7
+(8 rows)
+
+-- (2) leads by constant
+SELECT name, lead(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+-----------+------
+ K-9 | 8
+ alfred |
+ bones |
+ churchill | 4
+ lassie | 7
+ mickey |
+ molly | 3
+ rover |
+(8 rows)
+
+SELECT name, lead(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+-----------+------
+ K-9 | 8
+ alfred |
+ bones |
+ churchill | 4
+ lassie | 7
+ mickey |
+ molly | 3
+ rover |
+(8 rows)
+
+SELECT name, lead(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+-----------+------
+ K-9 | 8
+ alfred | 4
+ bones | 4
+ churchill | 4
+ lassie | 7
+ mickey | 3
+ molly | 3
+ rover |
+(8 rows)
+
+-- (3) lags by expression
+SELECT name, lag(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+-----------+-----
+ K-9 |
+ alfred |
+ bones | 16
+ churchill |
+ lassie |
+ mickey | 8
+ molly | 14
+ rover |
+(8 rows)
+
+SELECT name, lag(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+-----------+-----
+ K-9 |
+ alfred |
+ bones | 16
+ churchill |
+ lassie |
+ mickey | 8
+ molly | 14
+ rover |
+(8 rows)
+
+SELECT name, lag(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+-----------+-----
+ K-9 |
+ alfred |
+ bones | 16
+ churchill | 16
+ lassie | 16
+ mickey | 8
+ molly | 14
+ rover | 14
+(8 rows)
+
+-- (4) leads by expression
+SELECT name, lead(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+-----------+------
+ K-9 | 16
+ alfred |
+ bones |
+ churchill | 8
+ lassie | 14
+ mickey |
+ molly | 6
+ rover |
+(8 rows)
+
+SELECT name, lead(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+-----------+------
+ K-9 | 16
+ alfred |
+ bones |
+ churchill | 8
+ lassie | 14
+ mickey |
+ molly | 6
+ rover |
+(8 rows)
+
+SELECT name, lead(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+-----------+------
+ K-9 | 16
+ alfred | 8
+ bones | 8
+ churchill | 8
+ lassie | 14
+ mickey | 6
+ molly | 6
+ rover |
+(8 rows)
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY na...
+ ^
+SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+LINE 1: SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM...
+ ^
+DROP TABLE dogs CASCADE;
+NOTICE: drop cascades to view v_dogs
diff --git a/src/test/regress/sql/window.sql b/src/test/regress/sql/window.sql
index 31c98eb..13c0a7b 100644
--- a/src/test/regress/sql/window.sql
+++ b/src/test/regress/sql/window.sql
@@ -621,3 +621,66 @@ SELECT to_char(SUM(n::float8) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FO
SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
FROM (VALUES (1,true), (2,true), (3,false), (4,false), (5,true)) v(i,b)
WINDOW w AS (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING);
+
+-- check we haven't reserved words that might break backwards-compatibility:
+CREATE TABLE reserved (
+ ignore text,
+ respect text,
+ nulls text
+);
+DROP TABLE reserved;
+
+-- testing ignore nulls functionality
+
+CREATE TEMPORARY TABLE dogs (
+ name text,
+ breed text,
+ age smallint
+);
+
+INSERT INTO dogs VALUES
+('K-9', 'robot', NULL),
+('alfred', NULL, 8),
+('bones', 'shar pei', NULL),
+('churchill', 'bulldog', NULL),
+('lassie', NULL, 4),
+('mickey', 'poodle', 7),
+('molly', 'poodle', NULL),
+('rover', 'shar pei', 3);
+
+-- test view definitions are preserved
+CREATE TEMP VIEW v_dogs AS
+ SELECT
+ name,
+ sum(age) OVER (order by age rows between 1 preceding and 1 following) as sum_rows,
+ lag(age, 1) IGNORE NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(age, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM dogs
+ WINDOW w as (ORDER BY name ASC);
+SELECT pg_get_viewdef('v_dogs');
+
+-- (1) lags by constant
+SELECT name, lag(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lag(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lag(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+-- (2) leads by constant
+SELECT name, lead(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lead(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lead(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+-- (3) lags by expression
+SELECT name, lag(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lag(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lag(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+-- (4) leads by expression
+SELECT name, lead(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lead(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, lead(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+-- these should be errors as the functionality isn't implemented yet:
+SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+DROP TABLE dogs CASCADE;
\ No newline at end of file
Hi.
What's the status of this patch? Jeff, Álvaro, you're listed as
reviewers. Have you had a chance to look at the updated version
that Nick posted?
-- Abhijit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Abhijit -
What's the status of this patch?
The latest version of the patch needs a review, and I'd like to get it
committed in this CF if possible. Thanks -
Nick
On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
Thanks for the detailed feedback, I'm sorry it took so long to
incorporate it. I've attached the latest version of the patch, fixing
in particular:
I took a good look at this today.
* It fails for offset of 0 with IGNORE NULLS. Fixed (trivial).
* The tests are locale-sensitive. Fixed (trivial).
* The leadlag_common function is just way too long. I refactored the
IGNORE NULLS code into its own function (win_get_arg_ignore_nulls())
with the same API as WinGetFuncArgInPartition. This cleans things up
substantially, and makes it easier to add useful comments.
* "We're implementing the former semantics, so we'll need to correct
slightly" sounds arbitrary, but it's mandated by the standard. That
should be clarified.
* I did a lot of other refactoring within win_get_arg_ignore_nulls for
the constant case. I'm not done yet, and I'm not 100% sure it's a net
gain, because the code ended up a little longer. But the previous
version was quite hard to follow because of so many special cases around
positive versus negative offsets. For instance, having the negative
'next' value in your code actually means something quite different than
when it's positive, but it took me a while to figure that out, so I made
it into two variables. I hope my code is moving it in a direction that's
easier for others to understand.
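
For readers following along, the behaviour being refactored can be sketched outside the executor. This is only an illustration, not the code in either patch: NullableInt and get_arg_ignore_nulls are hypothetical names, the partition is a plain array rather than a tuplestore, and returning the current row unchanged for an offset of 0 is an assumption. The signature loosely mirrors the isnull/isout out-parameters of WinGetFuncArgInPartition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one row's argument value in a partition. */
typedef struct
{
	int		value;
	bool	isnull;
} NullableInt;

/*
 * Illustrative sketch: fetch the value that is 'offset' non-null rows away
 * from row 'current'. A negative offset scans backward (lag), a positive
 * one scans forward (lead). Null rows are stepped over without being
 * counted. If the scan leaves the partition, *isout and *isnull are set and
 * 0 is returned, so the caller can substitute a default.
 */
int
get_arg_ignore_nulls(const NullableInt *rows, size_t nrows, size_t current,
					 int offset, bool *isnull, bool *isout)
{
	long	pos = (long) current;
	int		step = (offset < 0) ? -1 : 1;
	long	remaining = (offset < 0) ? -(long) offset : (long) offset;

	*isout = false;
	while (remaining > 0)
	{
		pos += step;
		if (pos < 0 || pos >= (long) nrows)
		{
			/* ran off the partition before finding enough non-null rows */
			*isout = true;
			*isnull = true;
			return 0;
		}
		if (!rows[pos].isnull)
			remaining--;		/* only non-null rows count toward offset */
	}

	/* offset 0 (assumption): report the current row as it is */
	*isnull = rows[pos].isnull;
	return rows[pos].value;
}
```

With the ages from the regression test ordered by name (NULL, 8, NULL, NULL, 4, 7, NULL, 3), calling this with offset -1 for "churchill" (index 3) steps over "bones" and returns 8, which matches the expected lag(age) IGNORE NULLS output in the test file.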
Please let me know if you think I am making things worse with my
refactorings. Otherwise I'll keep working on this and hopefully get it
committable soon.
The attached patch is still a WIP; just posting it here in case you see
any red flags.
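
For anyone skimming the non-constant-offset path: it remembers, with two bits per row, whether a row has been fetched and whether its argument was null, so each tuple is looked at only once. A rough standalone sketch of that idea follows. It uses a plain byte array rather than a Bitmapset, and NullCache, FetchIsNull, ArraySource and the function names are made up for illustration; only the two index macros mirror the ones in the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Same bit layout as ISNULL_INDEX / HAVESCANNED_INDEX in the patch. */
#define ISNULL_BIT(i)      (2 * (i))
#define HAVESCANNED_BIT(i) (2 * (i) + 1)

/* Hypothetical cache: two bits per row, allocated once up front. */
typedef struct
{
    uint8_t *bits;
    size_t   nrows;
} NullCache;

/* Hypothetical callback standing in for an expensive tuple fetch. */
typedef bool (*FetchIsNull) (size_t row, void *arg);

bool
null_cache_init(NullCache *c, size_t nrows)
{
    c->nrows = nrows;
    /* +1 so the allocation is never zero bytes */
    c->bits = calloc((2 * nrows + 7) / 8 + 1, 1);
    return c->bits != NULL;
}

static bool
get_bit(const NullCache *c, size_t b)
{
    return (c->bits[b / 8] >> (b % 8)) & 1u;
}

static void
set_bit(NullCache *c, size_t b)
{
    c->bits[b / 8] |= (uint8_t) (1u << (b % 8));
}

/* Returns whether 'row' is null, calling fetch() at most once per row. */
bool
row_is_null(NullCache *c, size_t row, FetchIsNull fetch, void *arg)
{
    if (!get_bit(c, HAVESCANNED_BIT(row)))
    {
        set_bit(c, HAVESCANNED_BIT(row));
        if (fetch(row, arg))
            set_bit(c, ISNULL_BIT(row));
    }
    return get_bit(c, ISNULL_BIT(row));
}

/* Example source: an array of null flags plus a fetch counter. */
typedef struct
{
    const bool *isnull;
    size_t      fetches;
} ArraySource;

bool
array_fetch(size_t row, void *arg)
{
    ArraySource *src = arg;

    src->fetches++;
    return src->isnull[row];
}
```

Keeping both bits for a row next to each other is the same locality argument made in the patch comment; sizing the array once from the partition row count avoids repeated reallocation.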
Regards,
Jeff Davis
Attachments:
lead_lag_jeff_20140706.patch (text/x-patch; charset=UTF-8)
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 13164,13169 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13164,13170 ----
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
***************
*** 13178,13184 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null
</entry>
</row>
--- 13179,13187 ----
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null. If
! <literal>IGNORE NULLS</> is specified then the function will be evaluated
! as if the rows containing nulls didn't exist.
</entry>
</row>
***************
*** 13191,13196 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13194,13200 ----
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
***************
*** 13205,13211 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null
</entry>
</row>
--- 13209,13217 ----
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null. If
! <literal>IGNORE NULLS</> is specified then the function will be evaluated
! as if the rows containing nulls didn't exist.
</entry>
</row>
***************
*** 13299,13309 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
! <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
! <function>first_value</>, <function>last_value</>, and
! <function>nth_value</>. This is not implemented in
! <productname>PostgreSQL</productname>: the behavior is always the
! same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
--- 13305,13314 ----
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
! <literal>IGNORE NULLS</> option for <function>first_value</>,
! <function>last_value</>, and <function>nth_value</>. This is not
! implemented in <productname>PostgreSQL</productname>: the behavior is
! always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
*** a/src/backend/executor/nodeWindowAgg.c
--- b/src/backend/executor/nodeWindowAgg.c
***************
*** 2458,2464 **** window_gettupleslot(WindowObject winobj, int64 pos, TupleTableSlot *slot)
* API exposed to window functions
***********************************************************************/
-
/*
* WinGetPartitionLocalMemory
* Get working memory that lives till end of partition processing
--- 2458,2463 ----
***************
*** 2494,2499 **** WinGetCurrentPosition(WindowObject winobj)
--- 2493,2509 ----
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+ int
+ WinGetFrameOptions(WindowObject winobj)
+ {
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+ }
+
+ /*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 293,298 **** static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
--- 293,299 ----
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+ %type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
***************
*** 556,562 **** static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
! IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
--- 557,563 ----
HANDLER HAVING HEADER_P HOLD HOUR_P
! IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
***************
*** 586,592 **** static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
! RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
--- 587,593 ----
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
! RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
***************
*** 11900,11918 **** window_definition:
}
;
! over_clause: OVER window_specification
! { $$ = $2; }
! | OVER ColId
{
WindowDef *n = makeNode(WindowDef);
! n->name = $2;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
! n->frameOptions = FRAMEOPTION_DEFAULTS;
n->startOffset = NULL;
n->endOffset = NULL;
! n->location = @2;
$$ = n;
}
| /*EMPTY*/
--- 11901,11928 ----
}
;
! opt_ignore_nulls:
! IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
! | RESPECT NULLS_P { $$ = 0; }
! | /* EMPTY */ { $$ = 0; }
! ;
!
! over_clause: opt_ignore_nulls OVER window_specification
! {
! $3->frameOptions |= $1;
! $$ = $3;
! }
! | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
! n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
! n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
! n->location = @3;
$$ = n;
}
| /*EMPTY*/
***************
*** 12906,12911 **** unreserved_keyword:
--- 12916,12922 ----
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
***************
*** 12993,12998 **** unreserved_keyword:
--- 13004,13010 ----
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
*** a/src/backend/parser/parse_agg.c
--- b/src/backend/parser/parse_agg.c
***************
*** 694,721 **** transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
! windef->orderClause == NIL &&
! windef->frameOptions == FRAMEOPTION_DEFAULTS);
foreach(lc, pstate->p_windowdefs)
{
! WindowDef *refwin = (WindowDef *) lfirst(lc);
!
winref++;
! if (refwin->name && strcmp(refwin->name, windef->name) == 0)
{
wfunc->winref = winref;
break;
}
}
! if (lc == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
}
else
{
--- 694,775 ----
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin = NULL;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
! windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
! WindowDef *thiswin = (WindowDef *) lfirst(lc);
winref++;
!
! if (thiswin->name && strcmp(thiswin->name, windef->name) == 0)
! {
! /*
! * "thiswin" is the window we want - but we have to tweak the
! * definition slightly as some window options (e.g. IGNORE
! * NULLS) can't be specified in a standalone window definition;
! * they can only be specified when invoking a window function
! * over a window definition. However, we don't want to modify
! * the window def itself (as that'll affect other window
! * functions that use it), so if we need to make changes to it
! * we'll clone refwin, change the clone and add the clone to
! * the list of window definitions in pstate.
! *
! * There's one catch: what if a statement has two (or more)
! * window function calls that reference the same window
! * definition, and both have IGNORE NULLs? We don't want to
! * add two modified definitions to pstate, so we'll only break
! * if thiswin is an exact match - if not we'll keep looking for
! * a window definition with the same name *and* same frame
! * options.
! */
! wfunc->winref = winref;
! refwin = thiswin;
! if(windef->frameOptions == FRAMEOPTION_DEFAULTS)
! break; /* don't need to clone, so just use this one */
! }
!
! if (refwin && /* we need to have found the parent window */
! thiswin->refname &&
! strcmp(thiswin->refname, windef->name) == 0 && /* it references the right parent */
! thiswin->frameOptions == windef->frameOptions)
{
+ /* found a clone window specification that we can re-use */
wfunc->winref = winref;
+ refwin = thiswin;
break;
}
}
!
! if (refwin == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions != refwin->frameOptions /* we can't use the clone */
+ && windef->frameOptions != FRAMEOPTION_DEFAULTS) /* we can't use the parent */
+ {
+ /*
+ * This means we've found the parent, but no clones (if there were
+ * any) had the correct frame options. We'll clone the parent we
+ * found (refwin), set the frame options we want and add the new
+ * clone to pstate:
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->name = NULL;
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
}
else
{
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
***************
*** 726,731 **** ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
--- 726,747 ----
NameListToString(funcname)),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("IGNORE NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location))); }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
***************
*** 4982,4992 **** get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
! appendStringInfoString(buf, quote_identifier(wc->refname));
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
--- 4982,4997 ----
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
! const char *quoted_refname = quote_identifier(wc->refname);
!
! refname_len = strlen(quoted_refname);
! appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
***************
*** 5068,5074 **** get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
! appendStringInfoChar(buf, ')');
}
/* ----------
--- 5073,5092 ----
/* we will now have a trailing space; remove it */
buf->len--;
}
!
! /*
! * We'll tidy up the output slightly; if we've got a refname, but haven't
! * overridden the partition-by, order-by or any of the frame flags
! * relevant inside the window def's ()s, then we'll be left with
! * "(<refname>)". We'll trim off the brackets in this case:
! */
! if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
! {
! memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
! buf->len -= 1; /* the trailing ")" */
! }
! else
! appendStringInfoChar(buf, ')');
}
/* ----------
***************
*** 7860,7866 **** get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
! appendStringInfoString(buf, ") OVER ");
foreach(l, context->windowClause)
{
--- 7878,7884 ----
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
! appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
***************
*** 7868,7873 **** get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
--- 7886,7895 ----
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
*** a/src/backend/utils/adt/windowfuncs.c
--- b/src/backend/utils/adt/windowfuncs.c
***************
*** 13,19 ****
--- 13,21 ----
*/
#include "postgres.h"
+ #include "nodes/bitmapset.h"
#include "utils/builtins.h"
+ #include "utils/memutils.h"
#include "windowapi.h"
/*
***************
*** 24,29 **** typedef struct rank_context
--- 26,47 ----
int64 rank; /* current rank */
} rank_context;
+
+ typedef struct leadlag_const_context
+ {
+ /* the index of the lead / lagged value */
+ int64 next;
+
+ /* how many non-NULL tuples before we can start emitting? */
+ int64 lag_wait;
+ } leadlag_const_context;
+
+ /*
+ * lead-lag process helpers
+ */
+ #define ISNULL_INDEX(i) (2 * (i))
+ #define HAVESCANNED_INDEX(i) ((2 * (i)) + 1)
+
/*
* ntile process information
*/
***************
*** 38,43 **** typedef struct
--- 56,65 ----
static bool rank_up(WindowObject winobj);
static Datum leadlag_common(FunctionCallInfo fcinfo,
bool forward, bool withoffset, bool withdefault);
+ static Datum win_get_arg_ignore_nulls(WindowObject winobj, int argno,
+ int offset, int seektype,
+ bool const_offset, bool *isnull,
+ bool *isout);
/*
***************
*** 280,286 **** window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
! * withdefault indicates we have a default third argument.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
--- 302,309 ----
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
! * withdefault indicates we have a default third argument. We'll only
! * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
***************
*** 290,303 **** leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
! bool isnull;
! bool isout;
if (withoffset)
{
offset = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
if (isnull)
PG_RETURN_NULL();
const_offset = get_fn_expr_arg_stable(fcinfo->flinfo, 1);
}
else
--- 313,337 ----
int32 offset;
bool const_offset;
Datum result;
! bool isnull = false;
! bool isout = false;
! bool ignore_nulls;
!
! /* is IGNORE NULLS specified? */
! ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
offset = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
if (isnull)
PG_RETURN_NULL();
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
const_offset = get_fn_expr_arg_stable(fcinfo->flinfo, 1);
}
else
***************
*** 305,325 **** leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
! result = WinGetFuncArgInPartition(winobj, 0,
! (forward ? offset : -offset),
! WINDOW_SEEK_CURRENT,
! const_offset,
! &isnull, &isout);
if (isout)
{
/*
! * target row is out of the partition; supply default value if
! * provided. otherwise it'll stay NULL
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
if (isnull)
--- 339,382 ----
offset = 1;
const_offset = true;
}
+ if (!forward)
+ {
+ offset = -offset;
+ }
! if (ignore_nulls)
! {
! result = win_get_arg_ignore_nulls(winobj, 0,
! offset,
! WINDOW_SEEK_CURRENT,
! const_offset,
! &isnull, &isout);
! }
! else
! {
! result = WinGetFuncArgInPartition(winobj, 0,
! offset,
! WINDOW_SEEK_CURRENT,
! const_offset,
! &isnull, &isout);
! }
if (isout)
{
/*
! * Target row is out of the partition; supply default value if
! * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * Don't return whatever's lying around in result, force the
+ * output to null if there's no default.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
***************
*** 329,334 **** leadlag_common(FunctionCallInfo fcinfo,
--- 386,714 ----
}
/*
+ * win_get_arg_ignore_nulls
+ *
+ * Like WinGetFuncArgInPartition, but skips over any rows where the argument
+ * evaluates to NULL as though they did not exist. Offset is relative to the
+ * current row (positive or negative). If offset is zero, *isout is set false,
+ * and the argument at the current row is returned, even if it is NULL.
+ *
+ * Some argument names differ from WinGetFuncArgInPartition, but the meanings
+ * are the same.
+ */
+ static Datum
+ win_get_arg_ignore_nulls(WindowObject winobj, int argno,
+ int offset, int seektype, bool const_offset,
+ bool *result_isnull, bool *result_isout)
+ {
+ /*
+ * A ** pointer as we keep a Bitmapset * in the partition context, and
+ * WinGetPartitionLocalMemory returns a pointer to whatever's in the
+ * context.
+ */
+ Bitmapset **null_values;
+ bool local_isout = false;
+ bool local_isnull = false;
+ Datum result;
+
+ /*
+ * Special case: when applying the offset, we're ignoring all rows where
+ * the argument evaluates to NULL. But if the offset is zero, the standard
+ * requires that we evaluate the argument at the current row even if it is
+ * NULL.
+ */
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ result_isnull, result_isout);
+ Assert (*result_isout == false);
+ return result;
+ }
+
+ if (const_offset)
+ {
+ leadlag_const_context *context;
+
+ /*
+ * For positive offset, the mark will always be set to the current
+ * row. We initialize by scanning until we find 'offset' non-NULL
+ * values, and remember that index in a context variable, 'next'. Each
+ * time this function is called, if the current argument value is
+ * NULL, we just return the same value. If the current row is
+ * non-NULL, we scan forward to find one more non-NULL value, update
+ * 'next' to that index, and return the new value.
+ *
+ * For a negative offset, the mark will always be set to the index of
+ * the 'offset'th non-NULL value before the current row if it
+ * exists. In addition to 'next', we also need to track another
+ * context variable 'lag_wait'; otherwise we don't know if we've
+ * actually seen 'offset' non-NULL values or not. The 'lag_wait' is
+ * initialized to 'offset' and counts down with each non-NULL value
+ * encountered, always returning NULL until it reaches zero. When
+ * lag_wait is zero, we save the value at 'next'. If the argument
+ * value at the current row is NULL, we then scan forward until we
+ * find the next non-NULL value, and mark it and update 'next'. The
+ * saved value is returned.
+ */
+
+ context = WinGetPartitionLocalMemory(winobj,
+ sizeof(leadlag_const_context));
+
+ /*
+ * First time through, initialize.
+ */
+ if (WinGetCurrentPosition(winobj) == 0)
+ {
+ /*
+ * If offset is positive, we need to scan through looking for
+ * 'offset' non-NULL values. Do not set the mark.
+ */
+ if (offset >= 0)
+ {
+ int i, j;
+
+ for (i = 0, j = -1; i < offset && !local_isout; i++)
+ {
+ do
+ {
+ j++;
+ result = WinGetFuncArgInPartition(winobj, 0,
+ j,
+ WINDOW_SEEK_HEAD,
+ false,
+ &local_isnull,
+ &local_isout);
+ }
+ while (local_isnull && !local_isout);
+ }
+
+ context->next = j;
+ context->lag_wait = 0;
+ }
+ else
+ {
+ /* 'next' will be incremented before being read */
+ context->next = -1;
+ context->lag_wait = -offset;
+ }
+ }
+
+ /*
+ * Look at the current row to check whether it's NULL or not. If the
+ * offset is positive, use this opportunity to mark the position;
+ * otherwise it will be done later.
+ */
+ WinGetFuncArgInPartition(winobj, 0,
+ 0,
+ WINDOW_SEEK_CURRENT,
+ (offset >= 0),
+ &local_isnull, &local_isout);
+
+ /* offset negative, and haven't seen enough non-NULL values yet */
+ if (context->lag_wait > 0)
+ {
+ if (!local_isnull)
+ context->lag_wait--;
+
+ /* track the current position */
+ context->next++;
+
+ *result_isnull = true;
+ *result_isout = false;
+ return (Datum) 0;
+ }
+
+ /*
+ * For negative offset, we fetch the value from before the scan.
+ */
+ if (offset < 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->next,
+ WINDOW_SEEK_HEAD,
+ false,
+ result_isnull, result_isout);
+ }
+
+ if (!local_isnull)
+ {
+ int i = context->next;
+
+ do
+ {
+ i++;
+ WinGetFuncArgInPartition(winobj, 0,
+ i,
+ WINDOW_SEEK_HEAD,
+ (offset < 0),
+ &local_isnull, &local_isout);
+ }
+ while (local_isnull && !local_isout);
+
+ context->next = i;
+ }
+
+ /*
+ * For positive offset, we fetch the value from after the scan.
+ */
+ if (offset >= 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->next,
+ WINDOW_SEEK_HEAD,
+ false,
+ result_isnull, result_isout);
+ }
+
+ return result;
+ }
+ else
+ {
+ int64 scanning,
+ current = WinGetCurrentPosition(winobj);
+
+ bool local_forward = (offset >= 0);
+
+ /*
+ * This case is a little complicated; we're defining "IGNORE NULLS" as
+ * "run the query, and pretend the rows with nulls in them don't
+ * exist". This means that we'll scan from the current row an 'offset'
+ * number of non-null rows, and then return that one.
+ *
+ * As the offset isn't constant we need efficient random access to the
+ * partition, as we'll check up to O(partition size) tuples for each
+ * row we're calculating the window function value for.
+ */
+
+ null_values = (Bitmapset **) WinGetPartitionLocalMemory(winobj, sizeof(Bitmapset *));
+
+ if (*null_values == NULL)
+ {
+ MemoryContext oldcxt;
+
+ /*
+ * Accessing tuples is expensive, so we'll keep track of the ones
+ * we've accessed (more specifically, if they're null or not).
+ * We'll need one bit for whether the value is null and one bit
+ * for whether we've checked that tuple or not. We'll keep these
+ * two bits together (as opposed to having two separate bitmaps)
+ * to improve cache locality.
+ *
+ * However, we'd lose the efficiency gains if we keep having to
+ * resize the Bitmapset (by setting higher and higher bits). We
+ * know the maximum number of bits we'll ever need, so we'll use
+ * bms_make_singleton to force our Bitmapset up to the required
+ * size.
+ */
+ int64 bits_needed = 2 * WinGetPartitionRowCount(winobj);
+
+ oldcxt = MemoryContextSwitchTo(GetMemoryChunkContext(null_values));
+ *null_values = bms_make_singleton(bits_needed + 1);
+ MemoryContextSwitchTo(oldcxt);
+ }
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be
+ * in the opposite direction to the way we're scanning. We'll then
+ * force offset to be positive to make counting down the rows easier.
+ */
+ local_forward = offset == 0 ? local_forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; local_forward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ local_isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the
+ * default value, but not whatever's left in result. We'll use
+ * the isnull flag to say "ignore it"!
+ */
+ local_isnull = true;
+ result = (Datum) 0;
+
+ break;
+ }
+
+ if (bms_is_member(HAVESCANNED_INDEX(scanning), *null_values))
+ {
+ local_isnull = bms_is_member(ISNULL_INDEX(scanning), *null_values);
+ }
+ else
+ {
+ /*
+ * first time we've accessed this index; let's see if it's
+ * null:
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &local_isnull, &local_isout);
+ if (local_isout)
+ break;
+
+ bms_add_member(*null_values, HAVESCANNED_INDEX(scanning));
+ if (local_isnull)
+ {
+ bms_add_member(*null_values, ISNULL_INDEX(scanning));
+ }
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a
+ * chance that we may stop iterating here:
+ */
+ if (!local_isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &local_isnull, &local_isout);
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to
+ * the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*--------
+ * A slight edge case. Consider:
+ *
+ * =================
+ * A | lag(A, 1)
+ * =================
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * =================
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ *--------
+ */
+ --offset;
+ }
+ }
+ }
+
+ *result_isnull = local_isnull;
+ *result_isout = local_isout;
+ return result;
+ }
+
+ /*
* lag
* returns the value of VE evaluated on a row that is 1
* row before the current row within a partition,
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 420,438 **** typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
! * implies overriding the window frame clause.
*/
typedef struct WindowDef
{
NodeTag type;
! char *name; /* window's own name */
! char *refname; /* referenced window name, if any */
! List *partitionClause; /* PARTITION BY expression list */
! List *orderClause; /* ORDER BY (list of SortBy) */
! int frameOptions; /* frame_clause options, see below */
! Node *startOffset; /* expression for starting bound, if any */
! Node *endOffset; /* expression for ending bound, if any */
! int location; /* parse location, or -1 if none/unknown */
} WindowDef;
/*
--- 420,457 ----
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
! * implies overriding the window frame clause. The semantics of each override
! * depends on the field.
*/
typedef struct WindowDef
{
NodeTag type;
! /* Window's own name. This must be NULL for overrides. */
! char *name;
! /* Referenced window name, if any. This must be present on overrides. */
! char *refname;
! /*
! * PARTITION BY expression list. If an override leaves this NULL, the
! * parent's partitionClause will be used.
! */
! List *partitionClause;
! /*
! * ORDER BY (list of SortBy). This field is ignored in overrides - the
! * parent's value will always be used.
! */
! List *orderClause;
! /*
! * The remaining fields in this struct must be specified on overrides,
! * even if the override's value is the same as the parent's.
! */
! /* frame_clause options, see below */
! int frameOptions;
! /* Expression for starting bound, if any */
! Node *startOffset;
! /* expression for ending bound, if any */
! Node *endOffset;
! /* parse location, or -1 if none/unknown */
! int location;
} WindowDef;
/*
***************
*** 457,462 **** typedef struct WindowDef
--- 476,482 ----
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+ #define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
*** a/src/include/parser/kwlist.h
--- b/src/include/parser/kwlist.h
***************
*** 180,185 **** PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
--- 180,186 ----
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+ PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
***************
*** 312,317 **** PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
--- 313,319 ----
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+ PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
*** a/src/include/windowapi.h
--- b/src/include/windowapi.h
***************
*** 46,51 **** extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
--- 46,53 ----
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+ extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
*** a/src/test/regress/expected/window.out
--- b/src/test/regress/expected/window.out
***************
*** 1822,1824 **** SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
--- 1822,2064 ----
5 | t | t | t
(5 rows)
+ -- check we haven't reserved words that might break backwards-compatibility:
+ CREATE TABLE reserved (
+ ignore text,
+ respect text,
+ nulls text
+ );
+ DROP TABLE reserved;
+ -- testing ignore nulls functionality
+ CREATE TEMPORARY TABLE dogs (
+ name text,
+ breed text,
+ age smallint
+ );
+ INSERT INTO dogs VALUES
+ ('ajax', 'mythological', NULL),
+ ('alfred', NULL, 8),
+ ('bones', 'shar pei', NULL),
+ ('churchill', 'bulldog', NULL),
+ ('lassie', NULL, 4),
+ ('mickey', 'poodle', 7),
+ ('molly', 'poodle', NULL),
+ ('rover', 'shar pei', 3);
+ -- test view definitions are preserved
+ CREATE TEMP VIEW v_dogs AS
+ SELECT
+ name,
+ sum(age) OVER (order by age rows between 1 preceding and 1 following) as sum_rows,
+ lag(age, 1) IGNORE NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(age, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM dogs
+ WINDOW w as (ORDER BY name ASC);
+ SELECT pg_get_viewdef('v_dogs');
+ pg_get_viewdef
+ --------------------------------------------------------------------------------------------------
+ SELECT dogs.name, +
+ sum(dogs.age) OVER (ORDER BY dogs.age ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows,+
+ lag(dogs.age, 1) IGNORE NULLS OVER (ORDER BY dogs.name DESC) AS lagged_by_1, +
+ lag(dogs.age, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM dogs +
+ WINDOW w AS (ORDER BY dogs.name);
+ (1 row)
+
+ -- (1) lags by constant
+ SELECT name, lag(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 8
+ churchill |
+ lassie |
+ mickey | 4
+ molly | 7
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 8
+ churchill |
+ lassie |
+ mickey | 4
+ molly | 7
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 8
+ churchill | 8
+ lassie | 8
+ mickey | 4
+ molly | 7
+ rover | 7
+ (8 rows)
+
+ -- (2) leads by constant
+ SELECT name, lead(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred |
+ bones |
+ churchill | 4
+ lassie | 7
+ mickey |
+ molly | 3
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred |
+ bones |
+ churchill | 4
+ lassie | 7
+ mickey |
+ molly | 3
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred | 4
+ bones | 4
+ churchill | 4
+ lassie | 7
+ mickey | 3
+ molly | 3
+ rover |
+ (8 rows)
+
+ -- (3) lags by expression
+ SELECT name, lag(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 16
+ churchill |
+ lassie |
+ mickey | 8
+ molly | 14
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 16
+ churchill |
+ lassie |
+ mickey | 8
+ molly | 14
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 16
+ churchill | 16
+ lassie | 16
+ mickey | 8
+ molly | 14
+ rover | 14
+ (8 rows)
+
+ -- (4) leads by expression
+ SELECT name, lead(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 16
+ alfred |
+ bones |
+ churchill | 8
+ lassie | 14
+ mickey |
+ molly | 6
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 16
+ alfred |
+ bones |
+ churchill | 8
+ lassie | 14
+ mickey |
+ molly | 6
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 16
+ alfred | 8
+ bones | 8
+ churchill | 8
+ lassie | 14
+ mickey | 6
+ molly | 6
+ rover |
+ (8 rows)
+
+ -- these should be errors as the functionality isn't implemented yet:
+ SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+ LINE 1: SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY na...
+ ^
+ SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ ERROR: RESPECT NULLS is only implemented for the lead and lag window functions
+ LINE 1: SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM...
+ ^
+ -- ensure that a zero offset still returns the current value, even if NULL
+ SELECT name, lead(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax |
+ alfred | 8
+ bones |
+ churchill |
+ lassie | 4
+ mickey | 7
+ molly |
+ rover | 3
+ (8 rows)
+
+ SELECT name, lag(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred | 8
+ bones |
+ churchill |
+ lassie | 4
+ mickey | 7
+ molly |
+ rover | 3
+ (8 rows)
+
+ DROP TABLE dogs CASCADE;
+ NOTICE: drop cascades to view v_dogs
*** a/src/test/regress/sql/window.sql
--- b/src/test/regress/sql/window.sql
***************
*** 641,643 **** SELECT to_char(SUM(n::float8) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FO
--- 641,710 ----
SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
FROM (VALUES (1,true), (2,true), (3,false), (4,false), (5,true)) v(i,b)
WINDOW w AS (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING);
+
+ -- check we haven't reserved words that might break backwards-compatibility:
+ CREATE TABLE reserved (
+ ignore text,
+ respect text,
+ nulls text
+ );
+ DROP TABLE reserved;
+
+ -- testing ignore nulls functionality
+
+ CREATE TEMPORARY TABLE dogs (
+ name text,
+ breed text,
+ age smallint
+ );
+
+ INSERT INTO dogs VALUES
+ ('ajax', 'mythological', NULL),
+ ('alfred', NULL, 8),
+ ('bones', 'shar pei', NULL),
+ ('churchill', 'bulldog', NULL),
+ ('lassie', NULL, 4),
+ ('mickey', 'poodle', 7),
+ ('molly', 'poodle', NULL),
+ ('rover', 'shar pei', 3);
+
+ -- test view definitions are preserved
+ CREATE TEMP VIEW v_dogs AS
+ SELECT
+ name,
+ sum(age) OVER (order by age rows between 1 preceding and 1 following) as sum_rows,
+ lag(age, 1) IGNORE NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(age, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM dogs
+ WINDOW w as (ORDER BY name ASC);
+ SELECT pg_get_viewdef('v_dogs');
+
+ -- (1) lags by constant
+ SELECT name, lag(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- (2) leads by constant
+ SELECT name, lead(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- (3) lags by expression
+ SELECT name, lag(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- (4) leads by expression
+ SELECT name, lead(age * 2) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age * 2) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age * 2) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- these should be errors as the functionality isn't implemented yet:
+ SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- ensure that a zero offset still returns the current value, even if NULL
+ SELECT name, lead(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ DROP TABLE dogs CASCADE;
On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
> On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
> > Thanks for the detailed feedback, I'm sorry it took so long to
> > incorporate it. I've attached the latest version of the patch, fixing
> > in particular:
Looking a little more:
* No tests exercise non-const offsets
* No tests for default clauses with IGNORE NULLS
* The use of bitmapsets is quite ugly. It would be nice if the API would
grow the BMS within the memory context in which it was allocated, but I
don't even see that the BMS is necessary. Why not just allocate a
fixed-size array of bits, and forget the BMS?
* Is there a reason you're leaving out first_value/last_value/nth_value?
I think they could be supported without a lot of extra work.
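The fixed-size bit array suggested above could look roughly like the sketch below. It is only an illustration: the RowBitmap type and the row_bitmap_* helpers are hypothetical names, and it uses calloc so that it compiles on its own, whereas backend code would presumably allocate from the partition's memory context. It keeps two bits per row (one for "already examined", one for "argument was non-NULL"), sized once from the partition row count so nothing ever needs to grow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type: two bits per row, so four rows fit in each byte. */
typedef struct RowBitmap
{
	uint8_t    *bits;
	int64_t		nrows;
} RowBitmap;

/* Allocate a zeroed bitmap, sized once for the whole partition. */
static RowBitmap
row_bitmap_create(int64_t nrows)
{
	RowBitmap	bm;
	size_t		nbytes;

	assert(nrows >= 0);
	nbytes = (size_t) ((nrows + 3) / 4);
	bm.nrows = nrows;
	/* calloc stands in for a zeroing allocator; always ask for >= 1 byte */
	bm.bits = calloc(nbytes > 0 ? nbytes : 1, 1);
	assert(bm.bits != NULL);
	return bm;
}

/* Low bit of a row's pair: has this row been examined before? */
static bool
row_bitmap_seen(const RowBitmap *bm, int64_t row)
{
	assert(row >= 0 && row < bm->nrows);
	return (bm->bits[row / 4] >> ((row % 4) * 2)) & 0x01;
}

/* High bit of a row's pair: did the argument evaluate to non-NULL? */
static bool
row_bitmap_nonnull(const RowBitmap *bm, int64_t row)
{
	assert(row >= 0 && row < bm->nrows);
	return (bm->bits[row / 4] >> ((row % 4) * 2 + 1)) & 0x01;
}

/* Record that a row was examined and whether its argument was non-NULL. */
static void
row_bitmap_mark(RowBitmap *bm, int64_t row, bool nonnull)
{
	assert(row >= 0 && row < bm->nrows);
	bm->bits[row / 4] |= (uint8_t) (0x01 << ((row % 4) * 2));
	if (nonnull)
		bm->bits[row / 4] |= (uint8_t) (0x01 << ((row % 4) * 2 + 1));
}

/* Release the bitmap storage. */
static void
row_bitmap_free(RowBitmap *bm)
{
	free(bm->bits);
	bm->bits = NULL;
	bm->nrows = 0;
}
```

A caller would create the bitmap on the first row of a partition, mark each row the first time its argument is evaluated, and consult row_bitmap_seen before evaluating it again.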
Regards,
Jeff Davis
On Mon, 2014-07-07 at 01:21 -0700, Jeff Davis wrote:
> On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
> > On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
> > > Thanks for the detailed feedback, I'm sorry it took so long to
> > > incorporate it. I've attached the latest version of the patch, fixing
> > > in particular:
As innocent as this patch seemed at first, it actually opens up a lot of
questions.
Attached is the (incomplete) edit of the patch so far.
Changes from your patch:
* changed test to be locale-insensitive
* lots of refactoring in the execution itself
* fix offset 0 case
* many test improvements
* remove bitmapset and just use an array bitmap
* fix error message typo
Open Issues:
I don't think exposing the frame options is a good idea. That's an
internal concept now, but putting it in windowapi.h will mean that it
needs to live forever.
The struct is private, so there's no easy hack to access the frame
options directly. That means that we need to work with the existing API
functions, which is OK because I think that everything we want to do can
go into WinGetFuncArgInPartition(). If we do the same thing for
WinGetFuncArgInFrame(), then first/last/nth also work.
That leaves the questions:
* Do we want IGNORE NULLS to work for every window function, or only a
specified subset?
* If it only works for some window functions, is that hard-coded or
driven by the catalog?
* If it works for all window functions, could it cause some
pre-existing functions to behave strangely?
Also, I'm re-thinking Dean's comments here:
/messages/by-id/CAEZATCWT3=P88nv2ThTjvRDLpOsVtAPxaVPe=MaWe-x=GuhSmg@mail.gmail.com
He brings up a few good points. I will look into the frame vs. window
option, though it looks like you've already at least fixed the crash.
His other point about actually eliminating the NULLs from the window
itself is interesting, but I don't think it works. IGNORE NULLS ignores
*other* rows with NULL, but (per spec) does not ignore the current row.
That sounds awkward if you've already removed the NULL rows from the
window, but maybe there's something that could work.
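To make the "ignores other rows, but not the current row" rule concrete, here is a small reference model. It is a sketch only, not the executor code: ignore_nulls_target is a hypothetical helper that works on a plain array of null flags, walks away from the current row in the direction of the offset, and counts only non-NULL rows. An offset of zero returns the current row unconditionally, matching the behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical reference model of lead/lag with IGNORE NULLS.
 *
 * isnull[i] says whether the argument is NULL at row i of the partition.
 * A negative offset looks backwards (lag), a positive one forwards (lead).
 * Returns true and stores the target row in *target when it exists;
 * returns false when the target falls outside the partition.
 */
static bool
ignore_nulls_target(const bool *isnull, int nrows, int cur, int offset,
					int *target)
{
	int			step = (offset > 0) ? 1 : -1;
	int			remaining = offset;
	int			i;

	assert(cur >= 0 && cur < nrows);

	/* Offset zero is the current row itself, even if it is NULL. */
	if (offset == 0)
	{
		*target = cur;
		return true;
	}

	/* Walk away from the current row, counting only non-NULL rows. */
	for (i = cur + step; i >= 0 && i < nrows; i += step)
	{
		if (isnull[i])
			continue;
		remaining -= step;
		if (remaining == 0)
		{
			*target = i;
			return true;
		}
	}

	return false;				/* ran off the partition */
}
```

Applied to the "dogs" rows from the regression test (ages NULL, 8, NULL, NULL, 4, 7, NULL, 3 in name order), an offset of -1 at "lassie" lands on "alfred" (8), and an offset of +1 at "mickey" lands on "rover" (3), which agrees with the expected output in window.out.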
And there are a few other things I'm still looking into, but hopefully
they don't raise new issues.
Regards,
Jeff Davis
Attachments:
lead_lag_jeff_20140710.patch (text/x-patch; charset=UTF-8)
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 13164,13169 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13164,13170 ----
lag(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
***************
*** 13178,13184 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null
</entry>
</row>
--- 13179,13187 ----
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null. If
! <literal>IGNORE NULLS</> is specified then the function will be evaluated
! as if the rows containing nulls didn't exist.
</entry>
</row>
***************
*** 13191,13196 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
--- 13194,13200 ----
lead(<replaceable class="parameter">value</replaceable> <type>any</>
[, <replaceable class="parameter">offset</replaceable> <type>integer</>
[, <replaceable class="parameter">default</replaceable> <type>any</> ]])
+ [ { RESPECT | IGNORE } NULLS ]
</function>
</entry>
<entry>
***************
*** 13205,13211 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null
</entry>
</row>
--- 13209,13217 ----
<replaceable class="parameter">default</replaceable> are evaluated
with respect to the current row. If omitted,
<replaceable class="parameter">offset</replaceable> defaults to 1 and
! <replaceable class="parameter">default</replaceable> to null. If
! <literal>IGNORE NULLS</> is specified then the function will be evaluated
! as if the rows containing nulls didn't exist.
</entry>
</row>
***************
*** 13299,13309 **** SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
! <literal>IGNORE NULLS</> option for <function>lead</>, <function>lag</>,
! <function>first_value</>, <function>last_value</>, and
! <function>nth_value</>. This is not implemented in
! <productname>PostgreSQL</productname>: the behavior is always the
! same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
--- 13305,13314 ----
<note>
<para>
The SQL standard defines a <literal>RESPECT NULLS</> or
! <literal>IGNORE NULLS</> option for <function>first_value</>,
! <function>last_value</>, and <function>nth_value</>. This is not
! implemented in <productname>PostgreSQL</productname>: the behavior is
! always the same as the standard's default, namely <literal>RESPECT NULLS</>.
Likewise, the standard's <literal>FROM FIRST</> or <literal>FROM LAST</>
option for <function>nth_value</> is not implemented: only the
default <literal>FROM FIRST</> behavior is supported. (You can achieve
*** a/src/backend/executor/nodeWindowAgg.c
--- b/src/backend/executor/nodeWindowAgg.c
***************
*** 2494,2499 **** WinGetCurrentPosition(WindowObject winobj)
--- 2494,2510 ----
}
/*
+ * WinGetFrameOptions
+ * Returns the frame option flags
+ */
+ int
+ WinGetFrameOptions(WindowObject winobj)
+ {
+ Assert(WindowObjectIsValid(winobj));
+ return winobj->winstate->frameOptions;
+ }
+
+ /*
* WinGetPartitionRowCount
* Return total number of rows contained in the current partition.
*
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***************
*** 301,306 **** static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
--- 301,307 ----
%type <list> TriggerEvents TriggerOneEvent
%type <value> TriggerFuncArg
%type <node> TriggerWhen
+ %type <ival> opt_ignore_nulls
%type <list> event_trigger_when_list event_trigger_value_list
%type <defelt> event_trigger_when_item
***************
*** 566,572 **** static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
! IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
--- 567,573 ----
HANDLER HAVING HEADER_P HOLD HOUR_P
! IDENTITY_P IF_P IGNORE ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
***************
*** 596,602 **** static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
! RESET RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
--- 597,603 ----
RANGE READ REAL REASSIGN RECHECK RECURSIVE REF REFERENCES REFRESH REINDEX
RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
! RESET RESPECT RESTART RESTRICT RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK
ROW ROWS RULE
SAVEPOINT SCHEMA SCROLL SEARCH SECOND_P SECURITY SELECT SEQUENCE SEQUENCES
***************
*** 11957,11975 **** window_definition:
}
;
! over_clause: OVER window_specification
! { $$ = $2; }
! | OVER ColId
{
WindowDef *n = makeNode(WindowDef);
! n->name = $2;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
! n->frameOptions = FRAMEOPTION_DEFAULTS;
n->startOffset = NULL;
n->endOffset = NULL;
! n->location = @2;
$$ = n;
}
| /*EMPTY*/
--- 11958,11985 ----
}
;
! opt_ignore_nulls:
! IGNORE NULLS_P { $$ = FRAMEOPTION_IGNORE_NULLS; }
! | RESPECT NULLS_P { $$ = 0; }
! | /* EMPTY */ { $$ = 0; }
! ;
!
! over_clause: opt_ignore_nulls OVER window_specification
! {
! $3->frameOptions |= $1;
! $$ = $3;
! }
! | opt_ignore_nulls OVER ColId
{
WindowDef *n = makeNode(WindowDef);
! n->name = $3;
n->refname = NULL;
n->partitionClause = NIL;
n->orderClause = NIL;
! n->frameOptions = FRAMEOPTION_DEFAULTS | $1;
n->startOffset = NULL;
n->endOffset = NULL;
! n->location = @3;
$$ = n;
}
| /*EMPTY*/
***************
*** 12963,12968 **** unreserved_keyword:
--- 12973,12979 ----
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
***************
*** 13051,13056 **** unreserved_keyword:
--- 13062,13068 ----
| REPLACE
| REPLICA
| RESET
+ | RESPECT
| RESTART
| RESTRICT
| RETURNS
*** a/src/backend/parser/parse_agg.c
--- b/src/backend/parser/parse_agg.c
***************
*** 694,721 **** transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
{
Index winref = 0;
ListCell *lc;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
! windef->orderClause == NIL &&
! windef->frameOptions == FRAMEOPTION_DEFAULTS);
foreach(lc, pstate->p_windowdefs)
{
! WindowDef *refwin = (WindowDef *) lfirst(lc);
!
winref++;
! if (refwin->name && strcmp(refwin->name, windef->name) == 0)
{
wfunc->winref = winref;
break;
}
}
! if (lc == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
}
else
{
--- 694,775 ----
{
Index winref = 0;
ListCell *lc;
+ WindowDef *refwin = NULL;
Assert(windef->refname == NULL &&
windef->partitionClause == NIL &&
! windef->orderClause == NIL);
foreach(lc, pstate->p_windowdefs)
{
! WindowDef *thiswin = (WindowDef *) lfirst(lc);
winref++;
!
! if (thiswin->name && strcmp(thiswin->name, windef->name) == 0)
! {
! /*
! * "thiswin" is the window we want, but we have to tweak the
! * definition slightly: some window options (e.g. IGNORE NULLS)
! * can't be specified in a standalone window definition; they
! * can only be specified when invoking a window function over
! * a window definition. However, we don't want to modify the
! * window def itself, as that would affect other window
! * functions that use it. So if we need to change it, we clone
! * refwin, change the clone and add the clone to the list of
! * window definitions in pstate.
! *
! * There's one catch: what if a statement has two (or more)
! * window function calls that reference the same window
! * definition, and both have IGNORE NULLS? We don't want to
! * add two modified definitions to pstate, so we only break
! * here if no clone is needed; otherwise we keep looking for
! * a clone that references this window *and* has the same
! * frame options.
! */
! wfunc->winref = winref;
! refwin = thiswin;
! if (windef->frameOptions == FRAMEOPTION_DEFAULTS)
! break; /* don't need to clone, so just use this one */
! }
!
! if (refwin && /* we need to have found the parent window */
! thiswin->refname &&
! strcmp(thiswin->refname, windef->name) == 0 && /* it references the right parent */
! thiswin->frameOptions == windef->frameOptions)
{
+ /* found a clone window specification that we can re-use */
wfunc->winref = winref;
+ refwin = thiswin;
break;
}
}
!
! if (refwin == NULL) /* didn't find it? */
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("window \"%s\" does not exist", windef->name),
parser_errposition(pstate, windef->location)));
+ else if (windef->frameOptions != refwin->frameOptions /* we can't use the clone */
+ && windef->frameOptions != FRAMEOPTION_DEFAULTS) /* we can't use the parent */
+ {
+ /*
+ * This means we've found the parent, but no clones (if there were
+ * any) had the correct frame options. We'll clone the parent we
+ * found (refwin), set the frame options we want and add the new
+ * clone to pstate:
+ */
+ WindowDef *clone = makeNode(WindowDef);
+
+ clone->name = NULL;
+ clone->refname = pstrdup(refwin->name);
+ clone->frameOptions = windef->frameOptions; /* Note windef! */
+ clone->startOffset = copyObject(refwin->startOffset);
+ clone->endOffset = copyObject(refwin->endOffset);
+ clone->location = refwin->location;
+
+ pstate->p_windowdefs = lappend(pstate->p_windowdefs, clone);
+ wfunc->winref = list_length(pstate->p_windowdefs);
+ }
}
else
{
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
***************
*** 726,731 **** ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
--- 726,747 ----
NameListToString(funcname)),
parser_errposition(pstate, location)));
+ if (over->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ {
+ /*
+ * As this is only implemented for the lead & lag window functions
+ * we'll filter out all aggregate functions too.
+ */
+ if (fdresult != FUNCDETAIL_WINDOWFUNC
+ || (strcmp("lead", strVal(llast(funcname))) != 0 &&
+ strcmp("lag", strVal(llast(funcname))) != 0))
+ {
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("IGNORE NULLS is only implemented for the lead and lag window functions"),
+ parser_errposition(pstate, location))); }
+ }
+
/*
* ordered aggs not allowed in windows yet
*/
*** a/src/backend/utils/adt/ruleutils.c
--- b/src/backend/utils/adt/ruleutils.c
***************
*** 4982,4992 **** get_rule_windowspec(WindowClause *wc, List *targetList,
bool needspace = false;
const char *sep;
ListCell *l;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
! appendStringInfoString(buf, quote_identifier(wc->refname));
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
--- 4982,4997 ----
bool needspace = false;
const char *sep;
ListCell *l;
+ size_t refname_len = 0;
+ int initial_buf_len = buf->len;
appendStringInfoChar(buf, '(');
if (wc->refname)
{
! const char *quoted_refname = quote_identifier(wc->refname);
!
! refname_len = strlen(quoted_refname);
! appendStringInfoString(buf, quoted_refname);
needspace = true;
}
/* partition clauses are always inherited, so only print if no refname */
***************
*** 5068,5074 **** get_rule_windowspec(WindowClause *wc, List *targetList,
/* we will now have a trailing space; remove it */
buf->len--;
}
! appendStringInfoChar(buf, ')');
}
/* ----------
--- 5073,5092 ----
/* we will now have a trailing space; remove it */
buf->len--;
}
!
! /*
! * We'll tidy up the output slightly; if we've got a refname, but haven't
! * overridden the partition-by, order-by or any of the frame flags
! * relevant inside the window def's ()s, then we'll be left with
! * "(<refname>)". We'll trim off the brackets in this case:
! */
! if (wc->refname && buf->len == initial_buf_len + refname_len + 1)
! {
! memcpy(buf->data + initial_buf_len, buf->data + initial_buf_len + 1, refname_len);
! buf->len -= 1; /* the trailing ")" */
! }
! else
! appendStringInfoChar(buf, ')');
}
/* ----------
***************
*** 7860,7866 **** get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
! appendStringInfoString(buf, ") OVER ");
foreach(l, context->windowClause)
{
--- 7878,7884 ----
get_rule_expr((Node *) wfunc->aggfilter, context, false);
}
! appendStringInfoString(buf, ") ");
foreach(l, context->windowClause)
{
***************
*** 7868,7873 **** get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context)
--- 7886,7895 ----
if (wc->winref == wfunc->winref)
{
+ if (wc->frameOptions & FRAMEOPTION_IGNORE_NULLS)
+ appendStringInfoString(buf, "IGNORE NULLS ");
+ appendStringInfoString(buf, "OVER ");
+
if (wc->name)
appendStringInfoString(buf, quote_identifier(wc->name));
else
*** a/src/backend/utils/adt/windowfuncs.c
--- b/src/backend/utils/adt/windowfuncs.c
***************
*** 13,19 ****
--- 13,21 ----
*/
#include "postgres.h"
+ #include "nodes/bitmapset.h"
#include "utils/builtins.h"
+ #include "utils/memutils.h"
#include "windowapi.h"
/*
***************
*** 24,29 **** typedef struct rank_context
--- 26,53 ----
int64 rank; /* current rank */
} rank_context;
+
+ typedef struct leadlag_const_context
+ {
+ /* the index of the lead / lagged value */
+ int64 current_nonnull;
+
+ /* how many non-NULL tuples before we can start emitting? */
+ int64 lag_wait;
+ } leadlag_const_context;
+
+ /*
+ * lead-lag process helpers
+ */
+ #define BITMAP_EXISTS(bitmap, index) \
+ (((bitmap)[(index) / 4] >> (((index) % 4)*2 + 0)) & 0x01)
+ #define BITMAP_NONNULL(bitmap, index) \
+ (((bitmap)[(index) / 4] >> (((index) % 4)*2 + 1)) & 0x01)
+ #define BITMAP_SET_EXISTS(bitmap, index) \
+ ((bitmap)[(index) / 4] |= (0x1 << (((index) % 4)*2 + 0)))
+ #define BITMAP_SET_NONNULL(bitmap, index) \
+ ((bitmap)[(index) / 4] |= (0x1 << (((index) % 4)*2 + 1)))
+
/*
* ntile process information
*/
***************
*** 38,43 **** typedef struct
--- 62,71 ----
static bool rank_up(WindowObject winobj);
static Datum leadlag_common(FunctionCallInfo fcinfo,
bool forward, bool withoffset, bool withdefault);
+ static Datum win_get_arg_ignore_nulls(WindowObject winobj, int argno,
+ int offset, int seektype,
+ bool const_offset, bool *isnull,
+ bool *isout);
/*
***************
*** 280,286 **** window_ntile(PG_FUNCTION_ARGS)
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
! * withdefault indicates we have a default third argument.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
--- 308,315 ----
* common operation of lead() and lag()
* For lead() forward is true, whereas for lag() it is false.
* withoffset indicates we have an offset second argument.
! * withdefault indicates we have a default third argument. We'll only
! * return this default if the offset we want is outside of the partition.
*/
static Datum
leadlag_common(FunctionCallInfo fcinfo,
***************
*** 290,303 **** leadlag_common(FunctionCallInfo fcinfo,
int32 offset;
bool const_offset;
Datum result;
! bool isnull;
! bool isout;
if (withoffset)
{
offset = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
if (isnull)
PG_RETURN_NULL();
const_offset = get_fn_expr_arg_stable(fcinfo->flinfo, 1);
}
else
--- 319,343 ----
int32 offset;
bool const_offset;
Datum result;
! bool isnull = false;
! bool isout = false;
! bool ignore_nulls;
!
! /* is IGNORE NULLS specified? */
! ignore_nulls = (WinGetFrameOptions(winobj) & FRAMEOPTION_IGNORE_NULLS) != 0;
if (withoffset)
{
offset = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
if (isnull)
PG_RETURN_NULL();
+
+ /*
+ * We want to set the markpos (the earliest tuple we can access) as
+ * aggressively as possible to save memory, but if the offset isn't
+ * constant we really need random access on the partition (so can't
+ * mark at all).
+ */
const_offset = get_fn_expr_arg_stable(fcinfo->flinfo, 1);
}
else
***************
*** 305,325 **** leadlag_common(FunctionCallInfo fcinfo,
offset = 1;
const_offset = true;
}
! result = WinGetFuncArgInPartition(winobj, 0,
! (forward ? offset : -offset),
! WINDOW_SEEK_CURRENT,
! const_offset,
! &isnull, &isout);
if (isout)
{
/*
! * target row is out of the partition; supply default value if
! * provided. otherwise it'll stay NULL
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
}
if (isnull)
--- 345,388 ----
offset = 1;
const_offset = true;
}
+ if (!forward)
+ {
+ offset = -offset;
+ }
! if (ignore_nulls)
! {
! result = win_get_arg_ignore_nulls(winobj, 0,
! offset,
! WINDOW_SEEK_CURRENT,
! const_offset,
! &isnull, &isout);
! }
! else
! {
! result = WinGetFuncArgInPartition(winobj, 0,
! offset,
! WINDOW_SEEK_CURRENT,
! const_offset,
! &isnull, &isout);
! }
if (isout)
{
/*
! * Target row is out of the partition; supply default value if
! * provided.
*/
if (withdefault)
result = WinGetFuncArgCurrent(winobj, 2, &isnull);
+ else
+ {
+ /*
+ * No default: the output must be NULL, not whatever is lying
+ * around in result. The fetch above already set isnull.
+ */
+ Assert(isnull);
+ }
}
if (isnull)
***************
*** 329,334 **** leadlag_common(FunctionCallInfo fcinfo,
--- 392,674 ----
}
/*
+ * win_get_arg_ignore_nulls
+ *
+ * Like WinGetFuncArgInPartition, but skips over any rows where the argument
+ * evaluates to NULL as though they did not exist. Offset is relative to the
+ * current row (positive or negative). If offset is zero, *isout is set false,
+ * and the argument at the current row is returned, even if it is NULL.
+ *
+ * Some argument names differ from WinGetFuncArgInPartition, but the meanings
+ * are the same.
+ */
+ static Datum
+ win_get_arg_ignore_nulls(WindowObject winobj, int argno,
+ int offset, int seektype, bool const_offset,
+ bool *result_isnull, bool *result_isout)
+ {
+ bool local_isout = false;
+ bool local_isnull = false;
+ Datum result;
+
+ Assert(seektype == WINDOW_SEEK_CURRENT);
+
+ /*
+ * Special case: when applying the offset, we're ignoring all rows where
+ * the argument evaluates to NULL. But if the offset is zero, the standard
+ * requires that we evaluate the argument at the current row even if it is
+ * NULL.
+ */
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, argno,
+ offset,
+ WINDOW_SEEK_CURRENT,
+ const_offset,
+ result_isnull, result_isout);
+ Assert (*result_isout == false);
+ return result;
+ }
+
+ if (const_offset)
+ {
+ leadlag_const_context *context;
+
+ /*
+ * When the offset is constant, we can mark the position of the first
+ * tuple we'll need. This optimization allows unneeded tuples to be
+ * freed, which also avoids searching through them unnecessarily.
+ *
+ * For a positive offset, the mark will be set to the current row,
+ * because we need to know whether it's NULL.
+ *
+ * For a negative offset, the mark will always be set to the index of
+ * the 'offset'th non-NULL value before the current row, if it
+ * exists.
+ */
+
+ context = WinGetPartitionLocalMemory(winobj,
+ sizeof(leadlag_const_context));
+
+ /*
+ * First time through, initialize.
+ */
+ if (WinGetCurrentPosition(winobj) == 0)
+ {
+ context->current_nonnull = -1;
+
+ /*
+ * If offset is positive, scan until 'offset' non-NULL values are
+ * seen, updating 'current_nonnull'. Do not set the mark.
+ */
+ if (offset >= 0)
+ {
+ int i;
+
+ for (i = 0; i < offset && !local_isout; i++)
+ {
+ do
+ {
+ WinGetFuncArgInPartition(winobj, argno,
+ ++context->current_nonnull,
+ WINDOW_SEEK_HEAD,
+ false,
+ &local_isnull,
+ &local_isout);
+ }
+ while (local_isnull && !local_isout);
+ }
+
+ context->lag_wait = 0;
+ }
+ else
+ context->lag_wait = -offset;
+ }
+
+ /*
+ * Look at the current row to check whether it's NULL or not. If the
+ * offset is positive, use this opportunity to mark the position;
+ * otherwise it will be done later.
+ */
+ WinGetFuncArgInPartition(winobj, 0,
+ 0,
+ WINDOW_SEEK_CURRENT,
+ (offset >= 0),
+ &local_isnull, &local_isout);
+
+ /* offset negative, and haven't seen enough non-NULL values yet */
+ if (context->lag_wait > 0)
+ {
+ if (!local_isnull)
+ {
+ context->lag_wait--;
+ if (context->current_nonnull == -1)
+ context->current_nonnull = WinGetCurrentPosition(winobj);
+ }
+
+ *result_isnull = true;
+ *result_isout = true;
+ return (Datum) 0;
+ }
+
+ /* For negative offset, we fetch the value before scanning forward */
+ if (offset < 0)
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->current_nonnull,
+ WINDOW_SEEK_HEAD,
+ false,
+ result_isnull, result_isout);
+
+ /*
+ * If the current position's value is non-NULL, scan forward to find
+ * the next non-NULL value and update 'current_nonnull'. If offset is
+ * negative, then mark it.
+ *
+ * If the current position's value is NULL, then nothing changes.
+ */
+ if (!local_isnull)
+ {
+ do
+ {
+ WinGetFuncArgInPartition(winobj, 0,
+ ++context->current_nonnull,
+ WINDOW_SEEK_HEAD,
+ (offset < 0),
+ &local_isnull, &local_isout);
+ }
+ while (local_isnull && !local_isout);
+ }
+
+ /* For positive offset, we fetch the value after scanning forward */
+ if (offset >= 0)
+ result = WinGetFuncArgInPartition(winobj, 0,
+ context->current_nonnull,
+ WINDOW_SEEK_HEAD,
+ false,
+ result_isnull, result_isout);
+
+ return result;
+ }
+ else
+ {
+ char *bitmap;
+ int64 scanning,
+ current = WinGetCurrentPosition(winobj);
+
+ bool local_forward = (offset >= 0);
+
+ /*
+ * Create a bitmap with at least 2 * n positions. The first bit
+ * records whether we've examined that row before, and the second
+ * records whether it's non-NULL. No initialization is required,
+ * because it begins with all zeros, as it should.
+ *
+ * This bitmap allows us to randomly access the 'offset'th non-NULL
+ * value without scanning 'offset' (or more) tuples in the window each
+ * time. We still have to scan forward through the bitmap to skip over
+ * NULLs, but that's much cheaper.
+ */
+ bitmap = WinGetPartitionLocalMemory(
+ winobj, (WinGetPartitionRowCount(winobj) / 4) + 1);
+
+ /*
+ * We use offset >= 0 instead of just forward as the offset might be
+ * in the opposite direction to the way we're scanning. We'll then
+ * force offset to be positive to make counting down the rows easier.
+ */
+ local_forward = offset == 0 ? local_forward : (offset > 0);
+ offset = abs(offset);
+
+ for (scanning = current;; local_forward ? ++scanning : --scanning)
+ {
+ if (scanning < 0 || scanning >= WinGetPartitionRowCount(winobj))
+ {
+ local_isout = true;
+
+ /*
+ * As we're out of the window we want to return NULL or the
+ * default value, but not whatever's left in result. We'll use
+ * the isnull flag to say "ignore it"!
+ */
+ local_isnull = true;
+ result = (Datum) 0;
+
+ break;
+ }
+
+ if (BITMAP_EXISTS(bitmap, scanning))
+ {
+ local_isnull = !BITMAP_NONNULL(bitmap, scanning);
+ }
+ else
+ {
+ /*
+ * first time we've accessed this index; let's see if it's
+ * null:
+ */
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &local_isnull, &local_isout);
+ if (local_isout)
+ break;
+
+ BITMAP_SET_EXISTS(bitmap, scanning);
+ if (!local_isnull)
+ BITMAP_SET_NONNULL(bitmap, scanning);
+ }
+
+ /*
+ * Now the isnull flag is set correctly. If !isnull there's a
+ * chance that we may stop iterating here:
+ */
+ if (!local_isnull)
+ {
+ if (offset == 0)
+ {
+ result = WinGetFuncArgInPartition(winobj, 0,
+ scanning,
+ WINDOW_SEEK_HEAD,
+ false,
+ &local_isnull, &local_isout);
+
+ break;
+ }
+ else
+ --offset; /* it's not null, so we're one step closer to
+ * the value we want */
+ }
+ else if (scanning == current)
+ {
+ /*--------
+ * A slight edge case. Consider:
+ *
+ * =================
+ * A | lag(A, 1)
+ * =================
+ * 1 | NULL
+ * 2 | 1
+ * NULL | ?
+ * =================
+ *
+ * Does a lag of one when the current value is null mean go back to the first
+ * non-null value (i.e. 2), or find the previous non-null value of the first
+ * non-null value (i.e. 1)? We're implementing the former semantics, so we'll
+ * need to correct slightly:
+ *--------
+ */
+ --offset;
+ }
+ }
+ }
+
+ *result_isnull = local_isnull;
+ *result_isout = local_isout;
+ return result;
+ }
+
+ /*
* lag
* returns the value of VE evaluated on a row that is 1
* row before the current row within a partition,
*** a/src/include/nodes/parsenodes.h
--- b/src/include/nodes/parsenodes.h
***************
*** 420,438 **** typedef struct SortBy
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
! * implies overriding the window frame clause.
*/
typedef struct WindowDef
{
NodeTag type;
! char *name; /* window's own name */
! char *refname; /* referenced window name, if any */
! List *partitionClause; /* PARTITION BY expression list */
! List *orderClause; /* ORDER BY (list of SortBy) */
! int frameOptions; /* frame_clause options, see below */
! Node *startOffset; /* expression for starting bound, if any */
! Node *endOffset; /* expression for ending bound, if any */
! int location; /* parse location, or -1 if none/unknown */
} WindowDef;
/*
--- 420,457 ----
* For entries in a WINDOW list, "name" is the window name being defined.
* For OVER clauses, we use "name" for the "OVER window" syntax, or "refname"
* for the "OVER (window)" syntax, which is subtly different --- the latter
! * implies overriding the window frame clause. The semantics of each override
! * depends on the field.
*/
typedef struct WindowDef
{
NodeTag type;
! /* Window's own name. This must be NULL for overrides. */
! char *name;
! /* Referenced window name, if any. This must be present on overrides. */
! char *refname;
! /*
! * PARTITION BY expression list. If an override leaves this NULL, the
! * parent's partitionClause will be used.
! */
! List *partitionClause;
! /*
! * ORDER BY (list of SortBy). This field is ignored in overrides - the
! * parent's value will always be used.
! */
! List *orderClause;
! /*
! * The remaining fields in this struct must be specified on overrides,
! * even if the override's value is the same as the parent's.
! */
! /* frame_clause options, see below */
! int frameOptions;
! /* Expression for starting bound, if any */
! Node *startOffset;
! /* expression for ending bound, if any */
! Node *endOffset;
! /* parse location, or -1 if none/unknown */
! int location;
} WindowDef;
/*
***************
*** 457,462 **** typedef struct WindowDef
--- 476,482 ----
#define FRAMEOPTION_END_VALUE_PRECEDING 0x00800 /* end is V. P. */
#define FRAMEOPTION_START_VALUE_FOLLOWING 0x01000 /* start is V. F. */
#define FRAMEOPTION_END_VALUE_FOLLOWING 0x02000 /* end is V. F. */
+ #define FRAMEOPTION_IGNORE_NULLS 0x04000
#define FRAMEOPTION_START_VALUE \
(FRAMEOPTION_START_VALUE_PRECEDING | FRAMEOPTION_START_VALUE_FOLLOWING)
*** a/src/include/parser/kwlist.h
--- b/src/include/parser/kwlist.h
***************
*** 180,185 **** PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
--- 180,186 ----
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+ PG_KEYWORD("ignore", IGNORE, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
***************
*** 313,318 **** PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD)
--- 314,320 ----
PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD)
PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD)
PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD)
+ PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD)
PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD)
PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD)
PG_KEYWORD("returning", RETURNING, RESERVED_KEYWORD)
*** a/src/include/windowapi.h
--- b/src/include/windowapi.h
***************
*** 46,51 **** extern void *WinGetPartitionLocalMemory(WindowObject winobj, Size sz);
--- 46,53 ----
extern int64 WinGetCurrentPosition(WindowObject winobj);
extern int64 WinGetPartitionRowCount(WindowObject winobj);
+ extern int WinGetFrameOptions(WindowObject winobj);
+
extern void WinSetMarkPosition(WindowObject winobj, int64 markpos);
extern bool WinRowsArePeers(WindowObject winobj, int64 pos1, int64 pos2);
*** a/src/test/regress/expected/window.out
--- b/src/test/regress/expected/window.out
***************
*** 1822,1824 **** SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
--- 1822,2251 ----
5 | t | t | t
(5 rows)
+ -- check we haven't reserved words that might break backwards-compatibility:
+ CREATE TABLE reserved (
+ ignore text,
+ respect text,
+ nulls text
+ );
+ DROP TABLE reserved;
+ -- testing ignore nulls functionality
+ CREATE TEMPORARY TABLE dogs (
+ name text,
+ breed text,
+ age int
+ );
+ INSERT INTO dogs VALUES
+ ('ajax', 'mythological', NULL),
+ ('alfred', NULL, 8),
+ ('bones', 'shar pei', NULL),
+ ('churchill', 'bulldog', NULL),
+ ('lassie', NULL, 4),
+ ('mickey', 'poodle', 7),
+ ('molly', 'poodle', NULL),
+ ('rover', 'shar pei', 3);
+ -- test view definitions are preserved
+ CREATE TEMP VIEW v_dogs AS
+ SELECT
+ name,
+ sum(age) OVER (order by age rows between 1 preceding and 1 following) as sum_rows,
+ lag(age, 1) IGNORE NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(age, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM dogs
+ WINDOW w as (ORDER BY name ASC);
+ SELECT pg_get_viewdef('v_dogs');
+ pg_get_viewdef
+ --------------------------------------------------------------------------------------------------
+ SELECT dogs.name, +
+ sum(dogs.age) OVER (ORDER BY dogs.age ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS sum_rows,+
+ lag(dogs.age, 1) IGNORE NULLS OVER (ORDER BY dogs.name DESC) AS lagged_by_1, +
+ lag(dogs.age, 2) IGNORE NULLS OVER w AS lagged_by_2 +
+ FROM dogs +
+ WINDOW w AS (ORDER BY dogs.name);
+ (1 row)
+
+ CREATE FUNCTION volatile_int(INT) RETURNS INT AS
+ $$ BEGIN RETURN $1; END; $$
+ LANGUAGE PLPGSQL VOLATILE;
+ -- (1) lags by constant
+ SELECT name, lag(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 8
+ churchill |
+ lassie |
+ mickey | 4
+ molly | 7
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 8
+ churchill |
+ lassie |
+ mickey | 4
+ molly | 7
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 8
+ churchill | 8
+ lassie | 8
+ mickey | 4
+ molly | 7
+ rover | 7
+ (8 rows)
+
+ -- (2) leads by constant
+ SELECT name, lead(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred |
+ bones |
+ churchill | 4
+ lassie | 7
+ mickey |
+ molly | 3
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred |
+ bones |
+ churchill | 4
+ lassie | 7
+ mickey |
+ molly | 3
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred | 4
+ bones | 4
+ churchill | 4
+ lassie | 7
+ mickey | 3
+ molly | 3
+ rover |
+ (8 rows)
+
+ -- (3) lags by expression
+ SELECT name, lag(age * 2, volatile_int(1)) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 16
+ churchill |
+ lassie |
+ mickey | 8
+ molly | 14
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age * 2, volatile_int(1)) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 16
+ churchill |
+ lassie |
+ mickey | 8
+ molly | 14
+ rover |
+ (8 rows)
+
+ SELECT name, lag(age * 2, volatile_int(1)) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred |
+ bones | 16
+ churchill | 16
+ lassie | 16
+ mickey | 8
+ molly | 14
+ rover | 14
+ (8 rows)
+
+ -- (4) leads by expression
+ SELECT name, lead(age * 2, volatile_int(1)) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 16
+ alfred |
+ bones |
+ churchill | 8
+ lassie | 14
+ mickey |
+ molly | 6
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age * 2, volatile_int(1)) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 16
+ alfred |
+ bones |
+ churchill | 8
+ lassie | 14
+ mickey |
+ molly | 6
+ rover |
+ (8 rows)
+
+ SELECT name, lead(age * 2, volatile_int(1)) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 16
+ alfred | 8
+ bones | 8
+ churchill | 8
+ lassie | 14
+ mickey | 6
+ molly | 6
+ rover |
+ (8 rows)
+
+ -- (4) defaults
+ SELECT name, lead(age, 1, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred | 4
+ bones | 4
+ churchill | 4
+ lassie | 7
+ mickey | 3
+ molly | 3
+ rover | -1
+ (8 rows)
+
+ SELECT name, lead(age, 2, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 4
+ alfred | 7
+ bones | 7
+ churchill | 7
+ lassie | 3
+ mickey | -1
+ molly | -1
+ rover | -1
+ (8 rows)
+
+ SELECT name, lead(age, 3, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 7
+ alfred | 3
+ bones | 3
+ churchill | 3
+ lassie | -1
+ mickey | -1
+ molly | -1
+ rover | -1
+ (8 rows)
+
+ SELECT name, lead(age, volatile_int(1), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 8
+ alfred | 4
+ bones | 4
+ churchill | 4
+ lassie | 7
+ mickey | 3
+ molly | 3
+ rover | -1
+ (8 rows)
+
+ SELECT name, lead(age, volatile_int(2), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 4
+ alfred | 7
+ bones | 7
+ churchill | 7
+ lassie | 3
+ mickey | -1
+ molly | -1
+ rover | -1
+ (8 rows)
+
+ SELECT name, lead(age, volatile_int(3), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax | 7
+ alfred | 3
+ bones | 3
+ churchill | 3
+ lassie | -1
+ mickey | -1
+ molly | -1
+ rover | -1
+ (8 rows)
+
+ SELECT name, lag(age, 1, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax | -1
+ alfred | -1
+ bones | 8
+ churchill | 8
+ lassie | 8
+ mickey | 4
+ molly | 7
+ rover | 7
+ (8 rows)
+
+ SELECT name, lag(age, 2, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax | -1
+ alfred | -1
+ bones | -1
+ churchill | -1
+ lassie | -1
+ mickey | 8
+ molly | 4
+ rover | 4
+ (8 rows)
+
+ SELECT name, lag(age, 3, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax | -1
+ alfred | -1
+ bones | -1
+ churchill | -1
+ lassie | -1
+ mickey | -1
+ molly | 8
+ rover | 8
+ (8 rows)
+
+ SELECT name, lag(age, volatile_int(1), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax | -1
+ alfred | -1
+ bones | 8
+ churchill | 8
+ lassie | 8
+ mickey | 4
+ molly | 7
+ rover | 7
+ (8 rows)
+
+ SELECT name, lag(age, volatile_int(2), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax | -1
+ alfred | -1
+ bones | -1
+ churchill | -1
+ lassie | -1
+ mickey | 8
+ molly | 4
+ rover | 4
+ (8 rows)
+
+ SELECT name, lag(age, volatile_int(3), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax | -1
+ alfred | -1
+ bones | -1
+ churchill | -1
+ lassie | -1
+ mickey | -1
+ molly | 8
+ rover | 8
+ (8 rows)
+
+ -- these should be errors as the functionality isn't implemented yet:
+ SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ ERROR: IGNORE NULLS is only implemented for the lead and lag window functions
+ LINE 1: SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY na...
+ ^
+ SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ ERROR: IGNORE NULLS is only implemented for the lead and lag window functions
+ LINE 1: SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM...
+ ^
+ -- ensure that a zero offset still returns the current value, even if NULL
+ SELECT name, lead(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax |
+ alfred | 8
+ bones |
+ churchill |
+ lassie | 4
+ mickey | 7
+ molly |
+ rover | 3
+ (8 rows)
+
+ SELECT name, lag(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred | 8
+ bones |
+ churchill |
+ lassie | 4
+ mickey | 7
+ molly |
+ rover | 3
+ (8 rows)
+
+ SELECT name, lead(age, 0, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lead
+ -----------+------
+ ajax |
+ alfred | 8
+ bones |
+ churchill |
+ lassie | 4
+ mickey | 7
+ molly |
+ rover | 3
+ (8 rows)
+
+ SELECT name, lag(age, 0, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ name | lag
+ -----------+-----
+ ajax |
+ alfred | 8
+ bones |
+ churchill |
+ lassie | 4
+ mickey | 7
+ molly |
+ rover | 3
+ (8 rows)
+
+ DROP TABLE dogs CASCADE;
+ NOTICE: drop cascades to view v_dogs
+ DROP FUNCTION volatile_int(INT);
*** a/src/test/regress/sql/window.sql
--- b/src/test/regress/sql/window.sql
***************
*** 641,643 **** SELECT to_char(SUM(n::float8) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FO
--- 641,732 ----
SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
FROM (VALUES (1,true), (2,true), (3,false), (4,false), (5,true)) v(i,b)
WINDOW w AS (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING);
+
+ -- check we haven't reserved words that might break backwards-compatibility:
+ CREATE TABLE reserved (
+ ignore text,
+ respect text,
+ nulls text
+ );
+ DROP TABLE reserved;
+
+ -- testing ignore nulls functionality
+
+ CREATE TEMPORARY TABLE dogs (
+ name text,
+ breed text,
+ age int
+ );
+
+ INSERT INTO dogs VALUES
+ ('ajax', 'mythological', NULL),
+ ('alfred', NULL, 8),
+ ('bones', 'shar pei', NULL),
+ ('churchill', 'bulldog', NULL),
+ ('lassie', NULL, 4),
+ ('mickey', 'poodle', 7),
+ ('molly', 'poodle', NULL),
+ ('rover', 'shar pei', 3);
+
+ -- test view definitions are preserved
+ CREATE TEMP VIEW v_dogs AS
+ SELECT
+ name,
+ sum(age) OVER (order by age rows between 1 preceding and 1 following) as sum_rows,
+ lag(age, 1) IGNORE NULLS OVER (ORDER BY name DESC) AS lagged_by_1,
+ lag(age, 2) IGNORE NULLS OVER w AS lagged_by_2
+ FROM dogs
+ WINDOW w as (ORDER BY name ASC);
+ SELECT pg_get_viewdef('v_dogs');
+
+ CREATE FUNCTION volatile_int(INT) RETURNS INT AS
+ $$ BEGIN RETURN $1; END; $$
+ LANGUAGE PLPGSQL VOLATILE;
+
+ -- (1) lags by constant
+ SELECT name, lag(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- (2) leads by constant
+ SELECT name, lead(age) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- (3) lags by expression
+ SELECT name, lag(age * 2, volatile_int(1)) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age * 2, volatile_int(1)) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age * 2, volatile_int(1)) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- (4) leads by expression
+ SELECT name, lead(age * 2, volatile_int(1)) OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age * 2, volatile_int(1)) RESPECT NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age * 2, volatile_int(1)) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- (4) defaults
+ SELECT name, lead(age, 1, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age, 2, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age, 3, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age, volatile_int(1), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age, volatile_int(2), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age, volatile_int(3), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ SELECT name, lag(age, 1, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, 2, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, 3, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, volatile_int(1), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, volatile_int(2), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, volatile_int(3), -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- these should be errors as the functionality isn't implemented yet:
+ SELECT name, first_value(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, max(age) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ -- ensure that a zero offset still returns the current value, even if NULL
+ SELECT name, lead(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, 0) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lead(age, 0, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+ SELECT name, lag(age, 0, -1) IGNORE NULLS OVER (ORDER BY name) FROM dogs ORDER BY name;
+
+ DROP TABLE dogs CASCADE;
+ DROP FUNCTION volatile_int(INT);
On Thu, 2014-07-10 at 23:43 -0700, Jeff Davis wrote:
On Mon, 2014-07-07 at 01:21 -0700, Jeff Davis wrote:
On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
Thanks for the detailed feedback, I'm sorry it took so long to
incorporate it. I've attached the latest version of the patch, fixing
in particular:
As innocent as this patch seemed at first, it actually opens up a lot of
questions.
Attached is the (incomplete) edit of the patch so far.
I haven't received much feedback so far on my changes, and I don't think
I will finish hacking on this in the next few days, so I will mark it
"returned with feedback".
I don't think it's too far away, but my changes are substantial enough
(and incomplete enough) that some feedback is required.
Regards,
Jeff Davis
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Jeff
(Reviving an old thread from 2014...)
* Jeff Davis (pgsql@j-davis.com) wrote:
On Thu, 2014-07-10 at 23:43 -0700, Jeff Davis wrote:
On Mon, 2014-07-07 at 01:21 -0700, Jeff Davis wrote:
On Sun, 2014-07-06 at 21:11 -0700, Jeff Davis wrote:
On Wed, 2014-04-16 at 12:50 +0100, Nicholas White wrote:
Thanks for the detailed feedback, I'm sorry it took so long to
incorporate it. I've attached the latest version of the patch, fixing
in particular:
As innocent as this patch seemed at first, it actually opens up a lot of
questions.
Attached is the (incomplete) edit of the patch so far.
I haven't received much feedback so far on my changes, and I don't think
I will finish hacking on this in the next few days, so I will mark it
"returned with feedback".
I don't think it's too far away, but my changes are substantial enough
(and incomplete enough) that some feedback is required.
Would you have time to work on this for 9.7..? I came across a
real-world use case for exactly this capability and was sorely
disappointed to discover we didn't support it even though there had been
discussion for years on it, with quite a few interested parties.
I'll take a look at reviewing your last version of the patch but wanted
to get an idea of whether you still had interest and might be able to help.
Thanks!
Stephen
Old thread link:
/messages/by-id/CA+=vxNa5_N1q5q5OkxC0aQnNdbo2Ru6GVw+86wk+oNsUNJDLig@mail.gmail.com
On Thu, Apr 14, 2016 at 1:29 PM, Stephen Frost <sfrost@snowman.net> wrote:
Jeff
(Reviving an old thread from 2014...)
Would you have time to work on this for 9.7..? I came across a
real-world use case for exactly this capability and was sorely
disappointed to discover we didn't support it even though there had been
discussion for years on it, with quite a few interested parties.
I'll take a look at reviewing your last version of the patch but wanted
to get an idea of whether you still had interest and might be able to help.
Thanks!
Stephen
There are actually quite a few issues remaining here.
First, I think the syntax is still implemented in a bad way. Right now
it's part of the OVER clause, and the IGNORE NULLS gets put into the
frame options. It doesn't match the way the spec defines the grammar,
and I don't see how it really makes sense that it's a part of the
frame options or the window object at all. I believe that it should be
a part of the FuncCall, and end up in the FuncExpr, so the flag can be
read when the function is called without exposing frame options (which
we don't want to do). I think the reason it was done as a part of the
window object is so that it could save state between calls for the
purpose of optimization, but I think that can be done using fn_extra.
This change should remove a lot of the complexity about trying to
share window definitions, etc.
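To make the fn_extra suggestion concrete, here is a minimal standalone sketch of the lazy-initialization pattern it relies on. It is not code from the patch: SketchFmgrInfo is a hypothetical stand-in for the one FmgrInfo field that matters here, and LagState, get_lag_state and lag1_ignore_nulls_pos are names invented for illustration. In the server the allocation would presumably go through MemoryContextAlloc in flinfo->fn_mcxt rather than malloc, and the state would have to be reset at each partition boundary, which this sketch leaves out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the relevant part of PostgreSQL's FmgrInfo. */
typedef struct SketchFmgrInfo
{
    void *fn_extra;          /* per-call-site scratch pointer, NULL until first use */
} SketchFmgrInfo;

/* Hypothetical cached state for lag(x, 1) IGNORE NULLS. */
typedef struct LagState
{
    long last_nonnull_pos;   /* position of the latest non-NULL row seen, or -1 */
    long calls;              /* number of rows processed so far */
} LagState;

/* Allocate the state on the first call, then hand back the cached copy. */
static LagState *
get_lag_state(SketchFmgrInfo *flinfo)
{
    if (flinfo->fn_extra == NULL)
    {
        LagState *st = malloc(sizeof(LagState));

        if (st == NULL)
            abort();
        st->last_nonnull_pos = -1;
        st->calls = 0;
        flinfo->fn_extra = st;
    }
    return (LagState *) flinfo->fn_extra;
}

/*
 * Process the row at 'pos' (rows must arrive in order) and return the
 * position of the nearest earlier non-NULL row, or -1 if there is none.
 */
static long
lag1_ignore_nulls_pos(SketchFmgrInfo *flinfo, long pos, int isnull)
{
    LagState *st = get_lag_state(flinfo);
    long result = st->last_nonnull_pos;

    st->calls++;
    if (!isnull)
        st->last_nonnull_pos = pos;
    return result;
}
```

The point of the sketch is only that the flag and the cached position can live with the function call site, so nothing about IGNORE NULLS has to be visible in the frame options or the window object.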
Second, we need a way to tell which functions support IGNORE NULLS and
which do not. The grammar as implemented in the patch seems to allow
it for any function with an OVER clause (which can include ordinary
aggregates). Then the parse analysis hard-codes that only LEAD and LAG
accept IGNORE NULLS, and throws an error otherwise. Neither of these
things seems right. I think we need catalog support to say
whether a function honors IGNORE|RESPECT NULLS or not, which means we
also need support in CREATE FUNCTION.
I think the execution is pretty good, except that (a) we need to keep
the state in fn_extra rather than the winstate; and (b) we should get
rid of the bitmaps and just do a naive scan unless we really think
non-constant offsets will be important. We can always optimize more
later.
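For reference, the naive scan mentioned in (b) could look roughly like the standalone sketch below. It works on a plain array rather than through WinGetFuncArgInPartition, and NullableInt and naive_ignore_nulls are hypothetical names used only for illustration. It keeps the zero-offset rule from the patch: the current row is returned even when it is NULL.

```c
#include <stdbool.h>

/* Hypothetical nullable integer standing in for a (Datum, isnull) pair. */
typedef struct NullableInt
{
    int  value;
    bool isnull;
} NullableInt;

/*
 * Return the row that is 'offset' non-NULL values away from 'current'
 * (positive offset = lead, negative offset = lag). Rows whose value is
 * NULL are skipped without being counted. *isout is set to true when the
 * scan runs off either end of the partition, in which case the caller
 * would substitute the default value.
 */
static NullableInt
naive_ignore_nulls(const NullableInt *rows, long nrows, long current,
                   long offset, bool *isout)
{
    NullableInt none = {0, true};
    long step = (offset > 0) ? 1 : -1;
    long remaining = (offset > 0) ? offset : -offset;
    long pos;

    *isout = false;

    /* Zero offset: evaluate at the current row, NULL or not. */
    if (offset == 0)
        return rows[current];

    for (pos = current + step; pos >= 0 && pos < nrows; pos += step)
    {
        /* Only non-NULL rows count toward the offset. */
        if (!rows[pos].isnull && --remaining == 0)
            return rows[pos];
    }

    *isout = true;
    return none;
}
```

With the ages from the dogs regression table in name order (NULL, 8, NULL, NULL, 4, 7, NULL, 3), a lag of 1 at churchill finds 8, and a lead of 2 at mickey runs out of rows, which agrees with the expected output in the patch. Each call costs a scan proportional to the number of NULLs skipped, which is the trade-off against the bitmap approach.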
Regards,
Jeff Davis
Just doing a drive-by...
On Fri, May 20, 2016 at 3:50 PM, Jeff Davis <pgsql@j-davis.com> wrote:
Old thread link:
/messages/by-id/CA+=vxNa5_N1q5q5OkxC0aQnNdbo2Ru6GVw+86wk+oNsUNJDLig@mail.gmail.com
On Thu, Apr 14, 2016 at 1:29 PM, Stephen Frost <sfrost@snowman.net> wrote:
Jeff
(Reviving an old thread from 2014...)
Would you have time to work on this for 9.7..? I came across a
real-world use case for exactly this capability and was sorely
disappointed to discover we didn't support it even though there had been
discussion for years on it, with quite a few interested parties.
First, I think the syntax is still implemented in a bad way. Right now
it's part of the OVER clause, and the IGNORE NULLS gets put into the
frame options. It doesn't match the way the spec defines the grammar,
and I don't see how it really makes sense that it's a part of the
frame options or the window object at all.
How does the relatively new FILTER clause play into this, if at all?
I think we need catalog support to say
whether a function honors IGNORE|RESPECT NULLS or not, which means we
also need support in CREATE FUNCTION.
We already have "STRICT" for deciding whether a function processes nulls.
Wouldn't this need to exist on the "CREATE AGGREGATE" page?
Rhetorical question: I presume we are going to punt on the issue, since it
is non-standard, but what is supposed to happen with a window invocation
that ignores nulls but has a non-strict function that returns a non-null on
null input?
David J.
On Fri, May 20, 2016 at 1:41 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
How does the relatively new FILTER clause play into this, if at all?
My interpretation of the standard is that FILTER is not allowable for
a window function, and IGNORE|RESPECT NULLS is not allowable for an
ordinary aggregate.
So if we support IGNORE|RESPECT NULLS for anything other than a window
function, we have to come up with our own semantics.
We already have "STRICT" for deciding whether a function processes nulls.
Wouldn't this need to exist on the "CREATE AGGREGATE" page?
STRICT defines behavior at DDL time. I was suggesting that we might
want a DDL-time flag to indicate whether a function can make use of
the query-time IGNORE|RESPECT NULLS option. In other words, most
functions wouldn't know what to do with IGNORE|RESPECT NULLS, but
perhaps some would if we allowed them the option.
Perhaps I didn't understand your point?
Regards,
Jeff Davis
On Mon, May 23, 2016 at 12:01 PM, Jeff Davis <pgsql@j-davis.com> wrote:
On Fri, May 20, 2016 at 1:41 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
How does the relatively new FILTER clause play into this, if at all?
My interpretation of the standard is that FILTER is not allowable for
a window function, and IGNORE|RESPECT NULLS is not allowable for an
ordinary aggregate.
So if we support IGNORE|RESPECT NULLS for anything other than a window
function, we have to come up with our own semantics.
We already have "STRICT" for deciding whether a function processes nulls.
Wouldn't this need to exist on the "CREATE AGGREGATE" page?
STRICT defines behavior at DDL time. I was suggesting that we might
want a DDL-time flag to indicate whether a function can make use of
the query-time IGNORE|RESPECT NULLS option. In other words, most
functions wouldn't know what to do with IGNORE|RESPECT NULLS, but
perhaps some would if we allowed them the option.
Perhaps I didn't understand your point?
The "this" in the quote doesn't refer to STRICT but rather to the
not-yet-existing feature that we are talking about.
I am trying to make the point that the DDL syntax for "RESPECT|IGNORE
NULLS" should be on the "CREATE AGGREGATE" page and not the "CREATE
FUNCTION" page.
As far as a plain function is concerned, its only responsibility with respect to
NULLs is defined by the STRICT property. As this behavior only manifests
in aggregate situations, the corresponding property should be defined there
as well.
David J.
My interpretation of the standard is that FILTER is not allowable for
a window function, and IGNORE|RESPECT NULLS is not allowable for an
ordinary aggregate.
Yes, it is clear.
So if we support IGNORE|RESPECT NULLS for anything other than a window
function, we have to come up with our own semantics.
I don't think this clause is useful for aggregates especially while we
already have the FILTER clause. Though, I can see this error message
being useful:
ERROR: IGNORE NULLS is only implemented for the lead and lag window functions
Can we still give this message when the syntax is not part of the OVER clause?
Thank you for returning to this patch.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 23 May 2016 at 17:01, Jeff Davis <pgsql@j-davis.com> wrote:
On Fri, May 20, 2016 at 1:41 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
How does the relatively new FILTER clause play into this, if at all?
My interpretation of the standard is that FILTER is not allowable for
a window function, and IGNORE|RESPECT NULLS is not allowable for an
ordinary aggregate.
That may be so, but we already support FILTER for all window
functions as well as aggregates:
https://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-AGGREGATES
https://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS
So, to be clear, what we're talking about here is just supporting SQL
standard syntax for window functions, rather than adding any new
functionality, right?
So if we support IGNORE|RESPECT NULLS for anything other than a window
function, we have to come up with our own semantics.
Given that we can already do this using FILTER for aggregates, and
that IGNORE|RESPECT NULLS for aggregates is not part of the SQL
standard, I see no reason to support it.
Regards,
Dean
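A minimal sketch of the FILTER approach for ordinary aggregates. The inline VALUES list is made-up sample data, not something from the patch or the thread:

```sql
-- Sample rows: (id, v), with one NULL value in v.
SELECT count(*)                                              AS all_rows,
       count(*) FILTER (WHERE v IS NOT NULL)                 AS non_null_rows,
       array_agg(v ORDER BY id)                              AS keeps_nulls,
       array_agg(v ORDER BY id) FILTER (WHERE v IS NOT NULL) AS drops_nulls
FROM (VALUES (1, 10), (2, NULL), (3, 30)) AS t(id, v);
-- all_rows = 3, non_null_rows = 2,
-- keeps_nulls = {10,NULL,30}, drops_nulls = {10,30}
```

Since the filter condition is an arbitrary predicate, "ignore nulls" for an aggregate is just one special case of what FILTER can already express.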
"Dean" == Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Dean> That may be so, but we already support FILTER for all window
Dean> functions as well as aggregates:
Not so:
"If FILTER is specified, then only the input rows for which the
filter_clause evaluates to true are fed to the window function; other
rows are discarded. Only window functions that are aggregates accept a
FILTER clause."
(Per spec, FILTER appears only in the <aggregate function> production,
which is just one of the several options for <window function>.)
--
Andrew (irc:RhodiumToad)
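To illustrate the distinction, a small sketch against generate_series (the column alias is arbitrary and the queries are not taken from the patch):

```sql
-- sum() is an aggregate used as a window function, so FILTER is accepted.
SELECT g,
       sum(g) FILTER (WHERE g % 2 = 0) OVER (ORDER BY g) AS even_running_sum
FROM generate_series(1, 4) AS g;

-- lag() is a pure window function, so PostgreSQL rejects FILTER here.
-- Uncommenting this query raises an error at parse time:
-- SELECT g, lag(g) FILTER (WHERE g % 2 = 0) OVER (ORDER BY g)
-- FROM generate_series(1, 4) AS g;
```

This is why FILTER cannot stand in for IGNORE NULLS on lead and lag: the clause is simply not available for those functions.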
On 30 May 2016 at 15:44, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
"Dean" == Dean Rasheed <dean.a.rasheed@gmail.com> writes:
Dean> That may be so, but we already support FILTER for all window
Dean> functions as well as aggregates:
Not so:
"If FILTER is specified, then only the input rows for which the
filter_clause evaluates to true are fed to the window function; other
rows are discarded. Only window functions that are aggregates accept a
FILTER clause."
Ah, yes. It's all coming back to me now.
Regards,
Dean