Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Started by Oliver Fordalmost 8 years ago122 messageshackers

ojford@gmail.com

almost 8 years ago

Adds the options RESPECT/IGNORE NULLS (null treatment clause) and FROM
FIRST/LAST to the non-aggregate window functions.

A previous patch
(/messages/by-id/CA+=vxNa5_N1q5q5OkxC0aQnNdbo2Ru6GVw+86wk+oNsUNJDLig@mail.gmail.com)
partially implemented this feature. However, that patch worked by
adding the null treatment clause to the window frame's frameOptions
variable, and consequently had the limitation that it wasn't possible
to reuse a window frame definition in a single query where two
functions were called that had different null treatment options. This
meant that the patch was never committed. The attached path takes a
different approach which gets around this limitation.

For example, the following query would not work correctly with the
implementation in the old patch but does with the attached patch:

WITH cte (x) AS (
select null union select 1 union select 2)
SELECT x,
first_value(x) over w as with_default,
first_value(x) respect nulls over w as with_respect,
first_value(x) ignore nulls over w as with_ignore
from cte WINDOW w as (order by x nulls first rows between unbounded
preceding and unbounded following);
x | with_default | with_respect | with_ignore
---+--------------+--------------+-------------
| | | 1
1 | | | 1
2 | | | 1
(3 rows)

== Implementation ==

The patch adds two types to the pg_type catalog: "ignorenulls" and
"fromlast". These types are of the Boolean category, and work as
wrappers around the bool type. They are used as function arguments to
extra versions of the window functions that take additional boolean
arguments. RESPECT NULLS and FROM FIRST are ignored by the parser, but
IGNORE NULLS and FROM LAST lead to the extra versions being called
with arguments to ignore nulls and order from last.

== Testing ==

Updated documentation and added regression tests. All existing tests
pass. This change will need a catversion bump.
Thanks to Krasiyan Andreev for initially testing this patch.

David Fetter

david@fetter.org

over 7 years ago

In reply to: Oliver Ford (#1)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

On Fri, Jul 13, 2018 at 01:52:00PM +0100, Oliver Ford wrote:

Adds the options RESPECT/IGNORE NULLS (null treatment clause) and FROM
FIRST/LAST to the non-aggregate window functions.

Please find attached an updated version for OID drift.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Krasiyan Andreev

krasiyan@gmail.com

over 7 years ago

In reply to: David Fetter (#2)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Hi,
Patch applies and compiles, all included tests and building of the docs
pass.
I am using last version from more than two months ago in production
environment with real data and I didn't find any bugs,
so I'm marking this patch as ready for committer in the commitfest app.

На сб, 28.07.2018 г. в 22:00 ч. David Fetter <david@fetter.org> написа:

Show quoted text

On Fri, Jul 13, 2018 at 01:52:00PM +0100, Oliver Ford wrote:

Adds the options RESPECT/IGNORE NULLS (null treatment clause) and FROM
FIRST/LAST to the non-aggregate window functions.

Please find attached an updated version for OID drift.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Krasiyan Andreev (#3)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

"Krasiyan" == Krasiyan Andreev <krasiyan@gmail.com> writes:

Krasiyan> Hi,

Krasiyan> Patch applies and compiles, all included tests and building
Krasiyan> of the docs pass. I am using last version from more than two
Krasiyan> months ago in production environment with real data and I
Krasiyan> didn't find any bugs, so I'm marking this patch as ready for
Krasiyan> committer in the commitfest app.

Unfortunately, reviewing it from a committer perspective - I can't
possibly commit this as it stands, and anything I did to it would be
basically a rewrite of much of it.

Some of the problems could be fixed. For example the type names could be
given pg_* prefixes (it's clearly not acceptable to create random
special-purpose boolean subtypes in pg_catalog and _not_ give them such
a prefix), and the precedence hackery in gram.y could have comments
added (gram.y is already bad enough; _anything_ fancy with precedence
has to be described in the comments). But I don't like that hack with
the special types at all, and I think that needs a better solution.

Normally I'd push hard to try and get some solution that's sufficiently
generic to allow user-defined functions to make use of the feature. But
I think the SQL spec people have managed to make that literally
impossible in this case, what with the FROM keyword appearing in the
middle of a production and not followed by anything sufficiently
distinctive to even use for extra token lookahead.

Also, as has been pointed out in a number of previous features, we're
starting to accumulate identifiers that are reserved in subtly different
ways from our basic four-category system (which is itself a significant
elaboration compared to the spec's simple reserved/unreserved
distinction). As I recall this objection was specifically raised for
CUBE, but justified there by the existence of the contrib/cube extension
(and the fact that the standard CUBE() construct is used only in very
specific places in the syntax). This patch would make lead / lag /
first_value / last_value / nth_value syntactically "special" while not
actually reserving them (beyond having them in unreserved_keywords); I
think serious consideration should be given to whether they should
instead become col_name_keywords (which would, I believe, make it
unnecessary to mess with precedence).

Anyone have any thoughts or comments on the above?

--
Andrew (irc:RhodiumToad)

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Andrew Gierth (#4)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Normally I'd push hard to try and get some solution that's sufficiently
generic to allow user-defined functions to make use of the feature. But
I think the SQL spec people have managed to make that literally
impossible in this case, what with the FROM keyword appearing in the
middle of a production and not followed by anything sufficiently
distinctive to even use for extra token lookahead.

Yeah. Is there any appetite for a "Just Say No" approach? That is,
refuse to implement the spec's syntax on the grounds that it's too
brain-dead to even consider, and instead provide some less random,
more extensibility-friendly way to accomplish the same thing?

The FROM FIRST/LAST bit seems particularly badly thought through,
because AFAICS it is flat out ambiguous with a normal FROM clause
immediately following the window function call. The only way to
make it not so would be to make FIRST and LAST be fully reserved,
which is neither a good idea nor spec-compliant.

In short, there's a really good case to be made here that the SQL
committee is completely clueless about syntax design, and so we
shouldn't follow this particular pied piper.

... This patch would make lead / lag /
first_value / last_value / nth_value syntactically "special" while not
actually reserving them (beyond having them in unreserved_keywords); I
think serious consideration should be given to whether they should
instead become col_name_keywords (which would, I believe, make it
unnecessary to mess with precedence).

I agree that messing with the precedence rules is likely to have
unforeseen and undesirable side-effects. Generally, if you need
to create a precedence rule, that's because your grammar is
ambiguous. Precedence fixes that in a well-behaved way only for
cases that actually are very much like operator precedence rules.
Otherwise, you may just be papering over something that isn't
working very well. See e.g. commits 670a6c7a2 and 12b716457
for past cases where we learned that the hard way. (The latter
also points out that if you must have a precedence hack, it's
safer to hack individual rules than to stick precedences onto
terminal symbols.) In the case at hand, since the proposed patch
doesn't make FIRST and LAST be fully reserved, it seems just about
certain that it can be made to misbehave, including failing on
queries that were and should remain legal.

regards, tom lane

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Tom Lane (#5)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> The FROM FIRST/LAST bit seems particularly badly thought through,
Tom> because AFAICS it is flat out ambiguous with a normal FROM clause
Tom> immediately following the window function call. The only way to
Tom> make it not so would be to make FIRST and LAST be fully reserved,
Tom> which is neither a good idea nor spec-compliant.

In the actual spec syntax it's not ambiguous at all because NTH_VALUE is
a reserved word (as are LEAD, LAG, FIRST_VALUE and LAST_VALUE), and OVER
is a mandatory clause in its syntax, so a FROM appearing before the OVER
must be part of a FROM FIRST/LAST and not introducing a FROM-clause.

In our syntax, if we made NTH_VALUE etc. a col_name_keyword (and thus
not legal as a function name outside its own special syntax) it would
also become unambiguous.

i.e. given this token sequence (with . marking the current posision):

select nth_value(x) . from first ignore

if we know up front that "nth_value" is a window function and not any
other kind of function, we know that we have to shift the "from" rather
than reducing the select-list because we haven't seen an "over" yet.
(Neither "first" nor "ignore" are reserved, so "select foo(x) from first
ignore;" is a valid and complete query, and without reserving the
function name we'd need at least four tokens of lookahead to decide
otherwise.)

This is why I think the col_name_keyword option needs to be given
serious consideration - it still doesn't reserve the names as strongly
as the spec does, but enough to make the standard syntax work without
needing any dubious hacks.

--
Andrew (irc:RhodiumToad)

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Andrew Gierth (#6)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> The FROM FIRST/LAST bit seems particularly badly thought through,
Tom> because AFAICS it is flat out ambiguous with a normal FROM clause
Tom> immediately following the window function call. The only way to
Tom> make it not so would be to make FIRST and LAST be fully reserved,
Tom> which is neither a good idea nor spec-compliant.

In the actual spec syntax it's not ambiguous at all because NTH_VALUE is
a reserved word (as are LEAD, LAG, FIRST_VALUE and LAST_VALUE), and OVER
is a mandatory clause in its syntax, so a FROM appearing before the OVER
must be part of a FROM FIRST/LAST and not introducing a FROM-clause.

Hmm ...

In our syntax, if we made NTH_VALUE etc. a col_name_keyword (and thus
not legal as a function name outside its own special syntax) it would
also become unambiguous.
i.e. given this token sequence (with . marking the current posision):
select nth_value(x) . from first ignore
if we know up front that "nth_value" is a window function and not any
other kind of function, we know that we have to shift the "from" rather
than reducing the select-list because we haven't seen an "over" yet.

I don't really find that to be a desirable solution, because quite aside
from the extensibility problem, it would mean that a lot of errors
become "syntax error" where we formerly gave a more useful message.

This does open up a thought about how to proceed, though. I'd been
trying to think of a way to solve this using base_yylex's ability to
do some internal lookahead and change token types based on that.
If you just think of recognizing FROM FIRST/LAST, you get nowhere
because that's still legal in other contexts. But if you were to
look for FROM followed by FIRST/LAST followed by IGNORE/RESPECT/OVER,
I think that could only validly happen in this syntax. It'd take
some work to extend base_yylex to look ahead 2 tokens not one, but
I'm sure that could be done. (You'd also need a lookahead rule to
match "IGNORE/RESPECT NULLS OVER", but that seems just as doable.)
Then the relevant productions use FROM_LA, IGNORE_LA, RESPECT_LA
instead of the corresponding bare tokens, and the grammar no longer
has an ambiguity problem.

regards, tom lane

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Tom Lane (#7)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> If you just think of recognizing FROM FIRST/LAST, you get nowhere
Tom> because that's still legal in other contexts. But if you were to
Tom> look for FROM followed by FIRST/LAST followed by
Tom> IGNORE/RESPECT/OVER, I think that could only validly happen in
Tom> this syntax.

No; you need to go four tokens ahead in total, not three. Assuming
nth_value is unreserved, then

select nth_value(x) from first ignore;

is a valid query that has nth_value(x) as an expression, "first" as a
table name and "ignore" as its alias. Only when you see NULLS after
IGNORE, or OVER after FIRST/LAST, do you know that you're looking at
a window function and not a from clause.

So FROM_LA would have to mean "FROM" followed by any of:

FIRST IGNORE NULLS
LAST IGNORE NULLS
FIRST RESPECT NULLS
LAST RESPECT NULLS
FIRST OVER
LAST OVER

Remember that while OVER is reserved, all of FIRST, LAST, RESPECT and
IGNORE are unreserved.

Tom> It'd take some work to extend base_yylex to look ahead 2 tokens
Tom> not one, but I'm sure that could be done. (You'd also need a
Tom> lookahead rule to match "IGNORE/RESPECT NULLS OVER", but that
Tom> seems just as doable.) Then the relevant productions use FROM_LA,
Tom> IGNORE_LA, RESPECT_LA instead of the corresponding bare tokens,
Tom> and the grammar no longer has an ambiguity problem.

Yeah, but at the cost of having to extend base_yylex to go 3 tokens
ahead (not 2) rather than the current single lookahead slot.

Doable, certainly (probably not much harder to do 3 than 2 actually)

--
Andrew (irc:RhodiumToad)

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Andrew Gierth (#8)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> If you just think of recognizing FROM FIRST/LAST, you get nowhere
Tom> because that's still legal in other contexts. But if you were to
Tom> look for FROM followed by FIRST/LAST followed by
Tom> IGNORE/RESPECT/OVER, I think that could only validly happen in
Tom> this syntax.

No; you need to go four tokens ahead in total, not three. Assuming
nth_value is unreserved, then
select nth_value(x) from first ignore;
is a valid query that has nth_value(x) as an expression, "first" as a
table name and "ignore" as its alias.

No, because once IGNORE is a keyword, even unreserved, it's not legal
as an AS-less alias. We'd be breaking queries like that no matter what.

(I know there are people around here who'd like to remove that
restriction, but it's not happening anytime soon IMO.)

regards, tom lane

#10

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Tom Lane (#9)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

select nth_value(x) from first ignore;

Tom> No, because once IGNORE is a keyword, even unreserved, it's not
Tom> legal as an AS-less alias.

That rule only applies in the select-list, not in the FROM clause; table
aliases in FROM are just ColId, so they can be anything except a fully
reserved or type_func_name keyword.

--
Andrew (irc:RhodiumToad)

#11

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Tom Lane (#7)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

So I've tried to rough out a decision tree for the various options on
how this might be implemented (discarding the "use precedence hacks"
option). Opinions? Additions?

(formatted for emacs outline-mode)

* 1. use lexical lookahead

  +: relatively straightforward parser changes
  +: no new reserved words
  +: has the option of working extensibly with all functions

-: base_yylex needs extending to 3 lookahead tokens

** 1.1. Allow from/ignore clause on all (or all non-agg) window function calls

If the clauses are legal on all window functions, what to do about existing
window functions for which the clauses do not make sense?

*** 1.1.1. Ignore the clause when the function isn't aware of it

  +: simple
  -: somewhat surprising for users perhaps?

*** 1.1.2. Change the behavior of the windowapi in some consistent way

  Not sure if this can work.
  +: fairly simple(maybe?) and predictable
  -: changes the behavior of existing window functions

** 1.2. Allow from/ignore clause on only certain functions

  +: avoids any unexpected behavior
  -: needs some way to control what functions allow it

*** 1.2.1. Check the function name in parse analysis against a fixed list.

  +: simple
  -: not extensible

*** 1.2.2. Provide some option in CREATE FUNCTION

  +: extensible
  -: fairly intrusive, adding stuff to create function and pg_proc

*** 1.2.3. Do something magical with function argument types

  +: doesn't need changes in create function / pg_proc
  -: it's an ugly hack

* 2. reserve nth_value etc. as functions

  +: follows the spec reasonably well
  +: less of a hack than extending base_yylex

-: new reserved words
-: more parser rules
-: not extensible

(now goto 1.2.1)

* 3. "just say no" to the spec

e.g. add new functions like lead_ignore_nulls(), or add extra boolean
args to lead() etc. telling them to skip nulls

  +: simple
  -: doesn't conform to spec
  -: using extra args isn't quite the right semantics

--
Andrew (irc:RhodiumToad)

#12

Tom Lane

tgl@sss.pgh.pa.us

over 7 years ago

In reply to: Andrew Gierth (#11)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

So I've tried to rough out a decision tree for the various options on
how this might be implemented (discarding the "use precedence hacks"
option). Opinions? Additions?

I think it'd be worth at least drafting an implementation for the
lexical-lookahead fix. I think it's likely that we'll need to extend
base_yylex to do more lookahead in the future even if we don't do it
for this, given the SQL committee's evident love for COBOL-ish syntax
and lack of regard for what you can do in LALR(1).

The questions of how we interface to the individual window functions
are really independent of how we handle the parsing problem. My
first inclination is to just pass the flags down to the window functions
(store them in WindowObject and provide some additional inquiry functions
in windowapi.h) and let them deal with it.

If the clauses are legal on all window functions, what to do about existing
window functions for which the clauses do not make sense?

Option 1: do nothing, document that nothing happens if w.f. doesn't
implement it.

Option 2: record whether the inquiry functions got called. At end of
query, error out if they weren't and the options were used.

It's also worth wondering if we couldn't just implement the flags in
some generic fashion and not need to involve the window functions at
all. FROM LAST, for example, could and perhaps should be implemented
by inverting the sort order. Possibly IGNORE NULLS could be implemented
inside the WinGetFuncArgXXX functions? These behaviors might or might
not make much sense with other window functions, but that doesn't seem
like it's our problem.

regards, tom lane

#13

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Tom Lane (#12)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

So I've tried to rough out a decision tree for the various options
on how this might be implemented (discarding the "use precedence
hacks" option). Opinions? Additions?

Tom> I think it'd be worth at least drafting an implementation for the
Tom> lexical-lookahead fix. I think it's likely that we'll need to
Tom> extend base_yylex to do more lookahead in the future even if we
Tom> don't do it for this, given the SQL committee's evident love for
Tom> COBOL-ish syntax and lack of regard for what you can do in
Tom> LALR(1).

That's not _quite_ a fair criticism of the SQL committee; they're not
ignoring the capabilities of parsers, they're just not making their
syntax robust in the presence of the kinds of extensions and
generalizations that we want to do.

Without exception that I know of, every time we've run into a problem
that needed ugly precedence rules or extra lookahead it's been caused by
us not reserving something that is reserved in the spec, or by us
allowing constructs that are not in the spec at all (postfix operators,
especially), or by us deliberately generalizing what the spec allows
(e.g. allowing full expressions where the spec only allows column
references, or allowing extra parens around subselects) or where we've
repurposed syntax from the spec in an incompatible way (as in
ANY(array)).

Anyway, for the time being I will mark this patch as "returned with
feedback".

If the clauses are legal on all window functions, what to do about
existing window functions for which the clauses do not make sense?

Tom> Option 1: do nothing, document that nothing happens if w.f.
Tom> doesn't implement it.

That was 1.1.1 on my list.

Tom> Option 2: record whether the inquiry functions got called. At end
Tom> of query, error out if they weren't and the options were used.

Erroring at the _end_ of the query seems a bit of a potential surprise.

Tom> It's also worth wondering if we couldn't just implement the flags
Tom> in some generic fashion and not need to involve the window
Tom> functions at all.

That was what I meant by option 1.1.2 on my list.

Tom> FROM LAST, for example, could and perhaps should be implemented by
Tom> inverting the sort order.

Actually that can't work for reasons brought up in the recent discussion
of optimization of window function sorts: if you change the sort order
you potentially disturb the ordering of peer rows, and the spec requires
that an (nth_value(x,n) from last over w) and (otherfunc(x) over w) for
order-equivalent windows "w" must see the peer rows in the same order.

So FROM LAST really does have to keep the original sort order, and count
backwards from the end of the window.

Tom> Possibly IGNORE NULLS could be implemented inside the
Tom> WinGetFuncArgXXX functions? These behaviors might or might not
Tom> make much sense with other window functions, but that doesn't seem
Tom> like it's our problem.

That's about what I was thinking for option 1.1.2, yes.

--
Andrew (irc:RhodiumToad)

#14

Andrew Gierth

andrew@tao11.riddles.org.uk

over 7 years ago

In reply to: Krasiyan Andreev (#3)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

"Krasiyan" == Krasiyan Andreev <krasiyan@gmail.com> writes:

Krasiyan> I am using last version from more than two months ago in
Krasiyan> production environment with real data and I didn't find any
Krasiyan> bugs, so I'm marking this patch as ready for committer in the
Krasiyan> commitfest app.

Oliver (or anyone else), do you plan to continue working on this in the
immediate future, in line with the comments from myself and Tom in this
thread? If so I'll bump it to the next CF, otherwise I'll mark it
"returned with feedback".

I'm happy to help out with further work on this patch if needed, time
permitting.

--
Andrew (irc:RhodiumToad)

#15

Stephen Frost

sfrost@snowman.net

almost 6 years ago

In reply to: Andrew Gierth (#11)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Greetings,

This seems to have died out, and that's pretty unfortunate because this
is awfully useful SQL standard syntax that people look for and wish we
had.

* Andrew Gierth (andrew@tao11.riddles.org.uk) wrote:

So I've tried to rough out a decision tree for the various options on
how this might be implemented (discarding the "use precedence hacks"
option). Opinions? Additions?

(formatted for emacs outline-mode)

* 1. use lexical lookahead
+: relatively straightforward parser changes
+: no new reserved words
+: has the option of working extensibly with all functions
-: base_yylex needs extending to 3 lookahead tokens

This sounds awful grotty and challenging to do and get right, and the
alternative (just reserving these, as the spec does) doesn't seem so
draconian as to be that much of an issue.

* 2. reserve nth_value etc. as functions
+: follows the spec reasonably well
+: less of a hack than extending base_yylex
-: new reserved words
-: more parser rules
-: not extensible

For my 2c, at least, reserving these strikes me as entirely reasonable.
Yes, it sucks that we have to partially-reserve some additional
keywords, but such is life. I get that we'll throw syntax errors
sometimes when we might have given a better error, but I think we can
accept that.

(now goto 1.2.1)

Hmm, not sure this was right? but sure, I'll try...

*** 1.2.1. Check the function name in parse analysis against a fixed list.
+: simple
-: not extensible

Seems like this is more-or-less required since we'd be reserving them..?

*** 1.2.2. Provide some option in CREATE FUNCTION
+: extensible
-: fairly intrusive, adding stuff to create function and pg_proc

How would this work though, if we reserve the functions as keywords..?
Maybe I'm not entirely following, but wouldn't attempts to use other
functions end up with syntax errors in at least some of the cases,
meaning that having other functions support this wouldn't really work?
I don't particularly like the idea that some built-in functions would
always work but others would work but only some of the time.

*** 1.2.3. Do something magical with function argument types
+: doesn't need changes in create function / pg_proc
-: it's an ugly hack

Not really a fan of 'ugly hack'.

* 3. "just say no" to the spec

e.g. add new functions like lead_ignore_nulls(), or add extra boolean
args to lead() etc. telling them to skip nulls
+: simple
-: doesn't conform to spec
-: using extra args isn't quite the right semantics

Ugh, no thank you.

Thanks!

Stephen

#16

Krasiyan Andreev

krasiyan@gmail.com

almost 6 years ago

In reply to: Stephen Frost (#15)

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Thank you very much for feedback and yes, that is very useful SQL syntax.
Maybe you miss my previous answer, but you are right, that patch is
currently dead,
because some important design questions must be discussed here, before
patch rewriting.

I have dropped support of from first/last for nth_value(),
but also I reimplemented it in a different way,
by using negative number for the position argument, to be able to get the
same frame in exact reverse order.
After that patch becomes much more simple and some concerns about
precedence hack has gone.

I have not renamed special bool type "ignorenulls"
(I know that it is not acceptable way for calling extra version
of window functions, but also it makes things very easy and it can reuse
frames),
but I removed the other special bool type "fromlast".

Attached file was for PostgreSQL 13 (master git branch, last commit fest),
everything was working and patch was at the time in very good shape, all
tests was passed.

I read previous review and suggestions from Tom about special bool type and
unreserved keywords and also,
that IGNORE NULLS could be implemented inside the WinGetFuncArgXXX
functions,
but I am not sure how exactly to proceed (some example will be very
helpful).

На чт, 30.04.2020 г. в 21:58 Stephen Frost <sfrost@snowman.net> написа:

Show quoted text

Greetings,

This seems to have died out, and that's pretty unfortunate because this
is awfully useful SQL standard syntax that people look for and wish we
had.

* Andrew Gierth (andrew@tao11.riddles.org.uk) wrote:
So I've tried to rough out a decision tree for the various options on
how this might be implemented (discarding the "use precedence hacks"
option). Opinions? Additions?

(formatted for emacs outline-mode)

* 1. use lexical lookahead
+: relatively straightforward parser changes
+: no new reserved words
+: has the option of working extensibly with all functions
-: base_yylex needs extending to 3 lookahead tokens
This sounds awful grotty and challenging to do and get right, and the
alternative (just reserving these, as the spec does) doesn't seem so
draconian as to be that much of an issue.
* 2. reserve nth_value etc. as functions
+: follows the spec reasonably well
+: less of a hack than extending base_yylex
-: new reserved words
-: more parser rules
-: not extensible
For my 2c, at least, reserving these strikes me as entirely reasonable.
Yes, it sucks that we have to partially-reserve some additional
keywords, but such is life. I get that we'll throw syntax errors
sometimes when we might have given a better error, but I think we can
accept that.

(now goto 1.2.1)

Hmm, not sure this was right? but sure, I'll try...

*** 1.2.1. Check the function name in parse analysis against a fixed

list.
+: simple
-: not extensible
Seems like this is more-or-less required since we'd be reserving them..?
*** 1.2.2. Provide some option in CREATE FUNCTION
+: extensible
-: fairly intrusive, adding stuff to create function and pg_proc
How would this work though, if we reserve the functions as keywords..?
Maybe I'm not entirely following, but wouldn't attempts to use other
functions end up with syntax errors in at least some of the cases,
meaning that having other functions support this wouldn't really work?
I don't particularly like the idea that some built-in functions would
always work but others would work but only some of the time.
*** 1.2.3. Do something magical with function argument types
+: doesn't need changes in create function / pg_proc
-: it's an ugly hack
Not really a fan of 'ugly hack'.
* 3. "just say no" to the spec

e.g. add new functions like lead_ignore_nulls(), or add extra boolean
args to lead() etc. telling them to skip nulls
+: simple
-: doesn't conform to spec
-: using extra args isn't quite the right semantics
Ugh, no thank you.

Thanks!

Stephen

Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Attachments:

Attachments:

Attachments:

Attachments: