FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

Started by Josh Berkusover 10 years ago7 messages
#1Josh Berkus
josh@agliodbs.com

Folks:

SELECT
device_id,
count(*)::INT as present,
count(*)::INT FILTER (WHERE valid) as valid_count,
mode()::INT WITHIN GROUP (order by val) as mode,
percentile_disc(0.5)::INT WITHIN GROUP (order by val)
as median
FROM dataflow_0913
GROUP BY device_id
ORDER BY device_id;

ERROR: syntax error at or near "FILTER"
LINE 4: count(*)::INT FILTER (WHERE valid)
as valid_count,

The error is right, that's invalid syntax. I can't insert a ::INT
between the aggregate() and FILTER. However, the error message is also
rather confusing to the user; they're likely to look for their mistake
in the wrong place. The same goes for WITHIN GROUP (and OVER, too, I
think).

Is there some kind of possible HINT we could add to make this easier to
debug?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Josh Berkus (#1)
Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

On Thu, Apr 16, 2015 at 12:22 PM, Josh Berkus <josh@agliodbs.com> wrote:

Folks:

SELECT
device_id,
count(*)::INT as present,
count(*)::INT FILTER (WHERE valid) as valid_count,
mode()::INT WITHIN GROUP (order by val) as mode,
percentile_disc(0.5)::INT WITHIN GROUP (order by val)
as median
FROM dataflow_0913
GROUP BY device_id
ORDER BY device_id;

ERROR: syntax error at or near "FILTER"
LINE 4: count(*)::INT FILTER (WHERE valid)
as valid_count,

The error is right, that's invalid syntax. I can't insert a ::INT
between the aggregate() and FILTER. However, the error message is also
rather confusing to the user; they're likely to look for their mistake
in the wrong place. The same goes for WITHIN GROUP (and OVER, too, I
think).

​​SELECT count(*)::int OVER ()
FROM ( VALUES (1),(2),(3) ) src;

Is there some kind of possible HINT we could add to make this easier to
debug?

​Do you have a suggested hint so that the effort to make it work can be
compared to its usefulness?

​For kicks I ran the following - since "val::type" is simply another syntax
for "type(val)"...

SELECT ceil(count(*)) OVER ()
FROM ( VALUES (1),(2),(3) ) src

SQL Error: ERROR: OVER specified, but ceil is not a window function nor an
aggregate function
LINE 1: SELECT ceil(count(*)) OVER ()

The non-hint has been around as long as window functions and hasn't really
come up as an issue - not enough so to motivate a change at least. Is the
idiomatic usage of FILTER and WITHIN GROUP making this error more likely?

The foot-gun in your blog post is more problematic but also seemingly
impossible to avoid except through education of the user. It would not be
unreasonable to accept that the current error is acting like a canary and
forcing the user to go read the documentation on OVER/FILTER/WITHIN GROUP
and learn to write the expression as a single unit.

If this is not covered adequately enough in the documentation then that
should be remedied. Did you evaluate the documentation in that light while
preparing your blog post?

David J.

#3Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

On 04/16/2015 01:01 PM, David G. Johnston wrote:

If this is not covered adequately enough in the documentation then that
should be remedied. Did you evaluate the documentation in that light
while preparing your blog post?

Your response seems very defensive. I'm unclear on why discussing
improving an error message should get such a hostile response; it
certainly doesn't invite a serious reply.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Josh Berkus (#3)
Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 04/16/2015 01:01 PM, David G. Johnston wrote:

If this is not covered adequately enough in the documentation then that
should be remedied. Did you evaluate the documentation in that light
while preparing your blog post?

Your response seems very defensive. I'm unclear on why discussing
improving an error message should get such a hostile response; it
certainly doesn't invite a serious reply.

Mostly because ​I have no clue how difficult it would be​...

​I'm all for improvement but obviously if it was that simple you would have
just proposed the desired wording and asked someone to commit it. I was
trying to understand (and communicate my understanding) why it wasn't done
better in the first place.

Beyond that there is still the need to teach the user the correct syntax of
these functions - and not only point out when they get it wrong so they
blindly fix their typo - in order to better avoid the situation where the
syntax is valid but the outcome is unexpected. Two separate problems of
which the later seems more important than the former - and probably
somewhat easier to implement.

David J.

#5Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

On 04/16/2015 01:18 PM, David G. Johnston wrote:

On Thu, Apr 16, 2015 at 1:06 PM, Josh Berkus <josh@agliodbs.com
<mailto:josh@agliodbs.com>>wrote:

On 04/16/2015 01:01 PM, David G. Johnston wrote:

If this is not covered adequately enough in the documentation then that
should be remedied. Did you evaluate the documentation in that light
while preparing your blog post?

Your response seems very defensive. I'm unclear on why discussing
improving an error message should get such a hostile response; it
certainly doesn't invite a serious reply.

Mostly because ​I have no clue how difficult it would be​...

​I'm all for improvement but obviously if it was that simple you would
have just proposed the desired wording and asked someone to commit it.
I was trying to understand (and communicate my understanding) why it
wasn't done better in the first place.

Right. First, I don't know that it's possible for the parser to figure
out that error as opposed to actual mistakes in the filter clause, or
just straight gibberish. Second, I'm not sure how to write a succinct
HINT which gets the point across.

However, the error we're giving for windowing suggests that we can
improve things. In the example you quote:

SQL Error: ERROR: OVER specified, but ceil is not a window function nor
an aggregate function
LINE 1: SELECT ceil(count(*)) OVER ()

... that's actually a lot clearer error. However, we still get the
same error with casting:

josh=# select count(*)::INT OVER (order by device_id, val) from
dataflow_0913 ;
ERROR: syntax error at or near "OVER"
LINE 1: select count(*)::INT OVER (order by device_id, val) from dat...

... so maybe we can't.

Beyond that there is still the need to teach the user the correct syntax
of these functions - and not only point out when they get it wrong so
they blindly fix their typo - in order to better avoid the situation
where the syntax is valid but the outcome is unexpected. Two separate
problems of which the later seems more important than the former - and
probably somewhat easier to implement.

Absolutely. Hence my blog post; I'm hoping that people will at least
get a google hit on the error text.

The problem is, in the main docs, how do we reasonably document
*combining* features? In Syntax? My head hurts just thinking about it.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#1)
Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

Josh Berkus <josh@agliodbs.com> writes:

ERROR: syntax error at or near "FILTER"
LINE 4: count(*)::INT FILTER (WHERE valid) as valid_count,

The error is right, that's invalid syntax. I can't insert a ::INT
between the aggregate() and FILTER. However, the error message is also
rather confusing to the user; they're likely to look for their mistake
in the wrong place. The same goes for WITHIN GROUP (and OVER, too, I
think).

Is there some kind of possible HINT we could add to make this easier to
debug?

Probably not; or at least, IMO it would be a fundamental misjudgment to
go in with the idea of improving this one single case. It'd be cool if
we could improve reporting of syntax errors in general, though.

Unfortunately Bison doesn't provide a whole heckuva lot of support for
that. The only simple thing it offers is %error-verbose, which I've
experimented with in the past and come away with the impression that
it'd be completely useless for most people :-(. It seems to basically
add information about which tokens would be valid next at the point of
the error report. That isn't any help for the sort of user conceptual
error you're talking about here.

You could imagine teaching yyerror() to have some SQL-specific knowledge
that it could apply by comparing the current lookahead token to the
current parse state stack ... but neither of those things are exposed
to it by Bison. We could probably get the lookahead token indirectly
by inspecting the lexer's state, but it's not clear that's enough to
do much. In this particular example, that would mean showing the exact
same hint whenever a syntax error is reported at an OVER token; and
I'm afraid that it'd be mighty hard to write a hint for that that
wouldn't frequently do more harm than good. (Note in particular that
since OVER, FILTER, and WITHIN are unreserved keywords, they might be
used as simple column/table/function names.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#1)
Re: FILTER/WITHIN GROUP vs. expressions; is a HINT possible here?

On 04/16/2015 04:36 PM, Tom Lane wrote:

You could imagine teaching yyerror() to have some SQL-specific knowledge
that it could apply by comparing the current lookahead token to the
current parse state stack ... but neither of those things are exposed
to it by Bison. We could probably get the lookahead token indirectly
by inspecting the lexer's state, but it's not clear that's enough to
do much. In this particular example, that would mean showing the exact
same hint whenever a syntax error is reported at an OVER token; and

Yeah, and that's what we *don't* want to do. That's why I was wondering
if it was even possible to figure out the "extra syntax" case.

I'm afraid that it'd be mighty hard to write a hint for that that
wouldn't frequently do more harm than good. (Note in particular that
since OVER, FILTER, and WITHIN are unreserved keywords, they might be
used as simple column/table/function names.)

No kidding. Heck, I just spent 1/2 hour figuring out a bug which was
actually caused by the fact that a user created a view called "mode".

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers