Better handling of parse errors

pgman@candle.pha.pa.us

over 23 years ago

In reply to: Gavin Sherry (#1)

Re: Better handling of parse errors

Can we see some sample output?

---------------------------------------------------------------------------

Gavin Sherry wrote:

Hi all,

Attached is a small patch to scan.l for consideration. It hands
yyerror() the position in the query string of the token which caused a
parse error. It is not even close to an implementation of error handling
a-la SQL99 but it certainly makes debugging complicated queries easier.

I've done some testing and it appears to hit the offending token pretty
accurately.

Can anyone find a way to break this? If not, I'd love to see it in 7.3.

Gavin

Content-Description:

[ Attachment, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

swm@linuxworld.com.au

over 23 years ago

In reply to: Bruce Momjian (#2)

Re: Better handling of parse errors

On Tue, 6 Aug 2002, Bruce Momjian wrote:

Can we see some sample output?

template1=# select * frum pg_class;
ERROR: parser: parse error at or near "frum" at character 10
template1=# select relname from pg_class a excepr select relname from
pg_class limit a;
ERROR: parser: parse error at or near "excepr" at character 32
template1=# create table a (desc int);
ERROR: parser: parse error at or near "desc" at character 17

Gavin

Show quoted text

---------------------------------------------------------------------------

Gavin Sherry wrote:

Hi all,

Attached is a small patch to scan.l for consideration. It hands
yyerror() the position in the query string of the token which caused a
parse error. It is not even close to an implementation of error handling
a-la SQL99 but it certainly makes debugging complicated queries easier.

I've done some testing and it appears to hit the offending token pretty
accurately.

Can anyone find a way to break this? If not, I'd love to see it in 7.3.

Gavin

Content-Description:

[ Attachment, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

swm@linuxworld.com.au

over 23 years ago

In reply to: Bruce Momjian (#2)

Re: Better handling of parse errors

On Tue, 6 Aug 2002, Bruce Momjian wrote:

Can we see some sample output?

One other important one:

template1=# select
template1-# *
template1-#
template1-# frum
template1-# pg_class;
ERROR: parser: parse error at or near "frum" at character 10

Note that it counts non-printable characters. This could be both good and
bad.

Gavin

tgl@sss.pgh.pa.us

over 23 years ago

In reply to: Gavin Sherry (#1)

Re: Better handling of parse errors

Gavin Sherry <swm@linuxworld.com.au> writes:

Attached is a small patch to scan.l for consideration. It hands
yyerror() the position in the query string of the token which caused a
parse error.

Isn't that the hard way to do it? Seems like you could just subtract
scanbuf from the error pointer, instead of adding overhead to the basic
lex loop.

Things that need to be decided (and documented somewhere):

Is the number an offset (counted from 0) or an index (counted from 1)?
Which end of the token does it point at? Can the message be phrased
so as to make it reasonably clear what the number is?

A related change I'd been meaning to make is to get it to say
parse error at end of input
when that's the case, rather than the rather useless
parse error at or near ""
that you get now. I'd still be inclined to just say "end of input"
in that case, and not bother with a character count.

regards, tom lane

pgman@candle.pha.pa.us

over 23 years ago

In reply to: Tom Lane (#5)

Re: Better handling of parse errors

Gavin, have you answered these issues brought up about the patch?

---------------------------------------------------------------------------

Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

Attached is a small patch to scan.l for consideration. It hands
yyerror() the position in the query string of the token which caused a
parse error.

Isn't that the hard way to do it? Seems like you could just subtract
scanbuf from the error pointer, instead of adding overhead to the basic
lex loop.

Things that need to be decided (and documented somewhere):

Is the number an offset (counted from 0) or an index (counted from 1)?
Which end of the token does it point at? Can the message be phrased
so as to make it reasonably clear what the number is?

A related change I'd been meaning to make is to get it to say
parse error at end of input
when that's the case, rather than the rather useless
parse error at or near ""
that you get now. I'd still be inclined to just say "end of input"
in that case, and not bother with a character count.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

swm@linuxworld.com.au

over 23 years ago

In reply to: Bruce Momjian (#6)

Re: Better handling of parse errors

Bruce,

I was working on this on the train this morning. There are a few
possibilities I'd like to look at before I submit another patch.

Before I do, there is one important question that I didn't raise when I
submitted the first patch (which was only a proof of concept kind of
patch). Namely: do we want to modify every 7.2 error message? Many people
have written error message parsers into their applications to make up for
the fact that Postgres doesn't use error codes or issue 'standardised'
error messages.

Is this going to annoy people too much? Should it be delayed until we have
SQL99 diagnostics/error codes?

Gavin

On Wed, 14 Aug 2002, Bruce Momjian wrote:

Show quoted text

Gavin, have you answered these issues brought up about the patch?

---------------------------------------------------------------------------

Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

Attached is a small patch to scan.l for consideration. It hands
yyerror() the position in the query string of the token which caused a
parse error.

Isn't that the hard way to do it? Seems like you could just subtract
scanbuf from the error pointer, instead of adding overhead to the basic
lex loop.

Things that need to be decided (and documented somewhere):

Is the number an offset (counted from 0) or an index (counted from 1)?
Which end of the token does it point at? Can the message be phrased
so as to make it reasonably clear what the number is?

A related change I'd been meaning to make is to get it to say
parse error at end of input
when that's the case, rather than the rather useless
parse error at or near ""
that you get now. I'd still be inclined to just say "end of input"
in that case, and not bother with a character count.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

pgman@candle.pha.pa.us

over 23 years ago

In reply to: Gavin Sherry (#7)

Re: Better handling of parse errors

I don't think anyone will mind, but you can throw a message to 'general'
just to be sure.

---------------------------------------------------------------------------

Gavin Sherry wrote:

Bruce,

I was working on this on the train this morning. There are a few
possibilities I'd like to look at before I submit another patch.

Before I do, there is one important question that I didn't raise when I
submitted the first patch (which was only a proof of concept kind of
patch). Namely: do we want to modify every 7.2 error message? Many people
have written error message parsers into their applications to make up for
the fact that Postgres doesn't use error codes or issue 'standardised'
error messages.

Is this going to annoy people too much? Should it be delayed until we have
SQL99 diagnostics/error codes?

Gavin

On Wed, 14 Aug 2002, Bruce Momjian wrote:

Gavin, have you answered these issues brought up about the patch?

---------------------------------------------------------------------------

Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

Attached is a small patch to scan.l for consideration. It hands
yyerror() the position in the query string of the token which caused a
parse error.

Isn't that the hard way to do it? Seems like you could just subtract
scanbuf from the error pointer, instead of adding overhead to the basic
lex loop.

Things that need to be decided (and documented somewhere):

Is the number an offset (counted from 0) or an index (counted from 1)?
Which end of the token does it point at? Can the message be phrased
so as to make it reasonably clear what the number is?

A related change I'd been meaning to make is to get it to say
parse error at end of input
when that's the case, rather than the rather useless
parse error at or near ""
that you get now. I'd still be inclined to just say "end of input"
in that case, and not bother with a character count.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

tgl@sss.pgh.pa.us

over 23 years ago

In reply to: Gavin Sherry (#7)

Re: Better handling of parse errors

Gavin Sherry <swm@linuxworld.com.au> writes:

... do we want to modify every 7.2 error message?

Nyet ... but I don't think tacking an offset onto the end of
"parse error at or near foo" messages is likely to cause the
sort of generalized havoc you suggest ...

I'm on record as favoring a thorough redesign of the error-reporting
facility, and I hope to see that happen in conjunction with a frontend
protocol change in 7.4. But I think your proposed change in this patch
is fairly harmless and could be done now.

regards, tom lane

#10

swm@linuxworld.com.au

over 23 years ago

In reply to: Tom Lane (#9)

1 attachment(s)

Re: [HACKERS] Better handling of parse errors

On Wed, 14 Aug 2002, Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

... do we want to modify every 7.2 error message?

Nyet ... but I don't think tacking an offset onto the end of
"parse error at or near foo" messages is likely to cause the
sort of generalized havoc you suggest ...

In that case, attached is a patch which locates the beginning of the
offending token more efficiently (per your suggestion of using
scanbuf). The new patch does the same as before:

template1=# select * frum pg_class;
ERROR: parser: parse error at or near "frum" at character 10

It also implement's Tom's suggestion:

template1=# select * from pg_class where\g
ERROR: parse: parse error at end of input

Gavin

#11

http://candle.pha.pa.us/cgi-bin/pgpatches

pgman@candle.pha.pa.us

over 23 years ago

In reply to: Gavin Sherry (#10)

Re: [HACKERS] Better handling of parse errors

Your patch has been added to the PostgreSQL unapplied patches list at:

I will try to apply it within the next 48 hours.

---------------------------------------------------------------------------

Gavin Sherry wrote:

On Wed, 14 Aug 2002, Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

... do we want to modify every 7.2 error message?

Nyet ... but I don't think tacking an offset onto the end of
"parse error at or near foo" messages is likely to cause the
sort of generalized havoc you suggest ...

In that case, attached is a patch which locates the beginning of the
offending token more efficiently (per your suggestion of using
scanbuf). The new patch does the same as before:

template1=# select * frum pg_class;
ERROR: parser: parse error at or near "frum" at character 10

It also implement's Tom's suggestion:

template1=# select * from pg_class where\g
ERROR: parse: parse error at end of input

Gavin

Content-Description:

[ Attachment, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#12

pgman@candle.pha.pa.us

over 23 years ago

In reply to: Gavin Sherry (#10)

Re: [HACKERS] Better handling of parse errors

Patch applied. Thanks.

---------------------------------------------------------------------------

Gavin Sherry wrote:

On Wed, 14 Aug 2002, Tom Lane wrote:

Gavin Sherry <swm@linuxworld.com.au> writes:

... do we want to modify every 7.2 error message?

Nyet ... but I don't think tacking an offset onto the end of
"parse error at or near foo" messages is likely to cause the
sort of generalized havoc you suggest ...

In that case, attached is a patch which locates the beginning of the
offending token more efficiently (per your suggestion of using
scanbuf). The new patch does the same as before:

template1=# select * frum pg_class;
ERROR: parser: parse error at or near "frum" at character 10

It also implement's Tom's suggestion:

template1=# select * from pg_class where\g
ERROR: parse: parse error at end of input

Gavin

Content-Description:

[ Attachment, skipping... ]

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#13

Peter Eisentraut

peter_e@gmx.net

over 23 years ago

In reply to: Gavin Sherry (#10)

Re: [PATCHES] Better handling of parse errors

Gavin Sherry writes:

In that case, attached is a patch which locates the beginning of the
offending token more efficiently (per your suggestion of using
scanbuf).

In the regression tests there are a couple of cases that could be
improved:

In strings.sql:

-- illegal string continuation syntax
SELECT 'first line'
' - next line' /* this comment is not allowed here */
' - third line'
AS "Illegal comment within continuation";
ERROR: parser: parse error at or near "' - third line'" at character 89

Character 89 is the end of the "third line" line, but the parse error is
at the beginning of that line.

In create_function_1.sql:

CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
AS 'not even SQL';
ERROR: parser: parse error at or near "not" at character 1

Clearly confusing.

--
Peter Eisentraut peter_e@gmx.net

#14

tgl@sss.pgh.pa.us

over 23 years ago

In reply to: Peter Eisentraut (#13)

Re: [PATCHES] Better handling of parse errors

Peter Eisentraut <peter_e@gmx.net> writes:

In strings.sql:

-- illegal string continuation syntax
SELECT 'first line'
' - next line' /* this comment is not allowed here */
' - third line'
AS "Illegal comment within continuation";
ERROR: parser: parse error at or near "' - third line'" at character 89

Character 89 is the end of the "third line" line, but the parse error is
at the beginning of that line.

This is fixed as of my later commit.

In create_function_1.sql:

CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
AS 'not even SQL';
ERROR: parser: parse error at or near "not" at character 1

Clearly confusing.

"Character 1" is correct as of the context that the parser is working
in, namely the function body. I don't think we can do much to change
that, but perhaps we could make the message read like
ERROR: parser: parse error at or near "not" at character 1 of function body
This would require giving the parser some sort of context-identifying
string to tack onto the message, but that doesn't seem too hard.

regards, tom lane

#15

swm@linuxworld.com.au

over 23 years ago

In reply to: Tom Lane (#14)

Re: [PATCHES] Better handling of parse errors

On Sun, 18 Aug 2002, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

In strings.sql:

-- illegal string continuation syntax
SELECT 'first line'
' - next line' /* this comment is not allowed here */
' - third line'
AS "Illegal comment within continuation";
ERROR: parser: parse error at or near "' - third line'" at character 89

Character 89 is the end of the "third line" line, but the parse error is
at the beginning of that line.

This is fixed as of my later commit.

In create_function_1.sql:

CREATE FUNCTION test1 (int) RETURNS int LANGUAGE sql
AS 'not even SQL';
ERROR: parser: parse error at or near "not" at character 1

Clearly confusing.

"Character 1" is correct as of the context that the parser is working
in, namely the function body. I don't think we can do much to change
that, but perhaps we could make the message read like
ERROR: parser: parse error at or near "not" at character 1 of function body
This would require giving the parser some sort of context-identifying
string to tack onto the message, but that doesn't seem too hard.

Tom,

Reworking the code to taken into account token_start seems to work.

elog(ERROR, "parser: %s at or near \"%s\" at character %i",
message,token_start ? token_start : yytext,
token_start ? (unsigned int)(token_start - scanbuf + 1) :
(unsigned int)(yytext - scanbuf + 1));

I will submit a patch once I do some more testing to find other possible
situations where this plays up.

Gavin

#16

tgl@sss.pgh.pa.us

over 23 years ago

In reply to: Gavin Sherry (#15)

Re: [PATCHES] Better handling of parse errors

Gavin Sherry <swm@linuxworld.com.au> writes:

Reworking the code to taken into account token_start seems to work.

Yes, I did that last night ...

regards, tom lane

#17

Peter Eisentraut

peter_e@gmx.net

over 23 years ago

In reply to: Tom Lane (#14)

Re: [PATCHES] Better handling of parse errors

Tom Lane writes:

"Character 1" is correct as of the context that the parser is working
in, namely the function body. I don't think we can do much to change
that, but perhaps we could make the message read like
ERROR: parser: parse error at or near "not" at character 1 of function body

That would be better.

--
Peter Eisentraut peter_e@gmx.net

#18