A bug in scan.l

Started by Gokulakannan Somasundaram, over 16 years ago, 5 messages
#1 Gokulakannan Somasundaram
gokul007@gmail.com

There is a rule like this in scan.l

uescapefail
("-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*"-"|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{whitespace}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU])

I think this should be corrected to

uescapefail
("-"|[uU][eE][sS][cC][aA][pP][eE]{space}*"-"|[uU][eE][sS][cC][aA][pP][eE]{space}*{quote}[^']|[uU][eE][sS][cC][aA][pP][eE]{space}*{quote}|[uU][eE][sS][cC][aA][pP][eE]{space}*|[uU][eE][sS][cC][aA][pP]|[uU][eE][sS][cC][aA]|[uU][eE][sS][cC]|[uU][eE][sS]|[uU][eE]|[uU])

I have replaced whitespace with space. This has to be done because
whitespace allows comments, which causes a conflict between some of the
alternatives. I found this while trying to make this rule work with LL(1),
and thought it might be useful.
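
The conflict described above can be sketched in Python. This is only an illustration: the SPACE, COMMENT and WHITESPACE patterns below are assumptions that paraphrase the scan.l definitions rather than copy them, and the helper names are hypothetical. It shows that a filler which admits comments can itself begin with "-", so a single character of lookahead after the keyword cannot tell the literal "-" apart from the start of a comment.

```python
import re

# Assumed definitions, paraphrasing scan.l (not copied from the source tree).
SPACE = r"[ \t\n\r\f]"
COMMENT = r"(?:--[^\n\r]*)"
WHITESPACE = rf"(?:{SPACE}+|{COMMENT})"
KEYWORD = r"[uU][eE][sS][cC][aA][pP][eE]"


def dash_can_start(filler: str) -> bool:
    """True if the filler pattern can begin with '-' (checked on the text '--')."""
    return re.fullmatch(filler, "--") is not None


def longest_prefix(filler: str, text: str) -> str:
    """Text consumed by the alternative KEYWORD filler* at the start of text."""
    match = re.match(rf"{KEYWORD}(?:{filler})*", text)
    return match.group(0) if match else ""


if __name__ == "__main__":
    for name, filler in (("whitespace", WHITESPACE), ("space", SPACE)):
        consumed = longest_prefix(filler, "UESCAPE--")
        print(name, dash_can_start(filler), repr(consumed))
```

With the comment-free filler, "-" can only ever be the literal dash of the second alternative, which is why the overlap disappears.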

Thanks,
Gokul.

#2 Gokulakannan Somasundaram
gokul007@gmail.com
In reply to: Gokulakannan Somasundaram (#1)
Re: A bug in scan.l

The previous change should be accompanied by this one:

uescape [uU][eE][sS][cC][aA][pP][eE]{space}*{quote}[^']{quote}

Thanks,
Gokul.

On Wed, Sep 2, 2009 at 7:35 AM, Gokulakannan Somasundaram <
gokul007@gmail.com> wrote:

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gokulakannan Somasundaram (#1)
Re: A bug in scan.l

Gokulakannan Somasundaram <gokul007@gmail.com> writes:

I have replaced whitespace with space. This has to be done because
whitespace allows comments. This would cause a conflict between some of the
alternatives. I found this while trying to make this rule work with LL(1).

Um, if it's ambiguous, why doesn't flex complain?

regards, tom lane

#4 Gokulakannan Somasundaram
gokul007@gmail.com
In reply to: Tom Lane (#3)
Re: A bug in scan.l

Well, I am at a very beginner level with Flex. I can see how flex copes
with it even though it is an ambiguity. Since it picks the rule that matches
the most text, and we don't allow a newline character in the rule, it works
fine. Even in LL(1) it works fine, but it throws warnings. So I just thought
of suggesting that the ambiguity be removed.
But do we need to allow comments as part of Unicode escapes?
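
The longest-match behaviour mentioned above can be sketched roughly as follows. This is a simplified model and not flex itself: the rule names, the two-rule list and the pick_rule helper are hypothetical, the patterns are assumed paraphrases of the scan.l definitions, and Python's backtracking re module only stands in for a real DFA on these small patterns. The idea is that every rule is tried at the current position, the longest lexeme wins, and the earlier rule wins a tie.

```python
import re

# Assumed definitions, paraphrasing scan.l (not copied from the source tree).
SPACE = r"[ \t\n\r\f]"
COMMENT = r"(?:--[^\n\r]*)"
WHITESPACE = rf"(?:{SPACE}+|{COMMENT})"
KEYWORD = r"[uU][eE][sS][cC][aA][pP][eE]"

# Two overlapping alternatives, listed in rule order (hypothetical names).
RULES = [
    ("keyword_then_dash", rf"{KEYWORD}{WHITESPACE}*-"),
    ("keyword_then_filler", rf"{KEYWORD}{WHITESPACE}*"),
]


def pick_rule(rules, text):
    """Return (name, lexeme) of the longest match at the start of text.

    A strictly longer lexeme replaces the current best, so on equal
    lengths the rule listed first is kept.
    """
    best_name, best_lexeme = None, ""
    for name, pattern in rules:
        match = re.match(pattern, text)
        if match and len(match.group(0)) > len(best_lexeme):
            best_name, best_lexeme = name, match.group(0)
    return best_name, best_lexeme


if __name__ == "__main__":
    for sample in ("UESCAPE -", "UESCAPE--x"):
        print(repr(sample), pick_rule(RULES, sample))
```

Because one alternative always ends up strictly longer on inputs like these, the overlap never changes which text is consumed, even though a one-token-lookahead parser generator reports it.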

Thanks,
Gokul.

On Wed, Sep 2, 2009 at 8:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gokulakannan Somasundaram (#4)
Re: A bug in scan.l

Gokulakannan Somasundaram <gokul007@gmail.com> writes:

Well, I am at a very beginner level with Flex. I can see how flex copes
with it even though it is an ambiguity. Since it picks the rule that matches
the most text, and we don't allow a newline character in the rule, it works
fine. Even in LL(1) it works fine, but it throws warnings. So I just thought
of suggesting that the ambiguity be removed.

Well, that whole rule is only there for implementation-specific reasons
--- a flex scanner is faster if it doesn't need to back up.  You might
be best off to just remove the anti-backup rules in the LL translation.
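
To illustrate what "backing up" means here, the following toy sketch walks the input one character at a time, remembers the last position at which some token was complete, and reports how many characters it had to give back when it got stuck. It is a deliberate simplification under stated assumptions: tokens are plain lowercase literals instead of flex patterns, and scan_once and with_all_prefixes are hypothetical helpers, not part of flex or PostgreSQL.

```python
def scan_once(tokens, text):
    """Return (lexeme, chars_backed_up) for one scan from the start of text.

    tokens is a set of literal strings. The scanner keeps reading while the
    text read so far is a prefix of some token, and remembers the last point
    at which a whole token had been read.
    """
    prefixes = {tok[:i] for tok in tokens for i in range(1, len(tok) + 1)}
    pos = 0
    last_accept = 0
    while pos < len(text) and text[: pos + 1] in prefixes:
        pos += 1
        if text[:pos] in tokens:
            last_accept = pos
    # Characters read past the last accepting point must be re-scanned.
    return text[:last_accept], pos - last_accept


def with_all_prefixes(tokens):
    """Add every proper prefix of every token as a token (the anti-backup idea)."""
    return {tok[:i] for tok in tokens for i in range(1, len(tok) + 1)}


if __name__ == "__main__":
    plain = {"u", "uescape"}
    print(scan_once(plain, "uesc!"))
    print(scan_once(with_all_prefixes(plain), "uesc!"))
```

Once every prefix is itself a token, each position the scanner reaches is an accepting one, so nothing has to be given back. That appears to be the role of the long list of keyword prefixes in the rule.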

But do we need to allow comments as part of unicode escapes?

If they're like normal strings, yes.

regression=# select 'this is' -- comment
regression-# ' one string';
      ?column?
--------------------
 this is one string
(1 row)

Don't blame us, blame the SQL committee. This was not one of their
better ideas IMO.

regards, tom lane