use-regular-expressions-to-simplify-less_greater-and-not_equals.patch
we can use regular expressions (<>|!=) to cover "<>" and "!=". There is no
need to have two definitions less_greater and not_equals, because it will confuse developer.
So, we can use only not_equals to cover this operator set.
Please review my patch.
Thanks.
Attachments:
use-regular-expressions-to-simplify-less_greater-and-not_equals.patchapplication/octet-streamDownload
Date: Wed, 11 Aug 2021 10:41:18 +0800
Subject: [PATCH] backend/parser: use regular expressions to simplify
less_greater and not_equals
we can use regular expressions (<>|!=) to cover "<>" and "!=", so there is no
need to have two definitions less_greater and not_equals, it will confuse developer.
So, we can use only not_equals to cover this operator set.
Signed-off-by: Shihao Sun <sunshihao.ssh@alibabab-inc.com>
Signed-off-by: Xiaojian Fan <funnyxj.fxj@alibaba-inc.com>
---
src/backend/parser/scan.l | 9 +--------
src/fe_utils/psqlscan.l | 7 +------
src/interfaces/ecpg/preproc/pgc.l | 8 +-------
3 files changed, 3 insertions(+), 21 deletions(-)
diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index 6e6824faeb..0297270ed4 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
@@ -363,8 +363,7 @@ colon_equals ":="
equals_greater "=>"
less_equals "<="
greater_equals ">="
-less_greater "<>"
-not_equals "!="
+not_equals (<>|!=)
/*
* "self" is the set of chars that should be returned as single-character
@@ -842,12 +841,6 @@ other .
return GREATER_EQUALS;
}
-{less_greater} {
- /* We accept both "<>" and "!=" as meaning NOT_EQUALS */
- SET_YYLLOC();
- return NOT_EQUALS;
- }
-
{not_equals} {
/* We accept both "<>" and "!=" as meaning NOT_EQUALS */
SET_YYLLOC();
diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l
index 0fab48a382..3e73c967ab 100644
--- a/src/fe_utils/psqlscan.l
+++ b/src/fe_utils/psqlscan.l
@@ -301,8 +301,7 @@ colon_equals ":="
equals_greater "=>"
less_equals "<="
greater_equals ">="
-less_greater "<>"
-not_equals "!="
+not_equals (<>|!=)
/*
* "self" is the set of chars that should be returned as single-character
@@ -618,10 +617,6 @@ other .
ECHO;
}
-{less_greater} {
- ECHO;
- }
-
{not_equals} {
ECHO;
}
diff --git a/src/interfaces/ecpg/preproc/pgc.l b/src/interfaces/ecpg/preproc/pgc.l
index 7a0356638d..7c8821b569 100644
--- a/src/interfaces/ecpg/preproc/pgc.l
+++ b/src/interfaces/ecpg/preproc/pgc.l
@@ -329,8 +329,7 @@ colon_equals ":="
equals_greater "=>"
less_equals "<="
greater_equals ">="
-less_greater "<>"
-not_equals "!="
+not_equals (<>|!=)
/*
* "self" is the set of chars that should be returned as single-character
@@ -783,11 +782,6 @@ cppline {space}*#([^i][A-Za-z]*|{if}|{ifdef}|{ifndef}|{import})((\/\*[^*/]*\*+
return GREATER_EQUALS;
}
-{less_greater} {
- /* We accept both "<>" and "!=" as meaning NOT_EQUALS */
- return NOT_EQUALS;
- }
-
{not_equals} {
/* We accept both "<>" and "!=" as meaning NOT_EQUALS */
return NOT_EQUALS;
--
2.32.0
"=?UTF-8?B?5a2Z6K+X5rWpKOaAneaJjSk=?=" <sunshihao.ssh@alibaba-inc.com> writes:
we can use regular expressions (<>|!=) to cover "<>" and "!=". There is no
need to have two definitions less_greater and not_equals, because it will confuse developer.
So, we can use only not_equals to cover this operator set.
I do not find this an improvement. Yeah, it's a bit shorter, but it's
less clear; not least because the comment explaining the <>-means-!=
behavior is no longer anywhere near the code that implements that
behavior. It would also get in the way if we ever had reason to treat <>
and != as something other than exact equivalents.
regards, tom lane
On Tue, Aug 10, 2021 at 8:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"=?UTF-8?B?5a2Z6K+X5rWpKOaAneaJjSk=?=" <sunshihao.ssh@alibaba-inc.com>
writes:we can use regular expressions (<>|!=) to cover "<>" and "!=".
There is no
need to have two definitions less_greater and not_equals, because it
will confuse developer.
So, we can use only not_equals to cover this operator set.
I do not find this an improvement.
Agreed, though mostly on a separation of responsibilities principle.
Labelling the distinctive tokens and then assigning them meaning are two
different things.
Yeah, it's a bit shorter, but it's
less clear;
not least because the comment explaining the <>-means-!=
behavior is no longer anywhere near the code that implements that
behavior.
The patch should just remove the comment for not_equals as well: if both
tokens are given the same name they would have to be treated identically.
The fact they have the same name is clearer in that the equivalency
knowledge is immediate and impossible to miss (if one doesn't go and check
how "less_greater" and "not_equal" are interpreted.)
It would also get in the way if we ever had reason to treat <>
and != as something other than exact equivalents.
This is unconvincing both on the odds of being able to treat them
differently and the fact that the degree of effort to overcome this
obstacle seems minimal - about the same amount as reverting this patch, not
accounting for the coding of the new behavior.
In short, for me, is a principled separation of concerns worth the slight
loss is clarity? I'll stick to my status quo choice, but not strongly.
Some more context as to exactly what kind of confusion this setup is
causing could help tip the scale.
David J.
On 8/10/21 11:27 PM, 孙诗浩(思才) wrote:
we can use regular expressions (<>|!=) to cover "<>" and "!=".
There is no
need to have two definitions less_greater and not_equals, because it
will confuse developer.
So, we can use only not_equals to cover this operator set.
I don't understand the problem being solved here. This looks like a
change with no discernable benefit. I don't agree that the current code
is substantially more likely to confuse developers. And the lexer is not
something that gets a lot of change.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
On 8/10/21 11:27 PM, 孙诗浩(思才) wrote:
we can use regular expressions (<>|!=) to cover "<>" and "!=".
I don't understand the problem being solved here. This looks like a
change with no discernable benefit. I don't agree that the current code
is substantially more likely to confuse developers. And the lexer is not
something that gets a lot of change.
By my count, the response to this has been two definite "no"s, one
leaning-to-no, and nobody speaking in favor. Nor has the submitter
provided a stronger argument for it. I'm going to mark this
rejected in the CF app.
regards, tom lane