SQL92

Started by Jackson, DeJuanover 27 years ago16 messages
#1Jackson, DeJuan
djackson@cpsgroup.com

Does the SQL92 standard have to be purchased from the ANSI Board?
Well, my actual question has to do with LIKE. Could anyone who has a
copy of a standard send me the PATTERN syntax/meanings for LIKE in
SQL92?
Thanks,
-DEJ

#2Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Jackson, DeJuan (#1)
Re: [HACKERS] SQL92

Does the SQL92 standard have to be purchased from the ANSI Board?

afaik, yes.

Well, my actual question has to do with LIKE. Could anyone who has a
copy of a standard send me the PATTERN syntax/meanings for LIKE in
SQL92?

Well, you may be hoping for more than is in any of your reference books,
but there isn't any more to tell :)

% ==> match any (sub)string
_ ==> match any single character
everything else matches itself

In SQL92 (but not yet in Postgres; can't remember if there is a
workaround):

LIKE 'pattern' ESCAPE 'char'

allows a single character "char" if it precedes the two pattern matching
characters to demote the "%" or "_" to act like a normal single
character. The Postgres regex stuff is much more powerful.

Sorry, I think that's it :(

- Tom

#3Jackson, DeJuan
djackson@cpsgroup.com
In reply to: Thomas G. Lockhart (#2)
RE: [HACKERS] SQL92

Does the SQL92 standard have to be purchased from the ANSI Board?

afaik, yes.

Well, my actual question has to do with LIKE. Could anyone who has

a

copy of a standard send me the PATTERN syntax/meanings for LIKE in
SQL92?

Well, you may be hoping for more than is in any of your reference
books,
but there isn't any more to tell :)

% ==> match any (sub)string
_ ==> match any single character
everything else matches itself

In SQL92 (but not yet in Postgres; can't remember if there is a
workaround):

LIKE 'pattern' ESCAPE 'char'

allows a single character "char" if it precedes the two pattern
matching
characters to demote the "%" or "_" to act like a normal single
character. The Postgres regex stuff is much more powerful.

Sorry, I think that's it :(

- Tom

Well, In all of the major Databases that I have worked with there is
also the "[character_set]" matching operator.
So a pattern of '[A-D]%' would match all word beginning with "A", "B",
"C", or "D", and "[^character_set]" matches everything but the set.
Looking at the current PgSQL like code this is ignored as well as the
ESCAPE syntax. I'm trying to resolve the problem we have of LIKE not
matching the pattern "%%" to anything even "%".
But, looking at the code I see that "%%" or "%%%%%%%%" should be equal
to "%" so it's not the LIKE matching code that's causing the problem
(also, read I'm looking in the wrong spot).

Does anybody know if the "[character_set]" stuff is part of the
standard?
Can anyone point me to another place where the pattern could/would get
munged?
Also, wouldn't doing a memmove from the DATA portion of a varlena struct
cause it problems if it was accessed again?
Thanks again,
-DEJ

#4Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Jackson, DeJuan (#3)
Re: [HACKERS] SQL92

Well, In all of the major Databases that I have worked with there is
also the "[character_set]" matching operator.
Does anybody know if the "[character_set]" stuff is part of the
standard?

Hoping against hope, eh? afaik those are all extensions (though SQL3 has
some enhanced functionality with the SIMILAR operator). From my copy of
the second draft standard published in 1992:

<like predicate> uses the triadic operator LIKE (or the inverse,
NOT LIKE), operating on three character strings and returning
a Boolean. LIKE determines whether or not a character string
"matches" a given "pattern" (also a character string). The char-
acters '%' (percent) and '_' (underscore) have special meaning when
they occur in the pattern. The optional third argument is a charac-
ter string containing exactly one character, known as the "escape
character", for use when a percent or underscore is required in the
pattern without its special meaning.

What version of Postgres are you running? I vaguely recall some
complaints and fixes in the sort-of-recent past. Don't know if it
affected simple patterns or only more complicated stuff.

Can you send a simple

SELECT text 'your string here' LIKE 'your pattern here';

which doesn't behave as you would expect? That would make this a bit
less theoretical...

- Tom

#5Jose' Soares
jose@sferacarta.com
In reply to: Jackson, DeJuan (#3)
Re: [HACKERS] SQL92

Jackson, DeJuan wrote:

Does the SQL92 standard have to be purchased from the ANSI Board?

afaik, yes.

Well, my actual question has to do with LIKE. Could anyone who has

a

copy of a standard send me the PATTERN syntax/meanings for LIKE in
SQL92?

Well, you may be hoping for more than is in any of your reference
books,
but there isn't any more to tell :)

% ==> match any (sub)string
_ ==> match any single character
everything else matches itself

In SQL92 (but not yet in Postgres; can't remember if there is a
workaround):

LIKE 'pattern' ESCAPE 'char'

allows a single character "char" if it precedes the two pattern
matching
characters to demote the "%" or "_" to act like a normal single
character. The Postgres regex stuff is much more powerful.

Sorry, I think that's it :(

- Tom

Well, In all of the major Databases that I have worked with there is
also the "[character_set]" matching operator.
So a pattern of '[A-D]%' would match all word beginning with "A", "B",
"C", or "D", and "[^character_set]" matches everything but the set.
Looking at the current PgSQL like code this is ignored as well as the
ESCAPE syntax. I'm trying to resolve the problem we have of LIKE not
matching the pattern "%%" to anything even "%".
But, looking at the code I see that "%%" or "%%%%%%%%" should be equal
to "%" so it's not the LIKE matching code that's causing the problem
(also, read I'm looking in the wrong spot).

Does anybody know if the "[character_set]" stuff is part of the
standard?

The SQL92 LIKE:

character-string-expression [NOT] LIKE pattern [ESCAPE
'escape-char']

allows only the special characters % and _

You need The SQL3 SIMILAR

character-string-expression [NOT] SIMILAR TO pattern [ESCAPE
'escape-char']

in this case pattern can involve additional special characters, not just
% and _ as in LIKE, but every regular expression or "[character_set]"
like you said. (Refer to "A Guide to SQL Standard 4th edition
Date-Rarwen, page 505).

Jose'

#6Jackson, DeJuan
djackson@cpsgroup.com
In reply to: Jose' Soares (#5)
RE: [HACKERS] SQL92

Well, In all of the major Databases that I have worked with there is
also the "[character_set]" matching operator.
Does anybody know if the "[character_set]" stuff is part of the
standard?

Hoping against hope, eh? afaik those are all extensions (though SQL3
has
some enhanced functionality with the SIMILAR operator). From my copy
of
the second draft standard published in 1992:

<like predicate> uses the triadic operator LIKE (or the inverse,
NOT LIKE), operating on three character strings and returning
a Boolean. LIKE determines whether or not a character string
"matches" a given "pattern" (also a character string). The char-
acters '%' (percent) and '_' (underscore) have special meaning when
they occur in the pattern. The optional third argument is a charac-
ter string containing exactly one character, known as the "escape
character", for use when a percent or underscore is required in the
pattern without its special meaning.

What version of Postgres are you running? I vaguely recall some
complaints and fixes in the sort-of-recent past. Don't know if it
affected simple patterns or only more complicated stuff.

Can you send a simple

SELECT text 'your string here' LIKE 'your pattern here';

which doesn't behave as you would expect? That would make this a bit
less theoretical...

- Tom

Well, this all started with this message:

Since I didn't get any feedback from the list about it I decided to jump
into the code myself.
Upon finding the like.c I looked through the code and found that
"[character_set]"'s were ignored and that the problem expressed in the
above message was not evident in the code. (btw this is the CVS version
of 6.4; I cvs update daily.)
So, I asked weather the "[character_set]" stuff was in the standard,
which it is not.
And I asked if I should be looking somewhere else for munging of "%%".
(I'm currently looking at src/backend/utils/adt/like.c)

So, that's where I stand.
-DEJ

#7Jackson, DeJuan
djackson@cpsgroup.com
In reply to: Jackson, DeJuan (#6)
RE: [HACKERS] SQL92

Jackson, DeJuan wrote:

Does the SQL92 standard have to be purchased from the ANSI

Board?

afaik, yes.

Well, my actual question has to do with LIKE. Could anyone who

has

a

copy of a standard send me the PATTERN syntax/meanings for LIKE

in

SQL92?

Well, you may be hoping for more than is in any of your reference
books,
but there isn't any more to tell :)

% ==> match any (sub)string
_ ==> match any single character
everything else matches itself

In SQL92 (but not yet in Postgres; can't remember if there is a
workaround):

LIKE 'pattern' ESCAPE 'char'

allows a single character "char" if it precedes the two pattern
matching
characters to demote the "%" or "_" to act like a normal single
character. The Postgres regex stuff is much more powerful.

Sorry, I think that's it :(

- Tom

Well, In all of the major Databases that I have worked with there is
also the "[character_set]" matching operator.
So a pattern of '[A-D]%' would match all word beginning with "A",

"B",

"C", or "D", and "[^character_set]" matches everything but the set.
Looking at the current PgSQL like code this is ignored as well as

the

ESCAPE syntax. I'm trying to resolve the problem we have of LIKE

not

matching the pattern "%%" to anything even "%".
But, looking at the code I see that "%%" or "%%%%%%%%" should be

equal

to "%" so it's not the LIKE matching code that's causing the problem
(also, read I'm looking in the wrong spot).

Does anybody know if the "[character_set]" stuff is part of the
standard?

The SQL92 LIKE:

character-string-expression [NOT] LIKE pattern [ESCAPE
'escape-char']

allows only the special characters % and _

You need The SQL3 SIMILAR

character-string-expression [NOT] SIMILAR TO pattern [ESCAPE
'escape-char']

in this case pattern can involve additional special characters, not
just
% and _ as in LIKE, but every regular expression or "[character_set]"
like you said. (Refer to "A Guide to SQL Standard 4th edition
Date-Rarwen, page 505).

Jose'

Well, I'm not looking for a new operator just trying to clarify what the
scope of the current one should be. But, thanks for the information,
Jose'.
If you have any clue where in the source-tree LIKE is implemented could
you let me know?
-DEJ

#8Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Jackson, DeJuan (#3)
Re: [HACKERS] SQL92

Does the SQL92 standard have to be purchased from the ANSI Board?

afaik, yes.

Well, my actual question has to do with LIKE. Could anyone who has

a

copy of a standard send me the PATTERN syntax/meanings for LIKE in
SQL92?

Well, you may be hoping for more than is in any of your reference
books,
but there isn't any more to tell :)

% ==> match any (sub)string
_ ==> match any single character
everything else matches itself

In SQL92 (but not yet in Postgres; can't remember if there is a
workaround):

LIKE 'pattern' ESCAPE 'char'

allows a single character "char" if it precedes the two pattern
matching
characters to demote the "%" or "_" to act like a normal single
character. The Postgres regex stuff is much more powerful.

Sorry, I think that's it :(

- Tom

Well, In all of the major Databases that I have worked with there is
also the "[character_set]" matching operator.
So a pattern of '[A-D]%' would match all word beginning with "A", "B",
"C", or "D", and "[^character_set]" matches everything but the set.
Looking at the current PgSQL like code this is ignored as well as the
ESCAPE syntax. I'm trying to resolve the problem we have of LIKE not
matching the pattern "%%" to anything even "%".
But, looking at the code I see that "%%" or "%%%%%%%%" should be equal
to "%" so it's not the LIKE matching code that's causing the problem
(also, read I'm looking in the wrong spot).

Does anybody know if the "[character_set]" stuff is part of the
standard?
Can anyone point me to another place where the pattern could/would get
munged?
Also, wouldn't doing a memmove from the DATA portion of a varlena struct
cause it problems if it was accessed again?
Thanks again,

Added to TODO list:

* have LIKE support character classes, 'pg_[a-c]%'

I did not realize we didn't have that.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#9Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Jose' Soares (#5)
Re: [HACKERS] SQL92

The SQL92 LIKE:

character-string-expression [NOT] LIKE pattern [ESCAPE
'escape-char']

allows only the special characters % and _

You need The SQL3 SIMILAR

character-string-expression [NOT] SIMILAR TO pattern [ESCAPE
'escape-char']

in this case pattern can involve additional special characters, not just
% and _ as in LIKE, but every regular expression or "[character_set]"
like you said. (Refer to "A Guide to SQL Standard 4th edition
Date-Rarwen, page 505).

Jose'

TODO item changed to:

* add SIMILAR to allow character classes, 'pg_[a-c]%'

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Bruce Momjian (#9)
Re: [HACKERS] SQL92

You need The SQL3 SIMILAR
in this case pattern can involve additional special characters, not
just % and _ as in LIKE,

TODO item changed to:
* add SIMILAR to allow character classes, 'pg_[a-c]%'

The annoying thing is that our regex "~" operators do all of this and
more, but presumably don't have the right behavior for underscore and
for percent. Should we look at the regex code and try to get it to be
compatible (for v6.5...)?

- Tom

#11Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Thomas G. Lockhart (#10)
Re: [HACKERS] SQL92

You need The SQL3 SIMILAR
in this case pattern can involve additional special characters, not
just % and _ as in LIKE,

TODO item changed to:
* add SIMILAR to allow character classes, 'pg_[a-c]%'

The annoying thing is that our regex "~" operators do all of this and
more, but presumably don't have the right behavior for underscore and
for percent. Should we look at the regex code and try to get it to be
compatible (for v6.5...)?

Not sure.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#12Jackson, DeJuan
djackson@cpsgroup.com
In reply to: Bruce Momjian (#11)
RE: [HACKERS] SQL92

You need The SQL3 SIMILAR
in this case pattern can involve additional special characters,

not

just % and _ as in LIKE,

TODO item changed to:
* add SIMILAR to allow character classes, 'pg_[a-c]%'

The annoying thing is that our regex "~" operators do all of this

and

more, but presumably don't have the right behavior for underscore

and

for percent. Should we look at the regex code and try to get it to

be

compatible (for v6.5...)?

Not sure.

I thought that the regex code was already being revamped by the
originator.
-DEJ

#13Jackson, DeJuan
djackson@cpsgroup.com
In reply to: Jackson, DeJuan (#12)
RE: [HACKERS] SQL92

The SQL92 LIKE:

character-string-expression [NOT] LIKE pattern [ESCAPE
'escape-char']

allows only the special characters % and _

You need The SQL3 SIMILAR

character-string-expression [NOT] SIMILAR TO pattern [ESCAPE
'escape-char']

in this case pattern can involve additional special characters, not

just

% and _ as in LIKE, but every regular expression or

"[character_set]"

like you said. (Refer to "A Guide to SQL Standard 4th edition
Date-Rarwen, page 505).

Jose'

TODO item changed to:

* add SIMILAR to allow character classes, 'pg_[a-c]%'

So are we going to add the character class support to LIKE or SIMILAR or
both? Just looking for clarification. Or are we going to use the regex
code for both?
-DEJ

#14Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Jackson, DeJuan (#12)
Re: [HACKERS] SQL92

You need The SQL3 SIMILAR
in this case pattern can involve additional special characters,

not

just % and _ as in LIKE,

TODO item changed to:
* add SIMILAR to allow character classes, 'pg_[a-c]%'

The annoying thing is that our regex "~" operators do all of this

and

more, but presumably don't have the right behavior for underscore

and

for percent. Should we look at the regex code and try to get it to

be

compatible (for v6.5...)?

Not sure.

I thought that the regex code was already being revamped by the
originator.

He is speeding it up, but had no release date yet.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#15Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Jackson, DeJuan (#13)
Re: [HACKERS] SQL92

The SQL92 LIKE:

character-string-expression [NOT] LIKE pattern [ESCAPE
'escape-char']

allows only the special characters % and _

You need The SQL3 SIMILAR

character-string-expression [NOT] SIMILAR TO pattern [ESCAPE
'escape-char']

in this case pattern can involve additional special characters, not

just

% and _ as in LIKE, but every regular expression or

"[character_set]"

like you said. (Refer to "A Guide to SQL Standard 4th edition
Date-Rarwen, page 505).

Jose'

TODO item changed to:

* add SIMILAR to allow character classes, 'pg_[a-c]%'

So are we going to add the character class support to LIKE or SIMILAR or
both? Just looking for clarification. Or are we going to use the regex
code for both?

No idea. It is just a valid TODO item.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#16Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Jackson, DeJuan (#13)
Re: [HACKERS] SQL92

* add SIMILAR to allow character classes, 'pg_[a-c]%'

So are we going to add the character class support to LIKE or SIMILAR
or both? Just looking for clarification. Or are we going to use the
regex code for both?

I would expect us to stay compatible with the standard for LIKE, and
perhaps have a superset of the SQL3 (draft) standard for SIMILAR. I'd
expect it to be a candidate for v6.5, or perhaps for v6.4.x.

But I'm not doing the work, and whoever does has a much stronger vote :)

- Tom