Possible locale issue with 7.4

Started by Bruno Wolff III over 22 years ago, 4 messages
#1 Bruno Wolff III
bruno@wolff.to

In 7.4 I am finding that '(' (and some other punctuation) is not a member of
[:print:]. It is in 7.3. It is a member of [:graph:] in 7.4 (which is
supposed to be [:print:] - [:space:]).

The following is my 7.4 config:
./configure --prefix=/usr/local/pgsql --enable-integer-datetimes --with-pgport=5433

For 7.3 I used:
./configure --prefix=/usr/lib/pgsql --exec-prefix=/usr --with-perl --with-openssl --mandir=/usr/man --docdir=/usr/doc --enable-integer-datetimes

The following is an example of the problem:
area=> select version();
version
------------------------------------------------------------------------
PostgreSQL 7.4beta3 on i686-pc-linux-gnu, compiled by GCC egcs-2.91.66
(1 row)

area=> select '(' ~ '[[:print:]]';
?column?
----------
f
(1 row)

area=> select '(' ~ '[[:graph:]]';
?column?
----------
t
(1 row)

area=> select '0' ~ '[[:print:]]';
?column?
----------
t
(1 row)
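For comparison, the same memberships can be checked against the C library's own character classes. This is only a sketch under stated assumptions: it uses the standard `<ctype.h>` predicates in the default "C" locale rather than anything from the PostgreSQL regex code, and the helper names (`in_print`, `in_graph`, `graph_is_print_minus_space`) are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: is byte value c printable according to <ctype.h>? */
bool in_print(int c)
{
    assert(c >= 0 && c <= UCHAR_MAX);
    return isprint(c) != 0;
}

/* Hypothetical helper: is byte value c a graphic character according to <ctype.h>? */
bool in_graph(int c)
{
    assert(c >= 0 && c <= UCHAR_MAX);
    return isgraph(c) != 0;
}

/*
 * Checks, for every byte value, the relation quoted in the report:
 * graph == print minus space. Assumes the program is still in the
 * default "C" locale (no setlocale() call has been made).
 */
bool graph_is_print_minus_space(void)
{
    for (int c = 0; c <= UCHAR_MAX; c++)
    {
        bool expected = isprint(c) && !isspace(c);

        if ((isgraph(c) != 0) != expected)
            return false;
    }
    return true;
}
```

With these predicates '(' is in both classes, which matches the 7.3 behaviour described above rather than the 7.4beta3 result.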

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruno Wolff III (#1)
Re: Possible locale issue with 7.4

Bruno Wolff III <bruno@wolff.to> writes:

> In 7.4 I am finding that '(' (and some other punctuation) is not a member of
> [:print:]. It is in 7.3. It is a member of [:graph:] in 7.4 (which is
> supposed to be [:print:] - [:space:]).

This is not a locale problem, because I see it in C locale too.
[digs] Apparently this is an oversight in the new regex code we
lifted from Tcl 8.4.1:

    switch ((enum classes) index)
    {
        case CC_PRINT:
        case CC_ALNUM:
            cv = getcvec(v, UCHAR_MAX, 1, 0);
            if (cv)
            {
                for (i = 0; i <= UCHAR_MAX; i++)
                {
                    if (pg_isalpha((chr) i))
                        addchr(cv, (chr) i);
                }
                addrange(cv, (chr) '0', (chr) '9');
            }
            break;

in other words, :print: is the same as :alnum:. This is obviously
a bug, will fix ... wonder if Henry Spencer knows about it?

regards, tom lane
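The shared case label is what makes the two classes identical. A minimal sketch of the alternative, where each class gets its own predicate, might look like the following. It is not the fix that was committed: the names `char_class`, `byte_set` and `build_class` are hypothetical, and the standard `<ctype.h>` functions stand in for the regex code's own helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical class identifiers, loosely modelled on the CC_* labels above. */
enum char_class
{
    CLASS_ALNUM,
    CLASS_PRINT
};

/* Hypothetical stand-in for a character vector: one flag per byte value. */
typedef struct
{
    bool member[UCHAR_MAX + 1];
} byte_set;

/*
 * Fill *set with the members of cls. Each class has its own case and
 * its own predicate, so CLASS_PRINT no longer inherits the alnum set.
 */
void build_class(enum char_class cls, byte_set *set)
{
    assert(set != NULL);
    memset(set->member, 0, sizeof set->member);

    for (int c = 0; c <= UCHAR_MAX; c++)
    {
        switch (cls)
        {
            case CLASS_ALNUM:
                set->member[c] = isalnum(c) != 0;
                break;
            case CLASS_PRINT:
                set->member[c] = isprint(c) != 0;
                break;
        }
    }
}
```

Built this way in the "C" locale, '(' lands in the print set but not in the alnum set, while '0' is in both.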

#3 Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Tom Lane (#2)
Re: Possible locale issue with 7.4

On Sun, Sep 28, 2003 at 08:09:31PM -0400, Tom Lane wrote:

> Bruno Wolff III <bruno@wolff.to> writes:
> > In 7.4 I am finding that '(' (and some other punctuation) is not a member of
> > [:print:]. It is in 7.3. It is a member of [:graph:] in 7.4 (which is
> > supposed to be [:print:] - [:space:]).
>
> This is not a locale problem, because I see it in C locale too.
> [digs] Apparently this is an oversight in the new regex code we
> lifted from Tcl 8.4.1:

Here
http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/generic/regc_locale.c?rev=1.10&view=auto

is the Tcl version. It looks very similar (that is, :print: is the
same as :alnum:). Note that the code hasn't changed since
Mon Jul 29 12:27:51 2002 UTC

but carries tags up to version 8.4.4.

Maybe not too many people use :print:?

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"A wizard is never late, Frodo Baggins, nor is he early.
He arrives precisely when he means to." (Gandalf, in LoTR FoTR)

#4 Bruno Wolff III
bruno@wolff.to
In reply to: Tom Lane (#2)
Re: Possible locale issue with 7.4

On Sun, Sep 28, 2003 at 20:09:31 -0400,
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> in other words, :print: is the same as :alnum:. This is obviously
> a bug, will fix ... wonder if Henry Spencer knows about it?

The really cute thing is I only found it because I made a mistake.
I didn't want to include spaces in what I was using it for and really
should have been using [:graph:] instead.