lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Started by Alexander Farberover 9 years ago10 messagesgeneral
Jump to latest
#1Alexander Farber
alexander.farber@gmail.com

Hello fellow PostgreSQL users,

does anybody else observe the problem, that calling lower() method on UTF8
cyrillic strings works on Mac and Linux for version 9.5.3, but fails on
Windows 7 / 64 bit (I am using the unzippable version w/o installer)?

I am probably not providing enough information here... not sure what else
to write.

Here is my code (trying to lowercase a string before saving it in 2-dim.
array):

FOR _tile IN SELECT * FROM JSONB_ARRAY_ELEMENTS(in_tiles)
LOOP
_letter := _tile->>'letter';
_value := (_tile->>'value')::int;
_col := (_tile->>'col')::int + 1;
_row := (_tile->>'row')::int + 1;

RAISE NOTICE 'tile = %', _tile;

IF NOT words_valid_tile(_letter,
_value,
_col,
_row) THEN
RAISE EXCEPTION 'Invalid tile = %', _tile;
END IF;

IF (_letters[_col][_row] IS NOT NULL) THEN
RAISE EXCEPTION 'Cell already occupied %', _tile;
END IF;

_letters[_col][_row] := lower(_letter);
RAISE NOTICE 'letter = %', _letters[_col][_row]; -- STILL
UPPERCASE
_values[_col][_row] := _value;

END LOOP;

I wonder if there is a workaround for this problem on Windows

Thank you
Alex

#2Alexander Farber
alexander.farber@gmail.com
In reply to: Alexander Farber (#1)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Here the Windows-log excerpt (the 5 cyrillic letters stay uppercased):

LOG: database system is ready to accept connections
LOG: autovacuum launcher started
LOG: statement: SET client_encoding = 'UTF8';

LOG: execute <unnamed>: SELECT out_gid AS gid FROM words_play_game($1, $2,
$3::jsonb)
DETAIL: parameters: $1 = '1', $2 = '3', $3 =
'[{"col":7,"letter":"П","row":11,"value":2},{"col":7,"letter":"И","row":10,"value":1},{"col":7,"letter":"Л","row":9,"value":2},{"col":7,"letter":"О","row":8,"value":1},{"col":7,"letter":"П","row":7,"value":2}]'
NOTICE: tile = {"col": 7, "row": 11, "value": 2, "letter": "П"}
NOTICE: letter = П
NOTICE: tile = {"col": 7, "row": 10, "value": 1, "letter": "И"}
NOTICE: letter = И
NOTICE: tile = {"col": 7, "row": 9, "value": 2, "letter": "Л"}
NOTICE: letter = Л
NOTICE: tile = {"col": 7, "row": 8, "value": 1, "letter": "О"}
NOTICE: letter = О
NOTICE: tile = {"col": 7, "row": 7, "value": 2, "letter": "П"}
NOTICE: letter = П

#3Alexander Farber
alexander.farber@gmail.com
In reply to: Alexander Farber (#2)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

More info:

# \l+

Name | Owner | Encoding | Collate | Ctype |
-----------+---------+----------+---------+-------+
postgres | user1 | UTF8 | C | C |
template0 | user1 | UTF8 | C | C |
| | | | |
template1 | user1 | UTF8 | C | C |
| | | | |
words | user1 | UTF8 | C | C |

#4Charles Clavadetscher
clavadetscher@swisspug.org
In reply to: Alexander Farber (#3)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Hello

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
Sent: Montag, 8. August 2016 09:10
To: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

More info:

# \l+

Name | Owner | Encoding | Collate | Ctype |
-----------+---------+----------+---------+-------+
postgres | user1 | UTF8 | C | C |
template0 | user1 | UTF8 | C | C |
| | | | |
template1 | user1 | UTF8 | C | C |
| | | | |
words | user1 | UTF8 | C | C |

I cannot test on 9.5, which you are using, but I was able to reproduce the problem on 9.3:

kofadmin@kofdb.localhost=> CREATE DATABASE test TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'C' LC_CTYPE 'C';
CREATE DATABASE
kofadmin@kofdb.localhost=> \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+----------------------------+----------------------------+-----------------------
kofdb | kofadmin | UTF8 | English_United States.1252 | English_United States.1252 |
test | kofadmin | UTF8 | C | C |

kofadmin@kofdb.localhost=> \c test
psql (9.6devel, server 9.3.6)
You are now connected to database "test" as user "kofadmin".
kofadmin@test.localhost=> select lower('P'); <-- This works (latin letters)
lower
-------
p
(1 row)

kofadmin@test.localhost=> select lower('И'); <-- This does not work (cyrillic)
lower
-------
И
(1 row)

kofadmin@test.localhost=> \c kofdb
psql (9.6devel, server 9.3.6)
You are now connected to database "kofdb" as user "kofadmin".
kofadmin@kofdb.localhost=> select lower('И'); <-- This works on a DB with another collation and ctype
lower
-------
и
(1 row)

It seems to be a problem with collation and or ctype.

What are the settings of the database on your Linux system where all works correct?

Bye
Charles

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Charles Clavadetscher
clavadetscher@swisspug.org
In reply to: Charles Clavadetscher (#4)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Hello

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Charles
Clavadetscher
Sent: Montag, 8. August 2016 09:30
To: 'Alexander Farber' <alexander.farber@gmail.com>; 'pgsql-general' <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Hello

From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander
Farber
Sent: Montag, 8. August 2016 09:10
To: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows,
but works on Mac, Linux

More info:

# \l+

Name | Owner | Encoding | Collate | Ctype |
-----------+---------+----------+---------+-------+
postgres | user1 | UTF8 | C | C |
template0 | user1 | UTF8 | C | C |
| | | | |
template1 | user1 | UTF8 | C | C |
| | | | |
words | user1 | UTF8 | C | C |

I cannot test on 9.5, which you are using, but I was able to reproduce the problem on 9.3:

kofadmin@kofdb.localhost=> CREATE DATABASE test TEMPLATE template0 ENCODING 'UTF8' LC_COLLATE 'C' LC_CTYPE 'C';
CREATE DATABASE kofadmin@kofdb.localhost=> \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
-----------+----------+----------+----------------------------+----------------------------+-----------------------
kofdb | kofadmin | UTF8 | English_United States.1252 | English_United States.1252 |
test | kofadmin | UTF8 | C | C |

kofadmin@kofdb.localhost=> \c test
psql (9.6devel, server 9.3.6)
You are now connected to database "test" as user "kofadmin".
kofadmin@test.localhost=> select lower('P'); <-- This works (latin letters) lower
-------
p
(1 row)

kofadmin@test.localhost=> select lower('И'); <-- This does not work (cyrillic) lower
-------
И
(1 row)

kofadmin@test.localhost=> \c kofdb
psql (9.6devel, server 9.3.6)
You are now connected to database "kofdb" as user "kofadmin".
kofadmin@kofdb.localhost=> select lower('И'); <-- This works on a DB with another collation and ctype lower
-------
и
(1 row)

It seems to be a problem with collation and or ctype.

What are the settings of the database on your Linux system where all works correct?

A possible workaround:

kofadmin@test.localhost=> CREATE COLLATION "en_US" (LOCALE = 'English_United States.1252');
CREATE COLLATION

kofadmin@test.localhost=> select * from pg_collation;
collname | collnamespace | collowner | collencoding | collcollate | collctype
----------+---------------+-----------+--------------+----------------------------+----------------------------
default | 11 | 10 | -1 | |
C | 11 | 10 | -1 | C | C
POSIX | 11 | 10 | -1 | POSIX | POSIX
en_US | 2200 | 16394 | 6 | English_United States.1252 | English_United States.1252
(4 rows)

kofadmin@test.localhost=> with x as (select 'И'::text collate "en_US" as letter) select lower(letter) from x ;
lower
-------
и
(1 row)

Sure, bit uncomfortable.
Bye
Charles

Bye
Charles

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Alexander Farber
alexander.farber@gmail.com
In reply to: Charles Clavadetscher (#5)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Thank you for the replies.

On CentOS 7 Linux with pgdg 9.5.3 the lower() method works and I have:

# \l
List of databases
Name | Owner | Encoding | Collate | Ctype |
-----------+----------+----------+-------------+-------------+
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
| | | | |
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
| | | | |
words | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |

#7Charles Clavadetscher
clavadetscher@swisspug.org
In reply to: Alexander Farber (#6)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Hello

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
Sent: Montag, 8. August 2016 10:12
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Thank you for the replies.
On CentOS 7 Linux with pgdg 9.5.3 the lower() method works and I have:

# \l
List of databases
Name | Owner | Encoding | Collate | Ctype |
-----------+----------+----------+-------------+-------------+
postgres | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
template0 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
| | | | |
template1 | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |
| | | | |
words | postgres | UTF8 | en_US.UTF-8 | en_US.UTF-8 |

Well yes, anogher collation and ctype.

I found an easier way to use the workaround after create collation:

kofadmin@test.localhost=> select lower(('И'::text collate "en_US")) ;
lower
-------
и
(1 row)

Maybe other more expert than me on this topic will suggest better solution.
Bye
Charles

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Alexander Farber
alexander.farber@gmail.com
In reply to: Charles Clavadetscher (#7)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Hello Charles, unfortunately on Windows 7 this fails:

psql (9.5.3)
Type "help" for help.

# select lower(('И'::text collate "en_US")) ;
ERROR: collation "en_US" for encoding "UTF8" does not exist
LINE 1: select lower(('?'::text collate "en_US")) ;
^
By the way I the following code works well for me on all 3 platforms:

CREATE TABLE words_verbs (
word varchar(255) PRIMARY KEY CHECK (
word ~ '^[А-Я]{2,}$' AND
word !~ '[ЖШ]Ы' AND
word !~ '[ЧЩ]Я' AND
word !~ 'Ц[ЮЯ]' AND
(word ~ '[ТЧ]ЬСЯ$' OR
word ~ '[ТЧ]Ь$' OR
word ~ 'ТИ$')),
hashed varchar(32) NOT NULL
);

but I understand that it is probably different methods on the lower layer
(pcre instead of some collating functions?)....

Regards
Alex

#9Charles Clavadetscher
clavadetscher@swisspug.org
In reply to: Alexander Farber (#8)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Hello Alexander

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
Sent: Montag, 8. August 2016 10:21
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Hello Charles, unfortunately on Windows 7 this fails:

psql (9.5.3)
Type "help" for help.

# select lower(('И'::text collate "en_US")) ;
ERROR: collation "en_US" for encoding "UTF8" does not exist LINE 1: select lower(('?'::text collate "en_US")) ;

I assume that you did not create the collation yet as I mentioned in a previous mail.

kofadmin@test.localhost=> CREATE COLLATION "en_US" (LOCALE = 'English_United States.1252');
CREATE COLLATION

Which locale can be created depends on those available on your OS.

Bye
Charles

By the way I the following code works well for me on all 3 platforms:

CREATE TABLE words_verbs (
word varchar(255) PRIMARY KEY CHECK (
word ~ '^[А-Я]{2,}$' AND
word !~ '[ЖШ]Ы' AND
word !~ '[ЧЩ]Я' AND
word !~ 'Ц[ЮЯ]' AND
(word ~ '[ТЧ]ЬСЯ$' OR
word ~ '[ТЧ]Ь$' OR
word ~ 'ТИ$')),
hashed varchar(32) NOT NULL
);

but I understand that it is probably different methods on the lower layer (pcre instead of some collating
functions?)....

Regards

Alex

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Farber (#3)
Re: lower() silently fails for 9.5.3 on Windows, but works on Mac, Linux

Alexander Farber <alexander.farber@gmail.com> writes:

More info:
# \l+

Name | Owner | Encoding | Collate | Ctype |
-----------+---------+----------+---------+-------+
postgres | user1 | UTF8 | C | C |
template0 | user1 | UTF8 | C | C |
| | | | |
template1 | user1 | UTF8 | C | C |
| | | | |
words | user1 | UTF8 | C | C |

Well, there's your problem: in C locale, only the standard ASCII letters
will be transformed by upper/lower.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general