Differences in Unicode handling on Mac vs Linux?

Started by Matt Daw almost 13 years ago (4 messages, general)
#1 Matt Daw
matt@shotgunsoftware.com

Howdy, I loaded a client's DB on my Mac to debug an unrelated bug, but
I'm blocked because my Mac is rejecting SQL that works on our Linux
production servers. Here's a simple case:

# select * from shots where sg_poznÁmka is NULL;
ERROR: column "sg_pozn�mka" does not exist
LINE 1: select * from shots where sg_poznÁmka is NULL;

... as far as I can tell, all my encodings are consistent on both
sides, I've checked LC_COLLATE, LC_CTYPE, client_encoding,
server_encoding and the database encodings. I'm running 9.0.13 on both
machines (and I tried 9.2.4 on my Mac).

Anything else I could double-check? Or are there any known Mac-related
Unicode issues?

Thanks!

Matt

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matt Daw (#1)
Re: Differences in Unicode handling on Mac vs Linux?

Matt Daw <matt@shotgunsoftware.com> writes:

> Howdy, I loaded a client's DB on my Mac to debug an unrelated bug, but
> I'm blocked because my Mac is rejecting SQL that works on our Linux
> production servers. Here's a simple case:
>
> # select * from shots where sg_poznÁmka is NULL;
> ERROR: column "sg_pozn�mka" does not exist
> LINE 1: select * from shots where sg_poznÁmka is NULL;

Hm ... what does "\d shots" say about the spelling of the column name?

> Anything else I could double-check? Or are there any known Mac-related
> Unicode issues?

OS X's Unicode locales are pretty crummy. I'm suspicious that there's
some sort of case-folding inconsistency here, but it's hard to say more
(especially since you didn't actually tell us *which* locales you've
selected on each machine). If it is that, as a short-term fix it might
help to double-quote the column name.

regards, tom lane


#3 Matt Daw
matt@shotgunsoftware.com
In reply to: Tom Lane (#2)
Re: Differences in Unicode handling on Mac vs Linux?

> Hm ... what does "\d shots" say about the spelling of the column name?

\d shots is the same on both systems:

sg_poznÁmka | text |

> OS X's Unicode locales are pretty crummy. I'm suspicious that there's
> some sort of case-folding inconsistency here, but it's hard to say more
> (especially since you didn't actually tell us *which* locales you've
> selected on each machine). If it is that, as a short-term fix it might
> help to double-quote the column name.

The locales are set to "en_US.UTF-8" and the encodings to "UTF8".
Double-quoting does solve the column case, but it doesn't help with
this Rails-generated query:

SELECT a.attname, format_type(a.atttypid, a.atttypmod), d.adsrc, a.attnotnull
FROM pg_attribute a LEFT JOIN pg_attrdef d
ON a.attrelid = d.adrelid AND a.attnum = d.adnum
WHERE a.attrelid =
'asset_sg_kdo_dělá____assigned_to__connections'::regclass
AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum

... that produces:

ERROR: relation "asset_sg_kdo_d�l�____assigned_to__connections" does not exist

\d produces:

public | asset_sg_kdo_dělá____assigned_to__connections | table | matt
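
For hand-written queries, one thing that might be worth trying is to
double-quote the name inside the string literal as well, on the assumption
that the ::regclass cast folds case the same way an unquoted identifier
does. The sketch below only builds the strings; quote_ident and
regclass_literal are hypothetical helper names, and this does nothing for
SQL that Rails generates on its own.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def regclass_literal(name: str) -> str:
    """Build a string literal holding a double-quoted name, cast to regclass.

    Assumption: a double-quoted name inside the literal is not case-folded.
    """
    quoted = quote_ident(name)
    # Single quotes inside a SQL string literal are escaped by doubling them.
    return "'" + quoted.replace("'", "''") + "'::regclass"


if __name__ == "__main__":
    print(quote_ident("sg_poznÁmka"))
    print(regclass_literal("asset_sg_kdo_dělá____assigned_to__connections"))
```

The second line printed is what would replace the bare
'asset_sg_kdo_dělá____assigned_to__connections'::regclass in the query above.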

For the short term, I think I'll boot up a Linux VM to troubleshoot my
production bug... but I'll submit a bug report with repro steps.

Thanks Tom!

Matt


#4 Ian Lawrence Barwick
barwick@gmail.com
In reply to: Tom Lane (#2)
Re: Differences in Unicode handling on Mac vs Linux?

2013/6/3 Tom Lane <tgl@sss.pgh.pa.us>:

> Matt Daw <matt@shotgunsoftware.com> writes:
>
>> Howdy, I loaded a client's DB on my Mac to debug an unrelated bug, but
>> I'm blocked because my Mac is rejecting SQL that works on our Linux
>> production servers. Here's a simple case:
>>
>> # select * from shots where sg_poznÁmka is NULL;
>> ERROR: column "sg_pozn�mka" does not exist
>> LINE 1: select * from shots where sg_poznÁmka is NULL;
>
> Hm ... what does "\d shots" say about the spelling of the column name?
>
>> Anything else I could double-check? Or are there any known Mac-related
>> Unicode issues?
>
> OS X's Unicode locales are pretty crummy. I'm suspicious that there's
> some sort of case-folding inconsistency here, but it's hard to say more
> (especially since you didn't actually tell us *which* locales you've
> selected on each machine). If it is that, as a short-term fix it might
> help to double-quote the column name.

I can recreate something similar (OS X 10.7, 9.3beta1):

postgres=# CREATE TABLE shots (id int);
CREATE TABLE
postgres=# SHOW client_encoding ;
client_encoding
-----------------
UTF8
(1 row)

postgres=# select * from shots where col_ä is NULL;
ERROR: column "col_�" does not exist
LINE 1: select * from shots where col_ä is NULL;

The corresponding log output is:

ERROR: column "col_<E3><A4>" does not exist at character 27
STATEMENT: select * from shots where col_ä is NULL;
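
The <E3><A4> bytes in that log line are at least consistent with the
case-folding theory: "ä" is C3 A4 in UTF-8, and lowercasing the lead byte
C3 as if it were a single-byte Latin-1 character gives E3. The sketch below
reproduces just that arithmetic. It is a guess at the mechanism, not
something confirmed in this thread, and fold_bytewise_latin1 is a
hypothetical name, not a PostgreSQL function.

```python
def fold_bytewise_latin1(identifier: str) -> bytes:
    """Lowercase each UTF-8 byte as if it were a lone Latin-1 character.

    This mimics a per-byte tolower() that treats high-bit bytes as letters,
    which is an assumption about what the OS X locale does.
    """
    folded = bytearray()
    for b in identifier.encode("utf-8"):
        low = chr(b).lower()  # chr(b) reads the byte as a Latin-1 code point
        if len(low) == 1 and ord(low) < 256:
            folded.append(ord(low))
        else:
            folded.append(b)  # leave bytes with no single-byte lowercase form
    return bytes(folded)


if __name__ == "__main__":
    mangled = fold_bytewise_latin1("col_ä")
    print(mangled.hex(" "))
    # The folded bytes are no longer valid UTF-8, so decoding needs a fallback.
    print(mangled.decode("utf-8", errors="replace"))
```

Under the same assumption, the "Á" (C3 81) in sg_poznÁmka would become
E3 81, which is also not valid UTF-8 and would be displayed as a replacement
character, as in the original error message.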

Double-quoting the column name does seem to "work":

postgres=# select * from shots where "col_ä" is NULL;
ERROR: column "col_ä" does not exist
LINE 1: select * from shots where "col_ä" is NULL;

The only language/locale settings I see in my environment are:

LANG=en_GB.UTF-8
__CF_USER_TEXT_ENCODING=0x1F6:0:2

Regards

Ian Barwick
