BUG #7913: TO_CHAR Function & Turkish collate
The following bug has been logged on the website:
Bug reference: 7913
Logged by: Adnan DURSUN
Email address: a_dursun@hotmail.com
PostgreSQL version: 9.2.0
Operating system: Linux
Description:
prod=# SELECT TO_CHAR('2013-03-01'::date,'DAY');
to_char
----------
FRİDAY
(1 row)
But it should return FRIDAY.
Our database's lc_collate is tr_TR.UTF-8 and the encoding is UTF8.
Best regards,
Adnan DURSUN
Ankara/TURKEY
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
a_dursun@hotmail.com writes:
> prod=# SELECT TO_CHAR('2013-03-01'::date,'DAY');
> to_char
> ----------
> FRİDAY
> (1 row)
> But it should return FRIDAY.
> Our database's lc_collate is tr_TR.UTF-8 and the encoding is UTF8.
It looks like the cause of this is that the result is computed as
str_toupper("Friday"), and str_toupper() applies a collation-sensitive
upcasing rule.
I think the use of str_toupper() is appropriate when processing the
locale-specific string for a TMDAY specification; but plain DAY is not
supposed to be locale-dependent, so we probably should use an ASCII-only
upcasing rule in the non-TM code path.
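To make the difference between the two rules concrete, here is a minimal Python sketch. It is not PostgreSQL's code: turkish_upper and ascii_upper are hypothetical stand-ins for a collation-sensitive upcasing under a Turkish locale and for an ASCII-only upcasing, and the Turkish mapping is hard-coded rather than taken from the system locale.

```python
# Illustration only: Python stand-ins for the two upcasing rules.
# Neither function is PostgreSQL code; both names are hypothetical.

def turkish_upper(text: str) -> str:
    """Upcase using the Turkish rule: i -> İ (U+0130), ı (U+0131) -> I."""
    special = {"i": "\u0130", "\u0131": "I"}
    return "".join(special.get(ch, ch.upper()) for ch in text)


def ascii_upper(text: str) -> str:
    """Upcase only the ASCII letters a-z; leave everything else untouched."""
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch for ch in text
    )


day = "Friday"
print(turkish_upper(day))  # FRİDAY (dotted capital I, as in the report)
print(ascii_upper(day))    # FRIDAY
```

Under the Turkish rule the lowercase "i" in the English day name picks up a dot when upcased, which matches the reported output; the ASCII-only rule gives the same result in every locale.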
Anybody have an opinion on whether to back-patch such a fix? It seems
conceivable that somebody out there is relying on the current behavior.
OTOH, I believe that only Turkish UTF8 locales exhibit this behavior
(the single-byte-encoding code path in str_toupper acts differently for
historical reasons). So it's pretty inconsistent as it stands.
regards, tom lane
On Sun, 2013-03-03 at 10:42 -0500, Tom Lane wrote:
> I think the use of str_toupper() is appropriate when processing the
> locale-specific string for a TMDAY specification; but plain DAY is not
> supposed to be locale-dependent, so we probably should use an
> ASCII-only upcasing rule in the non-TM code path.
Agreed.
> Anybody have an opinion on whether to back-patch such a fix?
I think it's a bug that should be backpatched.
On 03-03-2013 12:42, Tom Lane wrote:
> Anybody have an opinion on whether to back-patch such a fix? It seems
> conceivable that somebody out there is relying on the current behavior.
> OTOH, I believe that only Turkish UTF8 locales exhibit this behavior
> (the single-byte-encoding code path in str_toupper acts differently for
> historical reasons). So it's pretty inconsistent as it stands.
Nope. I wasn't aware of the weird Turkish rules. Mea culpa. :(
As you suggested, s/str_toupper/pg_toupper/ in the else block (no TM) is the
right fix. I'm not aware of another locale that would break if we applied such a
change in a stable branch. Do you want me to post a fix?
--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Euler Taveira <euler@timbira.com> writes:
> As you suggested, s/str_toupper/pg_toupper/ in the else block (no TM) is the
> right fix. I'm not aware of another locale that would break if we applied such a
> change in a stable branch. Do you want me to post a fix?
Thanks, but I have a fix mostly written already.
regards, tom lane
Peter Eisentraut <peter_e@gmx.net> writes:
> On Sun, 2013-03-03 at 10:42 -0500, Tom Lane wrote:
>> Anybody have an opinion on whether to back-patch such a fix?
> I think it's a bug that should be backpatched.
Done. In addition to day/month names, I found that there were
case-folding hazards for timezone abbreviations ('tz' format)
and Roman numerals for numbers ('rn' format) ... though, curiously,
not for Roman numerals for months.
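The same hazard in the downcasing direction can be sketched in Python. This is an assumption-laden illustration, not the PostgreSQL code: turkish_lower and ascii_lower are hypothetical helpers, the Turkish mapping is hard-coded, and the sample values ("XIII" for a Roman numeral, "IST" for a timezone abbreviation) are chosen only because they contain the letter I.

```python
# Illustration only: why lowercase output such as 'rn' or 'tz' is at risk.
# Both helpers are hypothetical; they are not part of PostgreSQL.

def turkish_lower(text: str) -> str:
    """Downcase using the Turkish rule: I -> ı (U+0131), İ (U+0130) -> i."""
    special = {"I": "\u0131", "\u0130": "i"}
    return "".join(special.get(ch, ch.lower()) for ch in text)


def ascii_lower(text: str) -> str:
    """Downcase only the ASCII letters A-Z; leave everything else untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch for ch in text
    )


for value in ("XIII", "IST"):
    # Columns: input, Turkish-rule result, ASCII-only result
    print(value, turkish_lower(value), ascii_lower(value))
```

With the Turkish rule every capital I becomes a dotless ı, so a Roman numeral or abbreviation containing I would no longer be plain ASCII; the ASCII-only rule avoids that.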
regards, tom lane
Hi,
On Tue, 2013-03-05 at 13:08 -0500, Tom Lane wrote:
>> I think it's a bug that should be backpatched.
> Done. In addition to day/month names, I found that there were
> case-folding hazards for timezone abbreviations ('tz' format)
> and Roman numerals for numbers ('rn' format) ... though, curiously,
> not for Roman numerals for months.
Thanks!
Regards,
--
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org Twitter: http://twitter.com/devrimgunduz