wrong behavior using to_char()

Started by Euler Taveira de Oliveira, over 19 years ago, 10 messages, bugs

Hi,

I noticed strange behavior in the to_char() function. I'm using the pt_BR
locale, but it could happen with any locale.

template1=# select to_char(12345.67, '999G999D999');
to_char
--------------
12,345,670
(1 registro)

In the pt_BR locale, the thousand separator is "". So it should return
12345,670. Looking at the source, I saw that the test cases for locale
properties are independent of one another. I think the correct approach is
either an all-or-nothing test or not testing *lconv->property at all ("" is
evaluated as false). Attached is a patch that fixes it using the second
option.
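
To illustrate the report, here is a minimal sketch of the two strategies, not the actual formatting.c code. The type NumSeps and the functions seps_per_field, seps_keep_empty, seps_collide and format_grouped are hypothetical names, and the built-in defaults "." and "," are assumptions taken from the output shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for the two separators discussed in this thread. */
typedef struct
{
    const char *decimal_point;
    const char *thousands_sep;
} NumSeps;

/* Each property is tested on its own; "" falls back to a built-in default. */
NumSeps
seps_per_field(const char *decimal_point, const char *thousands_sep)
{
    NumSeps s;

    s.decimal_point = (decimal_point && *decimal_point) ? decimal_point : ".";
    s.thousands_sep = (thousands_sep && *thousands_sep) ? thousands_sep : ",";
    return s;
}

/* Second option from the message: test only the pointer, so "" is kept. */
NumSeps
seps_keep_empty(const char *decimal_point, const char *thousands_sep)
{
    NumSeps s;

    s.decimal_point = decimal_point ? decimal_point : ".";
    s.thousands_sep = thousands_sep ? thousands_sep : ",";
    return s;
}

/* True when both separators are the same string, so output is ambiguous. */
int
seps_collide(NumSeps s)
{
    assert(s.decimal_point != NULL && s.thousands_sep != NULL);
    return strcmp(s.decimal_point, s.thousands_sep) == 0;
}

/*
 * Write whole and millis (three decimal digits) with grouping.
 * Assumes whole >= 0, 0 <= millis <= 999, and a large enough buffer.
 */
void
format_grouped(char *out, size_t outlen, long whole, int millis, NumSeps s)
{
    char   digits[32];
    int    len, i;
    size_t pos = 0;

    assert(whole >= 0 && millis >= 0 && millis <= 999);
    len = snprintf(digits, sizeof(digits), "%ld", whole);
    out[0] = '\0';
    for (i = 0; i < len; i++)
    {
        /* separator before each complete group of three, never at the start */
        if (i > 0 && (len - i) % 3 == 0)
            pos += snprintf(out + pos, outlen - pos, "%s", s.thousands_sep);
        pos += snprintf(out + pos, outlen - pos, "%c", digits[i]);
    }
    snprintf(out + pos, outlen - pos, "%s%03d", s.decimal_point, millis);
}
```

With a decimal point of "," and an empty thousands separator, seps_per_field yields two identical separators and format_grouped(buf, sizeof(buf), 12345, 670, ...) produces 12,345,670, while seps_keep_empty produces 12345,670.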

--
Euler Taveira de Oliveira
http://www.timbira.com/

Attachments:

x (application/octet-stream; name=x), +10 -11
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Euler Taveira de Oliveira (#1)
Re: wrong behavior using to_char()

"Euler Taveira de Oliveira" <euler@timbira.com> writes:

In the pt_BR locale, the thousand separator is "". So it should return
12345,670. Looking at the source, I saw that the test cases for locale
properties are independent of one another. I think the correct approach is
either an all-or-nothing test or not testing *lconv->property at all ("" is
evaluated as false). Attached is a patch that fixes it using the second
option.

Not unless you have a solution to the problem seen in this thread:
http://archives.postgresql.org/pgsql-patches/2006-02/msg00172.php

regards, tom lane

#3Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#2)
Re: wrong behavior using to_char()

Tom Lane wrote:

"Euler Taveira de Oliveira" <euler@timbira.com> writes:

In the pt_BR locale, the thousand separator is "". So it should return
12345,670. Looking at the source, I saw that the test cases for locale
properties are independent of one another. I think the correct approach is
either an all-or-nothing test or not testing *lconv->property at all ("" is
evaluated as false). Attached is a patch that fixes it using the second
option.

Not unless you have a solution to the problem seen in this thread:
http://archives.postgresql.org/pgsql-patches/2006-02/msg00172.php

I was going to point him to these commits to formatting.c:

date: 2006/02/12 23:48:23; author: momjian; state: Exp; lines: +3 -4
Revert because C locale uses "" for thousands_sep, meaning "n/a", while
French uses "" for "don't want". Seems we have to keep the existing
behavior.
----------------------------
revision 1.105
date: 2006/02/12 19:52:06; author: momjian; state: Exp; lines: +5 -4
Support "" for thousands separator and plus sign in to_char(), per
report from French Debian user. psql already handles "" fine.

One idea would be to handle C locale behavior differently from non-C
locale.
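
A rough sketch of that idea, assuming the active LC_NUMERIC locale name is available as a string (for example from setlocale(LC_NUMERIC, NULL)). The function name thousands_sep_for is hypothetical, and treating "POSIX" the same as "C" is an assumption of this sketch, not something decided in this thread.

```c
#include <assert.h>
#include <string.h>

/*
 * Pick the thousands separator.
 * In the C locale "" means "n/a", so fall back to the default ",".
 * In any other locale "" means "don't want", so keep it empty.
 */
const char *
thousands_sep_for(const char *locale_name, const char *locale_sep)
{
    int is_c;

    assert(locale_name != NULL);
    is_c = strcmp(locale_name, "C") == 0 || strcmp(locale_name, "POSIX") == 0;

    if (locale_sep == NULL)
        return ",";                 /* nothing reported at all */
    if (*locale_sep == '\0')
        return is_c ? "," : "";     /* the case the commits disagree on */
    return locale_sep;
}
```

Under these assumptions thousands_sep_for("C", "") still gives ",", while thousands_sep_for("fr_FR", "") gives "".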

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#4Jorge Godoy
jgodoy@gmail.com
In reply to: Euler Taveira de Oliveira (#1)
Re: wrong behavior using to_char()

"Euler Taveira de Oliveira" <euler@timbira.com> writes:

In the pt_BR locale, the thousand separator is "". So it should return

The thousands separator in pt_BR is ".".

12345,670. Looking at the source, I saw that the test cases for locale

This should be "12.345,670".
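
One way to check what a given system actually reports is to read the values straight from localeconv(). The sketch below uses only standard C calls; the NumericInfo type and the read_numeric_locale name are hypothetical. Note that it changes the process-wide LC_NUMERIC setting and does not restore it.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical container for a copy of the two locale strings. */
typedef struct
{
    char decimal_point[16];
    char thousands_sep[16];
} NumericInfo;

/* Returns 0 on success, -1 if the named locale is not installed. */
int
read_numeric_locale(const char *name, NumericInfo *info)
{
    struct lconv *lc;

    assert(name != NULL && info != NULL);
    if (setlocale(LC_NUMERIC, name) == NULL)
        return -1;

    lc = localeconv();

    /* Copy, because later setlocale() calls may overwrite the lconv data. */
    strncpy(info->decimal_point, lc->decimal_point,
            sizeof(info->decimal_point) - 1);
    info->decimal_point[sizeof(info->decimal_point) - 1] = '\0';
    strncpy(info->thousands_sep, lc->thousands_sep,
            sizeof(info->thousands_sep) - 1);
    info->thousands_sep[sizeof(info->thousands_sep) - 1] = '\0';
    return 0;
}
```

Calling it with a name such as "pt_BR" or "pt_BR.UTF-8" (whichever is installed) shows whether the system's locale data defines "." as the thousands separator or leaves it empty.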

--
Jorge Godoy <jgodoy@gmail.com>

#5Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#3)
Re: wrong behavior using to_char()

Bruce Momjian wrote:

One idea would be to handle C locale behavior differently from non-C
locale.

Right.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#6Euler Taveira de Oliveira
euler@timbira.com
In reply to: Jorge Godoy (#4)
Re: wrong behavior using to_char()

Jorge Godoy wrote:

In the pt_BR locale, the thousand separator is "". So it should return

The thousands separator in pt_BR is ".".

Oh, good catch. There is so much hack in my glibc. :-)

--
Euler Taveira de Oliveira
http://www.timbira.com/

#7Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#5)
Re: wrong behavior using to_char()

Peter Eisentraut wrote:

Bruce Momjian wrote:

One idea would be to handle C locale behavior differently from non-C
locale.

Right.

I am thinking this is something for 8.3.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com


#8Bruce Momjian
bruce@momjian.us
In reply to: Euler Taveira de Oliveira (#6)
Re: [BUGS] wrong behavior using to_char()

Euler Taveira de Oliveira wrote:

Jorge Godoy wrote:

In the pt_BR locale, the thousand separator is "". So it should return

The thousands separator in pt_BR is ".".

Oh, good catch. There is so much hack in my glibc. :-)

I researched this thread:

http://archives.postgresql.org/pgsql-bugs/2006-09/msg00074.php

Ultimately, the result was that glibc was wrong in its locale settings,
and there was a suggestion to use defaults only when using the C locale.
However, I am worried there are too many locales in the field that only
define some of the locale settings, so doing defaults only for the C
locale might not work.

The minimal patch I wrote (attached) suppresses the default for the
thousands separator only if it is the same as the decimal separator. I
think this is the only area where the default could potentially match
the locale setting for another field.
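
The rule described above can be sketched roughly as follows. This is an illustration of the description only, not the attached patch; the function name thousands_sep_unless_clash and the default "," are assumptions.

```c
#include <assert.h>
#include <string.h>

/*
 * Use the locale's thousands separator when it is non-empty.
 * Otherwise use the default ",", unless that would equal the
 * decimal separator, in which case use no separator at all.
 */
const char *
thousands_sep_unless_clash(const char *decimal_point, const char *locale_sep)
{
    const char *default_sep = ",";

    assert(decimal_point != NULL);
    if (locale_sep && *locale_sep)
        return locale_sep;
    if (strcmp(default_sep, decimal_point) == 0)
        return "";
    return default_sep;
}
```

With a decimal separator of "," and an empty locale separator this returns "", so grouping is dropped instead of producing two commas; with a decimal separator of "." the default "," is still used.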

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com


Attachments:

/pgpatches/to_char (text/x-diff), +20 -11
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#8)
Re: [BUGS] wrong behavior using to_char()

Bruce Momjian <bruce@momjian.us> writes:

Ultimately, the result was that glibc was wrong in its locale settings,
and there was a suggestion to use defaults only when using the C locale.
However, I am worried there are too many locales in the field that only
define some of the locale settings, so doing defaults only for the C
locale might not work.

The minimal patch I wrote (attached) suppresses the default for the
thousands separator only if it is the same as the decimal separator. I
think this is the only area where the default could potentially match
the locale setting for another field.

Should we really go introducing strange misbehaviors into our code to
work around an admitted glibc bug?

regards, tom lane

#10Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#9)
Re: [BUGS] wrong behavior using to_char()

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Ultimately, the result was that glibc was wrong in its locale settings,
and there was a suggestion to use defaults only when using the C locale.
However, I am worried there are too many locales in the field that only
define some of the locale settings, so doing defaults only for the C
locale might not work.

The minimal patch I wrote (attached) suppresses the default for the
thousands separator only if it is the same as the decimal separator. I
think this is the only area where the default could potentially match
the locale setting for another field.

Should we really go introducing strange misbehaviors into our code to
work around an admitted glibc bug?

Seems there is no interest in handling this specific case, so I withdraw
the patch, but I have added a comment with a date in case we need to
revisit it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
