Number format problem

Started by Stéphane SCHILDKNECHTabout 20 years ago10 messagesgeneral
Jump to latest

Hi,

There seems to be some tricky behaviour with number formating and french
locale.

I tried the following request:
select to_char(1485.12, '9G999D99');

I was expecting to get: 1 485,12

But, surprinsingly, I got 1,1485,12.

My postgresql server is an 8.1.2 version. The same problem occurs under
Ubuntu Breezy and Debian Testing.
My current configuration is
LC_NUMERIC=fr_FR@euro
client_encoding=LATIN9
server_encoding=LATIN9

I tried to reconfigure locales and restart the server, but I can't get
the result I expect.

I really don't know what else I could do.

Sincerely,

--
St�phane SCHILDKNECHT
Pr�sident de PostgreSQLFr
http://www.postgresqlfr.org

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Stéphane SCHILDKNECHT (#1)
Re: Number format problem

St�phane SCHILDKNECHT wrote:

select to_char(1485.12, '9G999D99');

But, surprinsingly, I got 1,1485,12.

The fr_FR locale is broken. You should report this to glibc.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#3Daniel Verite
daniel@manitou-mail.org
In reply to: Peter Eisentraut (#2)
Re: Number format problem

Peter Eisentraut wrote:

Stéphane SCHILDKNECHT wrote:

select to_char(1485.12, '9G999D99');

But, surprinsingly, I got 1,1485,12.

The fr_FR locale is broken. You should report this to glibc.

On my debian sarge with LC_NUMERIC set to fr_FR@euro, a
printf("%'g\n", 1485.12);
produces 1485,12 with which seems to be correct given that the
'thousands_sep' locale entry is set to "" (empty string) and
'decimal_point' to U002C

On the other hand, what postgres produces is:
test=> set lc_numeric='fr_FR@euro';
SET
test=> select to_char(1485.12, '9G999D99');
to_char
-----------
1,485,12
(1 row)

which is wrong with regard to thousands_sep="".

In fact, grep'ing the source code reveals that, when 'thousands_sep' is set to
an empty string, it gets ignored and a comma is used instead.
I'm referring to backend/utils/adt/formatting.c, NUM_prepare_locale() in 8.1.2:

/*
* Number thousands separator
*/
if (lconv->thousands_sep && *lconv->thousands_sep)
Np->L_thousands_sep = lconv->thousands_sep;
else
Np->L_thousands_sep = ",";

What's wrong with lconv->thousands_sep being set to an empty string?
Shouldn't it be used nonetheless?

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org

#4Bruce Momjian
bruce@momjian.us
In reply to: Daniel Verite (#3)
Re: [GENERAL] Number format problem

Interesting. We should be able to handle a "" thousands separator. In
fact, psql does, but to_char() does not.

Looking at to_char(), it seems that we should be able to handle "" for
thousands and plus sign.

I have applied the following patch for 8.2. I have not applied to 8.1.X
because it is a behavior change.

Thanks for the report.

---------------------------------------------------------------------------

Daniel Verite wrote:

Peter Eisentraut wrote:

St?phane SCHILDKNECHT wrote:

select to_char(1485.12, '9G999D99');

But, surprinsingly, I got 1,1485,12.

The fr_FR locale is broken. You should report this to glibc.

On my debian sarge with LC_NUMERIC set to fr_FR@euro, a
printf("%'g\n", 1485.12);
produces 1485,12 with which seems to be correct given that the
'thousands_sep' locale entry is set to "" (empty string) and
'decimal_point' to U002C

On the other hand, what postgres produces is:
test=> set lc_numeric='fr_FR@euro';
SET
test=> select to_char(1485.12, '9G999D99');
to_char
-----------
1,485,12
(1 row)

which is wrong with regard to thousands_sep="".

In fact, grep'ing the source code reveals that, when 'thousands_sep' is set to
an empty string, it gets ignored and a comma is used instead.
I'm referring to backend/utils/adt/formatting.c, NUM_prepare_locale() in 8.1.2:

/*
* Number thousands separator
*/
if (lconv->thousands_sep && *lconv->thousands_sep)
Np->L_thousands_sep = lconv->thousands_sep;
else
Np->L_thousands_sep = ",";

What's wrong with lconv->thousands_sep being set to an empty string?
Shouldn't it be used nonetheless?

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/bjm/difftext/plainDownload+7-7
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#4)
Re: [GENERAL] Number format problem

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I have applied the following patch for 8.2. I have not applied to 8.1.X
because it is a behavior change.

This patch broke the regression tests. It seems to change the C-locale
behavior on many platforms.

regards, tom lane

#6Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#5)
Re: [GENERAL] Number format problem

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I have applied the following patch for 8.2. I have not applied to 8.1.X
because it is a behavior change.

This patch broke the regression tests. It seems to change the C-locale
behavior on many platforms.

Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
disinguish "" as "don't have" from "" as "don't want". Will revert.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#6)
Re: [GENERAL] Number format problem

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

This patch broke the regression tests. It seems to change the C-locale
behavior on many platforms.

Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
disinguish "" as "don't have" from "" as "don't want". Will revert.

Possibly we could special-case C locale, and allow an empty string only
when not C locale. However, it could be that there are other locales
where the change would surprise people, too :-(

regards, tom lane

#8Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#7)
Re: [GENERAL] Number format problem

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

This patch broke the regression tests. It seems to change the C-locale
behavior on many platforms.

Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
disinguish "" as "don't have" from "" as "don't want". Will revert.

Possibly we could special-case C locale, and allow an empty string only
when not C locale. However, it could be that there are other locales
where the change would surprise people, too :-(

Not sure. Peter stated the French locale was wrong for the Debian they
were using, and he should know, so I am going to wait for more feeback.

One thing that has me confused is that psql/print.c doesn't check for a
zero-length string for thousands_sep in setDecimalLocale, so I am not
thinking that is wrong and should skip "" like we do in formatting.c.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#9Daniel Verite
daniel@manitou-mail.org
In reply to: Bruce Momjian (#8)
Re: [GENERAL] Number format problem

Bruce Momjian wrote:

Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
disinguish "" as "don't have" from "" as "don't want". Will revert.

Possibly we could special-case C locale, and allow an empty string only
when not C locale. However, it could be that there are other locales
where the change would surprise people, too :-(

Not sure. Peter stated the French locale was wrong for the Debian they
were using, and he should know, so I am going to wait for more feeback.

The baseline glibc sources from gnu.org have
(in glibc-2.3.6/localedata/locales/fr_FR):

LC_NUMERIC
decimal_point "<U002C>"
thousands_sep ""
grouping 0;0
END LC_NUMERIC

So at first sight it's more a glibc issue rather than debian-specific.

However, this was filed as a bug in debian long ago. Incidentally, it got fixed
just this week, meaning that the future debian-glibc should use a space instead
of the empty string, as reported here:
http://lists.debian.org/debian-glibc/2006/02/msg00168.html
Which is consistent with Peter's statement that it was wrong up to now, and also
with what the OP expected in the first place.

--
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org

#10Bruce Momjian
bruce@momjian.us
In reply to: Daniel Verite (#9)
Re: [GENERAL] Number format problem

Daniel Verite wrote:

Bruce Momjian wrote:

Oh, so C locale has "" for thousands_sep? Hmm. So there is no way to
disinguish "" as "don't have" from "" as "don't want". Will revert.

Possibly we could special-case C locale, and allow an empty string only
when not C locale. However, it could be that there are other locales
where the change would surprise people, too :-(

Not sure. Peter stated the French locale was wrong for the Debian they
were using, and he should know, so I am going to wait for more feeback.

The baseline glibc sources from gnu.org have
(in glibc-2.3.6/localedata/locales/fr_FR):

LC_NUMERIC
decimal_point "<U002C>"
thousands_sep ""
grouping 0;0
END LC_NUMERIC

So at first sight it's more a glibc issue rather than debian-specific.

However, this was filed as a bug in debian long ago. Incidentally, it got fixed
just this week, meaning that the future debian-glibc should use a space instead
of the empty string, as reported here:
http://lists.debian.org/debian-glibc/2006/02/msg00168.html
Which is consistent with Peter's statement that it was wrong up to now, and also
with what the OP expected in the first place.

Oh, good. I looked and now realize that psql is already checking for a
non-"" string:

if (*extlconv->thousands_sep)

so we are consistent.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073