Failing example for to_number()

Started by Erwin Brandstetterover 8 years ago6 messagesdocs
Jump to latest
#1Erwin Brandstetter
brsaweda@gmail.com

The manual suggests here:

https://www.postgresql.org/docs/current/static/functions-formatting.html#FUNCTIONS-FORMATTING-TABLE

to_number(text, text) | numeric convert string to numeric | to_number

('12,454.8-', '99G999D9S')

The example fails for locales where the comma (',') does not happen to be
the group separator and the dot ('.') is not the decimal point.

The example is incorrectly assuming en_US locale. It must instead work
locale-agnostic. Replace:

to_number ('12,454.8-', '99G999D9S')

with:

to_number ('12,454.8-', '99,999.9S')

Regards
Erwin Brandstetter

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Erwin Brandstetter (#1)
Re: Failing example for to_number()

On Mon, Aug 21, 2017 at 5:36 AM, Erwin Brandstetter <brsaweda@gmail.com>
wrote:

The manual suggests here:

https://www.postgresql.org/docs/current/static/functions-
formatting.html#FUNCTIONS-FORMATTING-TABLE

to_number(text, text) | numeric convert string to numeric | to_number

('12,454.8-', '99G999D9S')

The example fails for locales where the comma (',') does not happen to be
the group separator and the dot ('.') is not the decimal point.

The example is incorrectly assuming en_US locale. It must instead work
locale-agnostic. Replace:

to_number ('12,454.8-', '99G999D9S')

with:

to_number ('12,454.8-', '99,999.9S')

​I can see the appeal of demonstrating the locale-dependent G and D symbols
so that a user becoming familiar with the feature knows that they have
options other than just writing out the comma and period literally. If one
wants to try the example in a language other than in which the example was
written they should modify it so that the literal number being parsed in
written in conformance to the locale definition for the language you are
using.

David J.

#3Erwin Brandstetter
brsaweda@gmail.com
In reply to: David G. Johnston (#2)
Re: Failing example for to_number()

On 21 August 2017 at 16:30, David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Mon, Aug 21, 2017 at 5:36 AM, Erwin Brandstetter <brsaweda@gmail.com>
wrote:

The manual suggests here:

https://www.postgresql.org/docs/current/static/functions-for
matting.html#FUNCTIONS-FORMATTING-TABLE

to_number(text, text) | numeric convert string to numeric | to_number

('12,454.8-', '99G999D9S')

The example fails for locales where the comma (',') does not happen to be
the group separator and the dot ('.') is not the decimal point.

The example is incorrectly assuming en_US locale. It must instead work
locale-agnostic. Replace:

to_number ('12,454.8-', '99G999D9S')

with:

to_number ('12,454.8-', '99,999.9S')

​I can see the appeal of demonstrating the locale-dependent G and D
symbols so that a user becoming familiar with the feature knows that they
have options other than just writing out the comma and period literally.
If one wants to try the example in a language other than in which the
example was written they should modify it so that the literal number being
parsed in written in conformance to the locale definition for the language
you are using.

David J.

Well, there is still 'S' in the pattern to pointing to non-literal pattern
symbols (without making the example fail). I see your point, though.

On top of this (separate issue?) ',' and '.' are ***not*** interpreted
according to the current LC_NUMERIC setting. I.e.: even with (example)
Austrian locale, ',' still is interpreted as group separator and '.' still
is the decimal point.
It seems those just follow standard meaning without locale (C locale) -
while 'D' and 'G' pattern symbols enforce locale specific format. It took
me a while to untangle this and I suspect many people will fail to
understand it.

I guess there should be some more explanation.

I stumbled across this while working on this answer on dba.stackexchange.com
:
https://dba.stackexchange.com/questions/183964/is-it-possible-to-insert-un-formatted-data-for-money-data-type-in-postgresql/183975#183975

Regards
Erwin

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Erwin Brandstetter (#3)
Re: Failing example for to_number()

On Mon, Aug 21, 2017 at 7:55 AM, Erwin Brandstetter <brsaweda@gmail.com>
wrote:

On top of this (separate issue?) ',' and '.' are ***not*** interpreted
according to the current LC_NUMERIC setting. I.e.: even with (example)
Austrian locale, ',' still is interpreted as group separator and '.' still
is the decimal point.
It seems those just follow standard meaning without locale (C locale) -
while 'D' and 'G' pattern symbols enforce locale specific format. It took
me a while to untangle this and I suspect many people will fail to
understand it.

​Table 9-26 on that page says "(uses locale)" for D and G, the user is left
to deduce that since comma and period do not have that qualifier that they
are not locale dependent.​

I guess there should be some more explanation.

I don't really agree but I wouldn't discourage someone from submitting a
patch.

David J.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Erwin Brandstetter (#3)
Re: Failing example for to_number()

Erwin Brandstetter <brsaweda@gmail.com> writes:

On 21 August 2017 at 16:30, David G. Johnston <david.g.johnston@gmail.com>
wrote:

On Mon, Aug 21, 2017 at 5:36 AM, Erwin Brandstetter <brsaweda@gmail.com>
wrote:

The example fails for locales where the comma (',') does not happen to be
the group separator and the dot ('.') is not the decimal point.

If one wants to try the example in a language other than in which the
example was written they should modify it so that the literal number being
parsed in written in conformance to the locale definition for the language
you are using.

I guess there should be some more explanation.

Yeah. How about adding something like this to the "Usage notes for
numeric formatting" between tables 9-26 and 9-27:

* The pattern characters S, L, D, and G represent the sign,
currency symbol, decimal point, and thousands separator characters
defined by the current locale (see lc_monetary and lc_numeric).
The pattern characters period and comma represent those exact
characters, with the meanings of decimal point and thousands
separator, regardless of locale.

regards, tom lane

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#6Erwin Brandstetter
brsaweda@gmail.com
In reply to: Tom Lane (#5)
Re: Failing example for to_number()

On 28 August 2017 at 22:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Erwin Brandstetter <brsaweda@gmail.com> writes:

On 21 August 2017 at 16:30, David G. Johnston <

david.g.johnston@gmail.com>

wrote:

On Mon, Aug 21, 2017 at 5:36 AM, Erwin Brandstetter <brsaweda@gmail.com

wrote:

The example fails for locales where the comma (',') does not happen to

be

the group separator and the dot ('.') is not the decimal point.

If one wants to try the example in a language other than in which the
example was written they should modify it so that the literal number

being

parsed in written in conformance to the locale definition for the

language

you are using.

I guess there should be some more explanation.

Yeah. How about adding something like this to the "Usage notes for
numeric formatting" between tables 9-26 and 9-27:

* The pattern characters S, L, D, and G represent the sign,
currency symbol, decimal point, and thousands separator characters
defined by the current locale (see lc_monetary and lc_numeric).
The pattern characters period and comma represent those exact
characters, with the meanings of decimal point and thousands
separator, regardless of locale.

That should help to clear up some confusion, especially for to_number(). I
suggest some minor modifications:

* The pattern characters S, L, D, and G represent the sign,
currency symbol, decimal point, and group separator characters
defined by the current locale (see lc_monetary and lc_numeric).
The pattern characters period (.) and comma (,) represent decimal
point and group
separator, regardless of locale.

Regards
Erwin Brandstetter