[PATCH] Don't truncate integer part in to_char for 'FM99.'

Started by Marti Raudseppover 14 years ago5 messages
#1Marti Raudsepp
marti@juffo.org
1 attachment(s)

Hi,

This patch fixes an edge case bug in the numeric to_char() function.

When the numeric to_char format used fillmode (FM), didn't contain 0s
and had a trailing dot, the integer part of the number was truncated in
error.

to_char(10, 'FM99.') used to return '1', after this patch it will return '10'

This is known to affect the format() function in the mysqlcompat
pgFoundry project.

Regards,
Marti

Attachments:

0001-Don-t-truncate-integer-part-in-to_char-for-FM99.patchtext/x-patch; charset=US-ASCII; name=0001-Don-t-truncate-integer-part-in-to_char-for-FM99.patchDownload
From 848076a119c39782ef854ac940f14f5975430b4e Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <marti@juffo.org>
Date: Wed, 7 Sep 2011 20:12:58 +0300
Subject: [PATCH] Don't truncate integer part in to_char for 'FM99.'

When the numeric to_char format used fillmode (FM), didn't contain 0s
and had a trailing dot, the integer part of the number was truncated in
error.

to_char(10, 'FM99.') used to return '1', now it will return '10'
---
 src/backend/utils/adt/formatting.c    |    6 +++++-
 src/test/regress/expected/numeric.out |    6 ++++++
 src/test/regress/sql/numeric.sql      |    2 ++
 3 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 7efd988..37a819e 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -4460,7 +4460,11 @@ NUM_processor(FormatNode *node, NUMDesc *Num, char *inout, char *number,
 
 		if (IS_FILLMODE(Np->Num))
 		{
-			if (IS_DECIMAL(Np->Num))
+			/*
+			 * Only call get_last_relevant_decnum if the number has fractional
+			 * digits, otherwise the result is bogus.
+			 */
+			if (IS_DECIMAL(Np->Num) && Np->Num->post > 0)
 				Np->last_relevant = get_last_relevant_decnum(
 															 Np->number +
 									 ((Np->Num->zero_end - Np->num_pre > 0) ?
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index d9927b7..e12ab5b 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -1154,6 +1154,12 @@ SELECT '' AS to_char_23, to_char(val, '9.999EEEE')				FROM num_data;
             | -2.493e+07
 (10 rows)
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+ to_char_24 | to_char 
+------------+---------
+            | 100
+(1 row)
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index a1435ec..d552526 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -764,6 +764,8 @@ SELECT '' AS to_char_21, to_char(val, '999999SG9999999999')			FROM num_data;
 SELECT '' AS to_char_22, to_char(val, 'FM9999999999999999.999999999999999')	FROM num_data;
 SELECT '' AS to_char_23, to_char(val, '9.999EEEE')				FROM num_data;
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
-- 
1.7.6.1

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marti Raudsepp (#1)
Re: [PATCH] Don't truncate integer part in to_char for 'FM99.'

Marti Raudsepp <marti@juffo.org> writes:

This patch fixes an edge case bug in the numeric to_char() function.

When the numeric to_char format used fillmode (FM), didn't contain 0s
and had a trailing dot, the integer part of the number was truncated in
error.

to_char(10, 'FM99.') used to return '1', after this patch it will return '10'

Hmm. I agree that this is a bug, but the proposed fix seems like a bit
of a kluge. Wouldn't it be better to make get_last_relevant_decnum
honor its contract, that is not delete any relevant digits? I'm
thinking instead of this

if (!p)
p = num;

when there is no decimal point it should do something like

if (!p)
return num + strlen(num) - 1;

regards, tom lane

#3Marti Raudsepp
marti@juffo.org
In reply to: Tom Lane (#2)
1 attachment(s)
Re: [PATCH] Don't truncate integer part in to_char for 'FM99.'

On Wed, Sep 7, 2011 at 21:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm.  I agree that this is a bug, but the proposed fix seems like a bit
of a kluge. Wouldn't it be better to make get_last_relevant_decnum
honor its contract, that is not delete any relevant digits?

You're right, it was a kludge.

Here's an improved version. I need to take a NUMProc* argument to do
that right, because that's how it knows how many '0's to keep from the
format.

What do you think?

Regards,
Marti

Attachments:

0001-Don-t-truncate-integer-part-in-to_char-for-FM99.patchtext/x-patch; charset=US-ASCII; name=0001-Don-t-truncate-integer-part-in-to_char-for-FM99.patchDownload
From d0264d8fe8179716bfd58e12201e456387ca5469 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <marti@juffo.org>
Date: Wed, 7 Sep 2011 20:12:58 +0300
Subject: [PATCH] Don't truncate integer part in to_char for 'FM99.'

When the numeric to_char format used fillmode (FM), didn't contain 0s
and had a trailing dot, the integer part of the number was truncated in
error.

to_char(10, 'FM99.') used to return '1', now it will return '10'
---
 src/backend/utils/adt/formatting.c    |   29 ++++++++++++++++-------------
 src/test/regress/expected/numeric.out |    6 ++++++
 src/test/regress/sql/numeric.sql      |    2 ++
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 7efd988..eada4a8 100644
--- a/src/backend/utils/adt/formatting.c
+++ b/src/backend/utils/adt/formatting.c
@@ -979,7 +979,7 @@ static char *fill_str(char *str, int c, int max);
 static FormatNode *NUM_cache(int len, NUMDesc *Num, text *pars_str, bool *shouldFree);
 static char *int_to_roman(int number);
 static void NUM_prepare_locale(NUMProc *Np);
-static char *get_last_relevant_decnum(char *num);
+static char *get_last_relevant_decnum(NUMProc *Np);
 static void NUM_numpart_from_char(NUMProc *Np, int id, int plen);
 static void NUM_numpart_to_char(NUMProc *Np, int id);
 static char *NUM_processor(FormatNode *node, NUMDesc *Num, char *inout, char *number,
@@ -3911,17 +3911,26 @@ NUM_prepare_locale(NUMProc *Np)
  * ----------
  */
 static char *
-get_last_relevant_decnum(char *num)
+get_last_relevant_decnum(NUMProc *Np)
 {
 	char	   *result,
-			   *p = strchr(num, '.');
+			   *p;
 
 #ifdef DEBUG_TO_FROM_CHAR
 	elog(DEBUG_elog_output, "get_last_relevant_decnum()");
 #endif
 
-	if (!p)
-		p = num;
+	if (Np->Num->zero_end > Np->num_pre)
+		/* Keep digits according to '0's in the format */
+		p = Np->number + Np->Num->zero_end - Np->num_pre;
+	else
+	{
+		p = strchr(Np->number, '.');
+
+		if (!p)
+			return NULL;
+	}
+
 	result = p;
 
 	while (*(++p))
@@ -4458,14 +4467,8 @@ NUM_processor(FormatNode *node, NUMDesc *Num, char *inout, char *number,
 	{
 		Np->num_pre = plen;
 
-		if (IS_FILLMODE(Np->Num))
-		{
-			if (IS_DECIMAL(Np->Num))
-				Np->last_relevant = get_last_relevant_decnum(
-															 Np->number +
-									 ((Np->Num->zero_end - Np->num_pre > 0) ?
-									  Np->Num->zero_end - Np->num_pre : 0));
-		}
+		if (IS_FILLMODE(Np->Num) && IS_DECIMAL(Np->Num))
+			Np->last_relevant = get_last_relevant_decnum(Np);
 
 		if (Np->sign_wrote == FALSE && Np->num_pre == 0)
 			++Np->num_count;
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index d9927b7..e12ab5b 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -1154,6 +1154,12 @@ SELECT '' AS to_char_23, to_char(val, '9.999EEEE')				FROM num_data;
             | -2.493e+07
 (10 rows)
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+ to_char_24 | to_char 
+------------+---------
+            | 100
+(1 row)
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index a1435ec..d552526 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -764,6 +764,8 @@ SELECT '' AS to_char_21, to_char(val, '999999SG9999999999')			FROM num_data;
 SELECT '' AS to_char_22, to_char(val, 'FM9999999999999999.999999999999999')	FROM num_data;
 SELECT '' AS to_char_23, to_char(val, '9.999EEEE')				FROM num_data;
 
+SELECT '' AS to_char_24, to_char('100'::numeric, 'FM999.');
+
 -- TO_NUMBER()
 --
 SELECT '' AS to_number_1,  to_number('-34,338,492', '99G999G999');
-- 
1.7.6.1

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marti Raudsepp (#3)
Re: [PATCH] Don't truncate integer part in to_char for 'FM99.'

Marti Raudsepp <marti@juffo.org> writes:

On Wed, Sep 7, 2011 at 21:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm. I agree that this is a bug, but the proposed fix seems like a bit
of a kluge. Wouldn't it be better to make get_last_relevant_decnum
honor its contract, that is not delete any relevant digits?

You're right, it was a kludge.

Here's an improved version. I need to take a NUMProc* argument to do
that right, because that's how it knows how many '0's to keep from the
format.

Yeah, after fooling with it myself I saw that it was the interconnection
of the don't-suppress-'0'-format-positions logic with the find-the-last-
nonzero-digit logic that was confusing matters. (formatting.c may not
be the most spaghetti-ish code I've ever worked with, but it's up
there.) However, I don't think that inserting knowledge of the other
consideration into get_last_relevant_decnum is really the way to make
things cleaner. Also, the way yours is set up, I'm dubious that it
does the right thing when the last '0' specifier is to the left of the
decimal point. I'm currently testing this patch:

diff --git a/src/backend/utils/adt/formatting.c b/src/backend/utils/adt/formatting.c
index 7efd988362889346af86c642f6ee660a4ae1b3d2..a7000250b0363165bee5697ad72036aad28b830e 100644
*** a/src/backend/utils/adt/formatting.c
--- b/src/backend/utils/adt/formatting.c
*************** NUM_prepare_locale(NUMProc *Np)
*** 3908,3913 ****
--- 3908,3916 ----
  /* ----------
   * Return pointer of last relevant number after decimal point
   *	12.0500 --> last relevant is '5'
+  *	12.0000 --> last relevant is '.'
+  * If there is no decimal point, return NULL (which will result in same
+  * behavior as if FM hadn't been specified).
   * ----------
   */
  static char *
*************** get_last_relevant_decnum(char *num)
*** 3921,3927 ****
  #endif

if (!p)
! p = num;
result = p;

  	while (*(++p))
--- 3924,3931 ----
  #endif

if (!p)
! return NULL;
!
result = p;

while (*(++p))
*************** NUM_processor(FormatNode *node, NUMDesc
*** 4458,4470 ****
{
Np->num_pre = plen;

! if (IS_FILLMODE(Np->Num))
{
! if (IS_DECIMAL(Np->Num))
! Np->last_relevant = get_last_relevant_decnum(
! Np->number +
! ((Np->Num->zero_end - Np->num_pre > 0) ?
! Np->Num->zero_end - Np->num_pre : 0));
}

  		if (Np->sign_wrote == FALSE && Np->num_pre == 0)
--- 4462,4483 ----
  	{
  		Np->num_pre = plen;

! if (IS_FILLMODE(Np->Num) && IS_DECIMAL(Np->Num))
{
! Np->last_relevant = get_last_relevant_decnum(Np->number);
!
! /*
! * If any '0' specifiers are present, make sure we don't strip
! * those digits.
! */
! if (Np->last_relevant && Np->Num->zero_end > Np->num_pre)
! {
! char *last_zero;
!
! last_zero = Np->number + (Np->Num->zero_end - Np->num_pre);
! if (Np->last_relevant < last_zero)
! Np->last_relevant = last_zero;
! }
}

if (Np->sign_wrote == FALSE && Np->num_pre == 0)

regards, tom lane

#5Marti Raudsepp
marti@juffo.org
In reply to: Tom Lane (#4)
Re: [PATCH] Don't truncate integer part in to_char for 'FM99.'

On Wed, Sep 7, 2011 at 23:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

 Also, the way yours is set up, I'm dubious that it
does the right thing when the last '0' specifier is to the left of the
decimal point.

When the last '0' is left of the decimal point, Num->zero_end is set
to 0, so the branch dealing with that is never executed.

 I'm currently testing this patch:

Looks good to me. It's potentially less efficient, but I'm not worried
about that.

Regards,
Marti