Ancient bug in formatting.c/to_char()

Started by Peter Geogheganover 11 years ago4 messages
#1Peter Geoghegan
pg@heroku.com
1 attachment(s)

Consider this example:

[local]/postgres=# SELECT to_char(1e9::float8,'999999999999999999999D9');
to_char
--------------------------
1000000000.0
(1 row)

[local]/postgres=# SELECT to_char(1e20::float8,'999999999999999999999D9');
to_char
------------------------
100000000000000000000
(1 row)

[local]/postgres=# SELECT to_char(1e20,'999999999999999999999D9');
to_char
--------------------------
100000000000000000000.0
(1 row)

There is access to uninitialized memory here. #define
DEBUG_TO_FROM_CHAR shows this to be the case (garbage is printed to
stdout).

Valgrind brought this to my attention. I propose the attached patch as
a fix for this issue.

The direct cause appears to be that float8_to_char() feels entitled to
truncate Number description post-digits, while that doesn't happen
within numeric_to_char(), say. This is very old code, from the
original to_char() patch back in 2000, and the interactions here are
not obvious. I think that that should be okay to not do this
truncation, since the value of MAXDOUBLEWIDTH was greatly increased in
2007 as part of a round of fixes to bugs of similar vintage. There is
still a defensive measure against over-sized input (we bail), but that
ought to be unnecessary, strictly speaking.

--
Peter Geoghegan

Attachments:

float8_to_char_fix.patchtext/x-patch; charset=US-ASCII; name=float8_to_char_fix.patchDownload
*** a/src/backend/utils/adt/formatting.c
--- b/src/backend/utils/adt/formatting.c
***************
*** 59,64 ****
--- 59,65 ----
   * -----------------------------------------------------------------------
   */
  
+ /* XXX: DEBUG_TO_FROM_CHAR may access uninitialized memory */
  #ifdef DEBUG_TO_FROM_CHAR
  #define DEBUG_elog_output	DEBUG3
  #endif
*************** float8_to_char(PG_FUNCTION_ARGS)
*** 5427,5440 ****
  			Num.pre += Num.multi;
  		}
  		orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1);
! 		len = snprintf(orgnum, MAXDOUBLEWIDTH + 1, "%.0f", fabs(val));
! 		if (Num.pre > len)
! 			plen = Num.pre - len;
! 		if (len >= DBL_DIG)
! 			Num.post = 0;
! 		else if (Num.post + len > DBL_DIG)
! 			Num.post = DBL_DIG - len;
! 		snprintf(orgnum, MAXDOUBLEWIDTH + 1, "%.*f", Num.post, val);
  
  		if (*orgnum == '-')
  		{						/* < 0 */
--- 5428,5439 ----
  			Num.pre += Num.multi;
  		}
  		orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1);
! 
! 		len = snprintf(orgnum, MAXDOUBLEWIDTH + 1, "%.*f", Num.post, val);
! 
! 		/* Defensive */
! 		if (len >= MAXDOUBLEWIDTH + 1)
! 			elog(ERROR, "double precision for character conversion is too wide");
  
  		if (*orgnum == '-')
  		{						/* < 0 */
*** a/src/test/regress/expected/window.out
--- b/src/test/regress/expected/window.out
*************** SELECT to_char(SUM(n::float8) OVER (ORDE
*** 1765,1771 ****
    FROM (VALUES(1,1e20),(2,1)) n(i,n);
           to_char          
  --------------------------
!   100000000000000000000
                        1.0
  (2 rows)
  
--- 1765,1771 ----
    FROM (VALUES(1,1e20),(2,1)) n(i,n);
           to_char          
  --------------------------
!   100000000000000000000.0
                        1.0
  (2 rows)
  
#2Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Peter Geoghegan (#1)
Re: Ancient bug in formatting.c/to_char()

On 05/29/2014 04:59 AM, Peter Geoghegan wrote:

Consider this example:

[local]/postgres=# SELECT to_char(1e9::float8,'999999999999999999999D9');
to_char
--------------------------
1000000000.0
(1 row)

[local]/postgres=# SELECT to_char(1e20::float8,'999999999999999999999D9');
to_char
------------------------
100000000000000000000
(1 row)

[local]/postgres=# SELECT to_char(1e20,'999999999999999999999D9');
to_char
--------------------------
100000000000000000000.0
(1 row)

There is access to uninitialized memory here. #define
DEBUG_TO_FROM_CHAR shows this to be the case (garbage is printed to
stdout).

Valgrind brought this to my attention. I propose the attached patch as
a fix for this issue.

The direct cause appears to be that float8_to_char() feels entitled to
truncate Number description post-digits, while that doesn't happen
within numeric_to_char(), say. This is very old code, from the
original to_char() patch back in 2000, and the interactions here are
not obvious. I think that that should be okay to not do this
truncation, since the value of MAXDOUBLEWIDTH was greatly increased in
2007 as part of a round of fixes to bugs of similar vintage. There is
still a defensive measure against over-sized input (we bail), but that
ought to be unnecessary, strictly speaking.

postgres=# select to_char('0.1'::float8, '9D' || repeat('9', 1000));
ERROR: double precision for character conversion is too wide

Yeah, the code is pretty crufty, I'm not sure what the proper fix would
be. I wonder if psprintf would be useful.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Peter Geoghegan
pg@heroku.com
In reply to: Heikki Linnakangas (#2)
Re: Ancient bug in formatting.c/to_char()

On Thu, May 29, 2014 at 3:09 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

Yeah, the code is pretty crufty, I'm not sure what the proper fix would be.
I wonder if psprintf would be useful.

I don't know that it's worth worrying about the case you highlight. It
is certainly worth worrying about the lack of a similar fix in
float4_to_char(), though.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Peter Geoghegan
pg@heroku.com
In reply to: Peter Geoghegan (#3)
Re: Ancient bug in formatting.c/to_char()

Where are we on this?

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers