WIP: Make timestamptz_out less slow.

Started by Andres Freundover 10 years ago34 messages

andres@anarazel.de

over 10 years ago

1 attachment(s)

Hi,

I recently once more noticed that timestamptz_out is really, really
slow. To quantify that, I created a bunch of similar sized tables:

CREATE TABLE tbl_timestamp AS SELECT NOW() FROM generate_series(1, 100000) a, generate_series(1, 100) b;
CREATE TABLE tbl_int8 AS SELECT 1::bigint FROM generate_series(1, 100000) a, generate_series(1, 100) b;
CREATE TABLE tbl_bytea AS SELECT ' '::bytea FROM generate_series(1, 100000) a, generate_series(1, 100) b;

These all end up being 346MB large.

COPY tbl_bytea TO '/dev/null';
Time: 1173.484 ms
COPY tbl_int8 TO '/dev/null';
Time: 1030.756 ms
COPY tbl_timestamp TO '/dev/null';
Time: 6598.030

(all best of three)

Yes, timestamp outputs more data as a whole, but being 5 times as slow
is still pretty darn insane. To make sure that's not the cause, here's
another table:

CREATE TABLE tbl_timestamptext AS SELECT NOW()::text FROM generate_series(1, 100000) a, generate_series(1, 100) b;
COPY tbl_timestamptext TO '/dev/null';
Time: 1449.554 ms

So it's really just the timestamp code.

Profiling it shows:
  Overhead  Command         Shared Object     Symbol
  -   38.33%  postgres_stock  libc-2.19.so      [.] vfprintf
     - 97.92% vfprintf
          _IO_vsprintf
        - sprintf
           + 70.25% EncodeDateTime
           + 29.75% AppendSeconds.constprop.10
     + 1.11% _IO_default_xsputn
  -    8.22%  postgres_stock  libc-2.19.so      [.] _IO_default_xsputn
     - 99.43% _IO_default_xsputn
        - 90.09% vfprintf
             _IO_vsprintf
           - sprintf
              + 74.15% EncodeDateTime
              + 25.85% AppendSeconds.constprop.10
        + 9.72% _IO_padn
     + 0.57% vfprintf
  +   7.76%  postgres_stock  postgres_stock    [.] CopyOneRowTo

So nearly all the time is spent somewhere inside the sprintf calls. Not
nice.

The only thing I could come up to make the sprintfs cheaper was to
combine them into one and remove some of the width specifiers that
aren't needed. That doesn't buy us very much.

I then proceeded to replace the sprintf call with hand-rolled
conversions. And wow, the benefit is far bigger than I'd assumed:
postgres[7236][1]=# COPY tbl_timestamp TO '/dev/null';
Time: 2430.521 ms

So, by hand-rolling the ISO conversion in EncodeDateTime() we got a
~250% performance improvement. I'd say that's worthwhile.

The attached patch shows what I did. While there's some polishing
possible, as a whole, it's pretty ugly. But I think timestamp data is so
common that it's worth the effort.

Does anybody have a fundamentally nicer idea than the attached to
improvide this?

Greetings,

Andres Freund

Attachments:

0001-WIP-faster-timestamp-tz-_out.patchtext/x-patch; charset=us-asciiDownload

>From 1d7b6110f8864ee00c1fe4f54d8937347ade7d80 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Mon, 27 Jul 2015 23:09:33 +0200
Subject: [PATCH] WIP: faster timestamp[tz]_out

---
 src/backend/utils/adt/datetime.c        | 108 ++++++++++++++++++++++++++++++++
 src/test/regress/expected/horology.out  |  24 ++++---
 src/test/regress/expected/timestamp.out |  62 +++++++++++-------
 src/test/regress/sql/timestamp.sql      |   1 +
 4 files changed, 164 insertions(+), 31 deletions(-)

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 2a44b6e..4c13f01 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -3968,7 +3968,115 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 
 	switch (style)
 	{
+#ifdef HAVE_INT64_TIMESTAMP
+		case USE_ISO_DATES:
+			/*
+			 * Fastpath for most common format and range. Not using sprintf
+			 * provides significant performance benefits, and timestamp data
+			 * is pretty common. All sane use cases dealing with large amounts
+			 * of data use iso timestamps, so we can focus on optimizing
+			 * those, and keep the rest of the code maintainable.
+			 */
+			if (tm->tm_year > 0 && tm->tm_year < 10000)
+			{
+				int year  = (tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1);
+
+				*str++ = (year / 1000) + '0';
+				*str++ = (year / 100) % 10 + '0';
+				*str++ = (year / 10) % 10 + '0';
+				*str++ = year % 10 + '0';
+				*str++ = '-';
+				*str++ = (tm->tm_mon / 10) + '0';
+				*str++ = tm->tm_mon % 10 + '0';
+				*str++ = '-';
+				*str++ = (tm->tm_mday / 10) + '0';
+				*str++ = tm->tm_mday % 10 + '0';
+				*str++ = ' ';
+				*str++ = (tm->tm_hour / 10) + '0';
+				*str++ = tm->tm_hour % 10 + '0';
+				*str++ = ':';
+				*str++ = (tm->tm_min / 10) + '0';
+				*str++ = tm->tm_min % 10 + '0';
+				*str++ = ':';
+				*str++ = (tm->tm_sec / 10) + '0';
+				*str++ = tm->tm_sec % 10 + '0';
+
+				/*
+				 * Yes, this is darned ugly and would look nicer in a loop,
+				 * but some versions of gcc can't convert the divisions into
+				 * more efficient instructions unless manually unrolled.
+				 */
+				if (fsec != 0)
+				{
+					int fseca = abs(fsec);
+
+					*str++ = '.';
+
+					if (fseca % 1000000 != 0)
+					{
+						*str++ = (fseca / 100000) + '0';
+
+						if (fseca % 100000 != 0)
+						{
+							*str++ = ((fseca / 10000) % 10) + '0';
+
+							if (fseca % 10000 != 0)
+							{
+								*str++ = ((fseca / 1000) % 10) + '0';
+
+								if (fseca % 1000 != 0)
+								{
+									*str++ = ((fseca / 100) % 10) + '0';
+
+									if (fseca % 100 != 0)
+									{
+										*str++ = ((fseca / 10) % 10) + '0';
+
+										if (fseca % 10 != 0)
+										{
+											*str++ = (fseca % 10) + '0';
+										}
+									}
+								}
+							}
+						}
+					}
+				}
+
+				if (print_tz)
+				{
+					int			hour, min, sec;
+
+					sec = abs(tz);
+					min = sec / SECS_PER_MINUTE;
+					sec -= min * SECS_PER_MINUTE;
+					hour = min / MINS_PER_HOUR;
+					min -= hour * MINS_PER_HOUR;
+
+					*str++ = (tz <= 0 ? '+' : '-');
+
+					*str++ = (hour / 10) + '0';
+					*str++ = hour % 10 + '0';
+
+					if (min != 0)
+					{
+						*str++ = (min / 10) + '0';
+						*str++ = min % 10 + '0';
+					}
+
+					if (sec != 0)
+					{
+						*str++ = (sec / 10) + '0';
+						*str++ = sec % 10 + '0';
+					}
+				}
+				*str++ = 0;
+				break;
+			}
+			/* fall through for stuff not handled in the fast path */
+#else
 		case USE_ISO_DATES:
+#endif
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
 
diff --git a/src/test/regress/expected/horology.out b/src/test/regress/expected/horology.out
index 1fe02be..67b469b 100644
--- a/src/test/regress/expected/horology.out
+++ b/src/test/regress/expected/horology.out
@@ -531,7 +531,8 @@ SELECT '' AS "64", d1 + interval '1 year' AS one_year FROM TIMESTAMP_TBL;
     | Mon Jan 01 17:32:01 2001
     | Mon Dec 31 17:32:01 2001
     | Tue Jan 01 17:32:01 2002
-(65 rows)
+    | Tue Jan 01 17:32:01 12002
+(66 rows)
 
 SELECT '' AS "64", d1 - interval '1 year' AS one_year FROM TIMESTAMP_TBL;
  64 |          one_year           
@@ -601,7 +602,8 @@ SELECT '' AS "64", d1 - interval '1 year' AS one_year FROM TIMESTAMP_TBL;
     | Fri Jan 01 17:32:01 1999
     | Fri Dec 31 17:32:01 1999
     | Sat Jan 01 17:32:01 2000
-(65 rows)
+    | Sat Jan 01 17:32:01 12000
+(66 rows)
 
 SELECT timestamp with time zone '1996-03-01' - interval '1 second' AS "Feb 29";
             Feb 29            
@@ -2290,7 +2292,8 @@ SELECT '' AS "64", d1 AS us_postgres FROM TIMESTAMP_TBL;
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(65 rows)
+    | Mon Jan 01 17:32:01 12001
+(66 rows)
 
 SELECT '' AS seven, f1 AS us_postgres FROM ABSTIME_TBL;
  seven |         us_postgres          
@@ -2373,7 +2376,8 @@ SELECT '' AS "64", d1 AS us_iso FROM TIMESTAMP_TBL;
     | 2000-01-01 17:32:01
     | 2000-12-31 17:32:01
     | 2001-01-01 17:32:01
-(65 rows)
+    | 12001-01-01 17:32:01
+(66 rows)
 
 SELECT '' AS seven, f1 AS us_iso FROM ABSTIME_TBL;
  seven |         us_iso         
@@ -2462,7 +2466,8 @@ SELECT '' AS "64", d1 AS us_sql FROM TIMESTAMP_TBL;
     | 01/01/2000 17:32:01
     | 12/31/2000 17:32:01
     | 01/01/2001 17:32:01
-(65 rows)
+    | 01/01/12001 17:32:01
+(66 rows)
 
 SELECT '' AS seven, f1 AS us_sql FROM ABSTIME_TBL;
  seven |         us_sql          
@@ -2558,8 +2563,9 @@ SELECT '' AS "65", d1 AS european_postgres FROM TIMESTAMP_TBL;
     | Sat 01 Jan 17:32:01 2000
     | Sun 31 Dec 17:32:01 2000
     | Mon 01 Jan 17:32:01 2001
+    | Mon 01 Jan 17:32:01 12001
     | Thu 13 Jun 00:00:00 1957
-(66 rows)
+(67 rows)
 
 SELECT '' AS seven, f1 AS european_postgres FROM ABSTIME_TBL;
  seven |      european_postgres       
@@ -2648,8 +2654,9 @@ SELECT '' AS "65", d1 AS european_iso FROM TIMESTAMP_TBL;
     | 2000-01-01 17:32:01
     | 2000-12-31 17:32:01
     | 2001-01-01 17:32:01
+    | 12001-01-01 17:32:01
     | 1957-06-13 00:00:00
-(66 rows)
+(67 rows)
 
 SELECT '' AS seven, f1 AS european_iso FROM ABSTIME_TBL;
  seven |      european_iso      
@@ -2738,8 +2745,9 @@ SELECT '' AS "65", d1 AS european_sql FROM TIMESTAMP_TBL;
     | 01/01/2000 17:32:01
     | 31/12/2000 17:32:01
     | 01/01/2001 17:32:01
+    | 01/01/12001 17:32:01
     | 13/06/1957 00:00:00
-(66 rows)
+(67 rows)
 
 SELECT '' AS seven, f1 AS european_sql FROM ABSTIME_TBL;
  seven |      european_sql       
diff --git a/src/test/regress/expected/timestamp.out b/src/test/regress/expected/timestamp.out
index a092fc2..463484f 100644
--- a/src/test/regress/expected/timestamp.out
+++ b/src/test/regress/expected/timestamp.out
@@ -171,6 +171,7 @@ INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 1999');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2001');
+INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 12001');
 -- Currently unsupported syntax and ranges
 INSERT INTO TIMESTAMP_TBL VALUES ('Feb 16 17:32:01 -0097');
 ERROR:  time zone displacement out of range: "Feb 16 17:32:01 -0097"
@@ -248,7 +249,8 @@ SELECT '' AS "64", d1 FROM TIMESTAMP_TBL;
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(65 rows)
+    | Mon Jan 01 17:32:01 12001
+(66 rows)
 
 -- Demonstrate functions and operators
 SELECT '' AS "48", d1 FROM TIMESTAMP_TBL
@@ -304,7 +306,8 @@ SELECT '' AS "48", d1 FROM TIMESTAMP_TBL
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(49 rows)
+    | Mon Jan 01 17:32:01 12001
+(50 rows)
 
 SELECT '' AS "15", d1 FROM TIMESTAMP_TBL
    WHERE d1 < timestamp without time zone '1997-01-02';
@@ -402,7 +405,8 @@ SELECT '' AS "63", d1 FROM TIMESTAMP_TBL
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(64 rows)
+    | Mon Jan 01 17:32:01 12001
+(65 rows)
 
 SELECT '' AS "16", d1 FROM TIMESTAMP_TBL
    WHERE d1 <= timestamp without time zone '1997-01-02';
@@ -480,7 +484,8 @@ SELECT '' AS "49", d1 FROM TIMESTAMP_TBL
     | Sat Jan 01 17:32:01 2000
     | Sun Dec 31 17:32:01 2000
     | Mon Jan 01 17:32:01 2001
-(50 rows)
+    | Mon Jan 01 17:32:01 12001
+(51 rows)
 
 SELECT '' AS "54", d1 - timestamp without time zone '1997-01-02' AS diff
    FROM TIMESTAMP_TBL WHERE d1 BETWEEN '1902-01-01' AND '2038-01-01';
@@ -873,7 +878,8 @@ SELECT '' AS to_char_1, to_char(d1, 'DAY Day day DY Dy dy MONTH Month month RM M
            | SATURDAY  Saturday  saturday  SAT Sat sat JANUARY   January   january   I    JAN Jan jan
            | SUNDAY    Sunday    sunday    SUN Sun sun DECEMBER  December  december  XII  DEC Dec dec
            | MONDAY    Monday    monday    MON Mon mon JANUARY   January   january   I    JAN Jan jan
-(65 rows)
+           | MONDAY    Monday    monday    MON Mon mon JANUARY   January   january   I    JAN Jan jan
+(66 rows)
 
 SELECT '' AS to_char_2, to_char(d1, 'FMDAY FMDay FMday FMMONTH FMMonth FMmonth FMRM')
    FROM TIMESTAMP_TBL;
@@ -944,12 +950,13 @@ SELECT '' AS to_char_2, to_char(d1, 'FMDAY FMDay FMday FMMONTH FMMonth FMmonth F
            | SATURDAY Saturday saturday JANUARY January january I
            | SUNDAY Sunday sunday DECEMBER December december XII
            | MONDAY Monday monday JANUARY January january I
-(65 rows)
+           | MONDAY Monday monday JANUARY January january I
+(66 rows)
 
 SELECT '' AS to_char_3, to_char(d1, 'Y,YYY YYYY YYY YY Y CC Q MM WW DDD DD D J')
    FROM TIMESTAMP_TBL;
- to_char_3 |                     to_char                     
------------+-------------------------------------------------
+ to_char_3 |                      to_char                       
+-----------+----------------------------------------------------
            | 
            | 
            | 1,970 1970 970 70 0 20 1 01 01 001 01 5 2440588
@@ -1015,7 +1022,8 @@ SELECT '' AS to_char_3, to_char(d1, 'Y,YYY YYYY YYY YY Y CC Q MM WW DDD DD D J')
            | 2,000 2000 000 00 0 20 1 01 01 001 01 7 2451545
            | 2,000 2000 000 00 0 20 4 12 53 366 31 1 2451910
            | 2,001 2001 001 01 1 21 1 01 01 001 01 2 2451911
-(65 rows)
+           | 12,001 12001 001 01 1 121 1 01 01 001 01 2 6104336
+(66 rows)
 
 SELECT '' AS to_char_4, to_char(d1, 'FMY,YYY FMYYYY FMYYY FMYY FMY FMCC FMQ FMMM FMWW FMDDD FMDD FMD FMJ')
    FROM TIMESTAMP_TBL;
@@ -1086,7 +1094,8 @@ SELECT '' AS to_char_4, to_char(d1, 'FMY,YYY FMYYYY FMYYY FMYY FMY FMCC FMQ FMMM
            | 2,000 2000 0 0 0 20 1 1 1 1 1 7 2451545
            | 2,000 2000 0 0 0 20 4 12 53 366 31 1 2451910
            | 2,001 2001 1 1 1 21 1 1 1 1 1 2 2451911
-(65 rows)
+           | 12,001 12001 1 1 1 121 1 1 1 1 1 2 6104336
+(66 rows)
 
 SELECT '' AS to_char_5, to_char(d1, 'HH HH12 HH24 MI SS SSSS')
    FROM TIMESTAMP_TBL;
@@ -1157,7 +1166,8 @@ SELECT '' AS to_char_5, to_char(d1, 'HH HH12 HH24 MI SS SSSS')
            | 05 05 17 32 01 63121
            | 05 05 17 32 01 63121
            | 05 05 17 32 01 63121
-(65 rows)
+           | 05 05 17 32 01 63121
+(66 rows)
 
 SELECT '' AS to_char_6, to_char(d1, E'"HH:MI:SS is" HH:MI:SS "\\"text between quote marks\\""')
    FROM TIMESTAMP_TBL;
@@ -1228,7 +1238,8 @@ SELECT '' AS to_char_6, to_char(d1, E'"HH:MI:SS is" HH:MI:SS "\\"text between qu
            | HH:MI:SS is 05:32:01 "text between quote marks"
            | HH:MI:SS is 05:32:01 "text between quote marks"
            | HH:MI:SS is 05:32:01 "text between quote marks"
-(65 rows)
+           | HH:MI:SS is 05:32:01 "text between quote marks"
+(66 rows)
 
 SELECT '' AS to_char_7, to_char(d1, 'HH24--text--MI--text--SS')
    FROM TIMESTAMP_TBL;
@@ -1299,12 +1310,13 @@ SELECT '' AS to_char_7, to_char(d1, 'HH24--text--MI--text--SS')
            | 17--text--32--text--01
            | 17--text--32--text--01
            | 17--text--32--text--01
-(65 rows)
+           | 17--text--32--text--01
+(66 rows)
 
 SELECT '' AS to_char_8, to_char(d1, 'YYYYTH YYYYth Jth')
    FROM TIMESTAMP_TBL;
- to_char_8 |         to_char         
------------+-------------------------
+ to_char_8 |          to_char          
+-----------+---------------------------
            | 
            | 
            | 1970TH 1970th 2440588th
@@ -1370,12 +1382,13 @@ SELECT '' AS to_char_8, to_char(d1, 'YYYYTH YYYYth Jth')
            | 2000TH 2000th 2451545th
            | 2000TH 2000th 2451910th
            | 2001ST 2001st 2451911th
-(65 rows)
+           | 12001ST 12001st 6104336th
+(66 rows)
 
 SELECT '' AS to_char_9, to_char(d1, 'YYYY A.D. YYYY a.d. YYYY bc HH:MI:SS P.M. HH:MI:SS p.m. HH:MI:SS pm')
    FROM TIMESTAMP_TBL;
- to_char_9 |                               to_char                               
------------+---------------------------------------------------------------------
+ to_char_9 |                                to_char                                 
+-----------+------------------------------------------------------------------------
            | 
            | 
            | 1970 A.D. 1970 a.d. 1970 ad 12:00:00 A.M. 12:00:00 a.m. 12:00:00 am
@@ -1441,12 +1454,13 @@ SELECT '' AS to_char_9, to_char(d1, 'YYYY A.D. YYYY a.d. YYYY bc HH:MI:SS P.M. H
            | 2000 A.D. 2000 a.d. 2000 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
            | 2000 A.D. 2000 a.d. 2000 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
            | 2001 A.D. 2001 a.d. 2001 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
-(65 rows)
+           | 12001 A.D. 12001 a.d. 12001 ad 05:32:01 P.M. 05:32:01 p.m. 05:32:01 pm
+(66 rows)
 
 SELECT '' AS to_char_10, to_char(d1, 'IYYY IYY IY I IW IDDD ID')
    FROM TIMESTAMP_TBL;
- to_char_10 |        to_char         
-------------+------------------------
+ to_char_10 |         to_char         
+------------+-------------------------
             | 
             | 
             | 1970 970 70 0 01 004 4
@@ -1512,7 +1526,8 @@ SELECT '' AS to_char_10, to_char(d1, 'IYYY IYY IY I IW IDDD ID')
             | 1999 999 99 9 52 363 6
             | 2000 000 00 0 52 364 7
             | 2001 001 01 1 01 001 1
-(65 rows)
+            | 12001 001 01 1 01 001 1
+(66 rows)
 
 SELECT '' AS to_char_11, to_char(d1, 'FMIYYY FMIYY FMIY FMI FMIW FMIDDD FMID')
    FROM TIMESTAMP_TBL;
@@ -1583,7 +1598,8 @@ SELECT '' AS to_char_11, to_char(d1, 'FMIYYY FMIYY FMIY FMI FMIW FMIDDD FMID')
             | 1999 999 99 9 52 363 6
             | 2000 0 0 0 52 364 7
             | 2001 1 1 1 1 1 1
-(65 rows)
+            | 12001 1 1 1 1 1 1
+(66 rows)
 
 -- timestamp numeric fields constructor
 SELECT make_timestamp(2014,12,28,6,30,45.887);
diff --git a/src/test/regress/sql/timestamp.sql b/src/test/regress/sql/timestamp.sql
index b22cd48..b548c70 100644
--- a/src/test/regress/sql/timestamp.sql
+++ b/src/test/regress/sql/timestamp.sql
@@ -136,6 +136,7 @@ INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 1999');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Dec 31 17:32:01 2000');
 INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 2001');
+INSERT INTO TIMESTAMP_TBL VALUES ('Jan 01 17:32:01 12001');
 
 -- Currently unsupported syntax and ranges
 INSERT INTO TIMESTAMP_TBL VALUES ('Feb 16 17:32:01 -0097');
-- 
2.4.0.rc2.1.g3d6bc9a

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Andres Freund (#1)

Re: WIP: Make timestamptz_out less slow.

Andres Freund <andres@anarazel.de> writes:

So nearly all the time is spent somewhere inside the sprintf calls. Not
nice.

What happens if you force use of port/snprintf.c instead of glibc's
version?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: Tom Lane (#2)

Re: WIP: Make timestamptz_out less slow.

On 2015-07-27 17:31:41 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

So nearly all the time is spent somewhere inside the sprintf calls. Not
nice.

What happens if you force use of port/snprintf.c instead of glibc's
version?

Good question.

Even worse. 15900.014 ms.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Andres Freund (#3)

Re: WIP: Make timestamptz_out less slow.

Andres Freund <andres@anarazel.de> writes:

On 2015-07-27 17:31:41 -0400, Tom Lane wrote:

What happens if you force use of port/snprintf.c instead of glibc's
version?

Even worse. 15900.014 ms.

Interesting. So as a separate optimization problem, we might consider
"try to put snprintf.c at least on a par with glibc". I'm kind of
surprised by this result really, since snprintf.c lacks a lot of the
bells and whistles that are in glibc.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#1)

Re: WIP: Make timestamptz_out less slow.

On 28 July 2015 at 09:17, Andres Freund <andres@anarazel.de> wrote:

Hi,

I recently once more noticed that timestamptz_out is really, really
slow. To quantify that, I created a bunch of similar sized tables:

CREATE TABLE tbl_timestamp AS SELECT NOW() FROM generate_series(1, 100000)
a, generate_series(1, 100) b;
CREATE TABLE tbl_int8 AS SELECT 1::bigint FROM generate_series(1, 100000)
a, generate_series(1, 100) b;
CREATE TABLE tbl_bytea AS SELECT ' '::bytea FROM generate_series(1,
100000) a, generate_series(1, 100) b;

These all end up being 346MB large.

COPY tbl_bytea TO '/dev/null';
Time: 1173.484 ms
COPY tbl_int8 TO '/dev/null';
Time: 1030.756 ms
COPY tbl_timestamp TO '/dev/null';
Time: 6598.030

(all best of three)

Yes, timestamp outputs more data as a whole, but being 5 times as slow
is still pretty darn insane. To make sure that's not the cause, here's
another table:

CREATE TABLE tbl_timestamptext AS SELECT NOW()::text FROM
generate_series(1, 100000) a, generate_series(1, 100) b;
COPY tbl_timestamptext TO '/dev/null';
Time: 1449.554 ms

So it's really just the timestamp code.
Profiling it shows:
Overhead  Command         Shared Object     Symbol
-   38.33%  postgres_stock  libc-2.19.so      [.] vfprintf
- 97.92% vfprintf
_IO_vsprintf
- sprintf
+ 70.25% EncodeDateTime
+ 29.75% AppendSeconds.constprop.10
+ 1.11% _IO_default_xsputn
-    8.22%  postgres_stock  libc-2.19.so      [.] _IO_default_xsputn
- 99.43% _IO_default_xsputn
- 90.09% vfprintf
_IO_vsprintf
- sprintf
+ 74.15% EncodeDateTime
+ 25.85% AppendSeconds.constprop.10
+ 9.72% _IO_padn
+ 0.57% vfprintf
+   7.76%  postgres_stock  postgres_stock    [.] CopyOneRowTo
So nearly all the time is spent somewhere inside the sprintf calls. Not
nice.

The only thing I could come up to make the sprintfs cheaper was to
combine them into one and remove some of the width specifiers that
aren't needed. That doesn't buy us very much.

I then proceeded to replace the sprintf call with hand-rolled
conversions. And wow, the benefit is far bigger than I'd assumed:
postgres[7236][1]=# COPY tbl_timestamp TO '/dev/null';
Time: 2430.521 ms

So, by hand-rolling the ISO conversion in EncodeDateTime() we got a
~250% performance improvement. I'd say that's worthwhile.

The attached patch shows what I did. While there's some polishing
possible, as a whole, it's pretty ugly. But I think timestamp data is so
common that it's worth the effort.

Does anybody have a fundamentally nicer idea than the attached to
improvide this?

It won't be quite as fast as what you've written, but I think it will be
much neater and more likely to be used in other places if we invent a
function like pg_ltoa() which returns a pointer to the new end of string.

Also if we're specifying padding with zeros then we can skip the reverse
part that's in pg_ltoa(), (normally needed since the numeric string is
build in reverse)

The code could then be written as:

str = pg_int2str_pad(str, year, 4);
*str++ = '-';
str = pg_int2str_pad(str, tm->tm_mon, 2);
*str++ = '-';
str = pg_int2str_pad(str, tm->tm_mday, 2);

etc

I've used this method before and found it to be about 10 times faster than
snprintf(), but I was reversing the string, so quite likely it be more than
10 times.

I'm interested to see how much you're really gaining by manually unrolling
the part that builds the fractional part of the second.

We could just build that part with: (untested)

if (fsec != 0)
{
int fseca = abs(fsec);
while (fseca % 10 == 0 && fseca > 0)
fseca /= 10;
*str++ = '.';
str = pg_int2str(str, fseca);
}

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: David Rowley (#5)

Re: WIP: Make timestamptz_out less slow.

On 2015-07-28 10:59:15 +1200, David Rowley wrote:

It won't be quite as fast as what you've written, but I think it will be
much neater and more likely to be used in other places if we invent a
function like pg_ltoa() which returns a pointer to the new end of string.

Also if we're specifying padding with zeros then we can skip the reverse
part that's in pg_ltoa(), (normally needed since the numeric string is
build in reverse)

The code could then be written as:

str = pg_int2str_pad(str, year, 4);
*str++ = '-';
str = pg_int2str_pad(str, tm->tm_mon, 2);
*str++ = '-';
str = pg_int2str_pad(str, tm->tm_mday, 2);

etc

I've used this method before and found it to be about 10 times faster than
snprintf(), but I was reversing the string, so quite likely it be more than
10 times.

Yes, that might be worthwhile to try. Certainly would look less
ugly. Willing to give it a try?

I'm interested to see how much you're really gaining by manually unrolling
the part that builds the fractional part of the second.

It seems to help a fair amount, but this really was more a quick POC
than something serious. My theory why it helps is that each loop
iteration is independent of the previous one and thus can take full
advantage of the pipeline.

Regards,

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#6)

1 attachment(s)

Re: WIP: Make timestamptz_out less slow.

On 28 July 2015 at 19:10, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-28 10:59:15 +1200, David Rowley wrote:

It won't be quite as fast as what you've written, but I think it will be
much neater and more likely to be used in other places if we invent a
function like pg_ltoa() which returns a pointer to the new end of string.

Also if we're specifying padding with zeros then we can skip the reverse
part that's in pg_ltoa(), (normally needed since the numeric string is
build in reverse)

The code could then be written as:

str = pg_int2str_pad(str, year, 4);
*str++ = '-';
str = pg_int2str_pad(str, tm->tm_mon, 2);
*str++ = '-';
str = pg_int2str_pad(str, tm->tm_mday, 2);

etc

I've used this method before and found it to be about 10 times faster

than

snprintf(), but I was reversing the string, so quite likely it be more

than

10 times.

Yes, that might be worthwhile to try. Certainly would look less
ugly. Willing to give it a try?

I had a quick try at this and ended up just writing a small test program to
see what's faster.

Please excuse the mess of the file, I just hacked it together as quickly as
I could with the sole intention of just to get an idea of which is faster
and by how much.

For me the output is as follows:

timestamp_out() = 2015-07-29 02:24:33.34 in 3.506000
timestamp_out_old() = 2015-07-29 02:24:33.034 in 64.518000
timestamp_out_af() = 2015-07-29 02:24:33.034 in 2.981000

timestamp_out_old is master's version, the timestamp_out_af() is yours, and
timestamp_out() is my one. times are in seconds to perform 100 million
calls.

So it appears your version is a bit faster than mine, but we're both about
20 times faster than the current one.
Also mine needs fixed up as the fractional part is not padded the same as
yours, but I doubt that'll affect the performance by much.

My view: It's probably not worth going quite as far as you've gone for a
handful of nanoseconds per call, but perhaps something along the lines of
mine can be fixed up.

Have you thought about what to do when HAVE_INT64_TIMESTAMP is not defined?

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: David Rowley (#7)

Re: WIP: Make timestamptz_out less slow.

On 2015-07-29 03:10:41 +1200, David Rowley wrote:

timestamp_out() = 2015-07-29 02:24:33.34 in 3.506000
timestamp_out_old() = 2015-07-29 02:24:33.034 in 64.518000
timestamp_out_af() = 2015-07-29 02:24:33.034 in 2.981000

timestamp_out_old is master's version, the timestamp_out_af() is yours, and
timestamp_out() is my one. times are in seconds to perform 100 million
calls.

That looks good.

So it appears your version is a bit faster than mine, but we're both about
20 times faster than the current one.
Also mine needs fixed up as the fractional part is not padded the same as
yours, but I doubt that'll affect the performance by much.

Worthwhile to finish that bit and try ;)

My view: It's probably not worth going quite as far as you've gone for a
handful of nanoseconds per call, but perhaps something along the lines of
mine can be fixed up.

Yes, I agreee that your's is probably going to be fast enough.

Have you thought about what to do when HAVE_INT64_TIMESTAMP is not defined?

I don't think it's actually important. The only difference vs float
timestamps is that in the latter case we set fsecs to zero BC.

Unless we want to slow down the common case it seems not unlikely that
we're going to end up with a separate slow path anyway. E.g. neither
your version nor mine handles 5 digit years (which is why I fell back to
the slow path in that case in my patch).

Regards,

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#8)

Re: WIP: Make timestamptz_out less slow.

On 29 July 2015 at 03:25, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-29 03:10:41 +1200, David Rowley wrote:

Have you thought about what to do when HAVE_INT64_TIMESTAMP is not

defined?

I don't think it's actually important. The only difference vs float
timestamps is that in the latter case we set fsecs to zero BC.

I was also thinking that the % 10 won't work when fsec_t is double.

typedef double fsec_t

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#10

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: David Rowley (#9)

Re: WIP: Make timestamptz_out less slow.

On 2015-07-29 09:37:26 +1200, David Rowley wrote:

On 29 July 2015 at 03:25, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-29 03:10:41 +1200, David Rowley wrote:

Have you thought about what to do when HAVE_INT64_TIMESTAMP is not

defined?

I don't think it's actually important. The only difference vs float
timestamps is that in the latter case we set fsecs to zero BC.

I was also thinking that the % 10 won't work when fsec_t is double.

typedef double fsec_t

It seems quite possible to move that bit to timestamp2tm and do it
without a dependency on HAVE_INT64_TIMESTAMP. As long as it doesn't slow
down the int timestamp case I'm happy to simplify code in this area.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#8)

Re: WIP: Make timestamptz_out less slow.

On 29 July 2015 at 03:25, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-29 03:10:41 +1200, David Rowley wrote:

timestamp_out() = 2015-07-29 02:24:33.34 in 3.506000
timestamp_out_old() = 2015-07-29 02:24:33.034 in 64.518000
timestamp_out_af() = 2015-07-29 02:24:33.034 in 2.981000

timestamp_out_old is master's version, the timestamp_out_af() is yours,

and

timestamp_out() is my one. times are in seconds to perform 100 million
calls.

That looks good.

So it appears your version is a bit faster than mine, but we're both

about

20 times faster than the current one.
Also mine needs fixed up as the fractional part is not padded the same as
yours, but I doubt that'll affect the performance by much.

Worthwhile to finish that bit and try ;)

My view: It's probably not worth going quite as far as you've gone for a
handful of nanoseconds per call, but perhaps something along the lines of
mine can be fixed up.

Yes, I agreee that your's is probably going to be fast enough.

Have you thought about what to do when HAVE_INT64_TIMESTAMP is not

defined?

I don't think it's actually important. The only difference vs float
timestamps is that in the latter case we set fsecs to zero BC.

Unless we want to slow down the common case it seems not unlikely that
we're going to end up with a separate slow path anyway. E.g. neither
your version nor mine handles 5 digit years (which is why I fell back to
the slow path in that case in my patch).

It occurred to me that handling the 5 digit year is quite a simple change
to my code:

static char *
pg_uint2str_padding(char *str, unsigned int value, unsigned int padding)
{
char *start = str;
char *end = &str[padding];
unsigned int num = value;
//Assert(padding > 0);
*end = '\0';
while (padding--)
{
str[padding] = num % 10 + '0';
num /= 10;
}
/*
* if value was too big for the specified padding then rebuild the whole
* number again without padding. Conveniently pg_uint2str() does exactly
* this.
*/
if (num > 0)
return pg_uint2str(str, value);

return end;
}

We simply just need to check if there was any 'num' left after consuming
the given space. If there's any left then just use pg_uint2str().
This keeps things very fast for the likely most common case where the year
is 4 digits long.

I've not thought about negative years. The whole function should perhaps
take signed ints instead of unsigned.

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#12

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: David Rowley (#11)

1 attachment(s)

Re: WIP: Make timestamptz_out less slow.

On 5 August 2015 at 12:51, David Rowley <david.rowley@2ndquadrant.com>
wrote:

On 29 July 2015 at 03:25, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-29 03:10:41 +1200, David Rowley wrote:

timestamp_out() = 2015-07-29 02:24:33.34 in 3.506000
timestamp_out_old() = 2015-07-29 02:24:33.034 in 64.518000
timestamp_out_af() = 2015-07-29 02:24:33.034 in 2.981000

timestamp_out_old is master's version, the timestamp_out_af() is yours,

and

timestamp_out() is my one. times are in seconds to perform 100 million
calls.

That looks good.

So it appears your version is a bit faster than mine, but we're both

about

20 times faster than the current one.
Also mine needs fixed up as the fractional part is not padded the same

as

yours, but I doubt that'll affect the performance by much.

Worthwhile to finish that bit and try ;)

My view: It's probably not worth going quite as far as you've gone for a
handful of nanoseconds per call, but perhaps something along the lines

of

mine can be fixed up.

Yes, I agreee that your's is probably going to be fast enough.

Have you thought about what to do when HAVE_INT64_TIMESTAMP is not

defined?

I don't think it's actually important. The only difference vs float
timestamps is that in the latter case we set fsecs to zero BC.

Unless we want to slow down the common case it seems not unlikely that
we're going to end up with a separate slow path anyway. E.g. neither
your version nor mine handles 5 digit years (which is why I fell back to
the slow path in that case in my patch).

It occurred to me that handling the 5 digit year is quite a simple change
to my code:

We simply just need to check if there was any 'num' left after consuming
the given space. If there's any left then just use pg_uint2str().
This keeps things very fast for the likely most common case where the year
is 4 digits long.

I've not thought about negative years. The whole function should perhaps
take signed ints instead of unsigned.

I've made a few changes to this to get the fractional seconds part working
as it should.

It also now works fine with 5 digit years.

It's still in the form of the test program, but it should be simple enough
to pull out what's required from that and put into Postgres.

I've also changed my version of AppendSeconds() so that it returns a
pointer to the new end of string. This should be useful as I see some
strlen() calls to get the new end of string. It'll easy to remove those now
which will further increase performance.

timestamp_out() is the proposed new version
timestamp_out_old() is master's version
timestamp_out_af() is your version

Performance is as follows:

With Clang
david@ubuntu:~/C$ clang timestamp_out.c -o timestamp_out -O2
david@ubuntu:~/C$ ./timestamp_out
timestamp_out() = 2015-07-29 02:24:33.034 in 0.313686
timestamp_out_old() = 2015-07-29 02:24:33.034 in 5.048472
timestamp_out_af() = 2015-07-29 02:24:33.034 in 0.198147

With gcc
david@ubuntu:~/C$ gcc timestamp_out.c -o timestamp_out -O2
david@ubuntu:~/C$ ./timestamp_out
timestamp_out() = 2015-07-29 02:24:33.034 in 0.405795
timestamp_out_old() = 2015-07-29 02:24:33.034 in 4.678918
timestamp_out_af() = 2015-07-29 02:24:33.034 in 0.270557

Regards

David Rowley
--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#13

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#8)

1 attachment(s)

Re: WIP: Make timestamptz_out less slow.

On 29 July 2015 at 03:25, Andres Freund <andres@anarazel.de> wrote:

On 2015-07-29 03:10:41 +1200, David Rowley wrote:

timestamp_out() = 2015-07-29 02:24:33.34 in 3.506000
timestamp_out_old() = 2015-07-29 02:24:33.034 in 64.518000
timestamp_out_af() = 2015-07-29 02:24:33.034 in 2.981000

timestamp_out_old is master's version, the timestamp_out_af() is yours,

and

timestamp_out() is my one. times are in seconds to perform 100 million
calls.

That looks good.

So it appears your version is a bit faster than mine, but we're both

about

20 times faster than the current one.
Also mine needs fixed up as the fractional part is not padded the same as
yours, but I doubt that'll affect the performance by much.

Worthwhile to finish that bit and try ;)

My view: It's probably not worth going quite as far as you've gone for a
handful of nanoseconds per call, but perhaps something along the lines of
mine can be fixed up.

Yes, I agreee that your's is probably going to be fast enough.

I took a bit of weekend time to finish this one off. Patch attached.

A quick test shows a pretty good performance increase:

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;

Master:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 20444.468 ms

Patched:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 10947.097 ms

There's probably a bit more to squeeze out of this.
1. EncodeDateTime() could return the length of the string to allow callers
to use pnstrdup() instead of pstrdup(), which would save the strlen() call.
2. Have pg_int2str_zeropad() and pg_int2str() skip appending the NUL char
and leave this up to the calling function.
3. Make something to replace the sprintf(str, " %.*s", MAXTZLEN, tzn); call.

Perhaps 1 and 3 are worth looking into, but I think 2 is likely too error
prone for the small gain we'd get from it.

Also I was not too sure on if pg_int2str() was too similar to pg_ltoa().
Perhaps pg_ltoa() should just be modified to return the end of string?

Thoughts?

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

timestamp_out_speedup_2015-08-09.patchapplication/octet-stream; name=timestamp_out_speedup_2015-08-09.patchDownload

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 2a44b6e..d712a94 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -43,8 +43,12 @@ static int DecodeTime(char *str, int fmask, int range,
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 		   struct pg_tm * tm);
-static void TrimTrailingZeros(char *str);
-static void AppendSeconds(char *cp, int sec, fsec_t fsec,
+
+#ifndef HAVE_INT64_TIMESTAMP
+static char *TrimTrailingZeros(char *str);
+#endif /* HAVE_INT64_TIMESTAMP */
+
+static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 			  int precision, bool fillzeros);
 static void AdjustFractSeconds(double frac, struct pg_tm * tm, fsec_t *fsec,
 				   int scale);
@@ -394,15 +398,16 @@ GetCurrentTimeUsec(struct pg_tm * tm, fsec_t *fsec, int *tzp)
 		*tzp = tz;
 }
 
-
+#ifndef HAVE_INT64_TIMESTAMP
 /* TrimTrailingZeros()
  * ... resulting from printing numbers with full precision.
  *
  * Before Postgres 8.4, this always left at least 2 fractional digits,
  * but conversations on the lists suggest this isn't desired
  * since showing '0.10' is misleading with values of precision(1).
+ * Returns a pointer to the new end of string.
  */
-static void
+static char *
 TrimTrailingZeros(char *str)
 {
 	int			len = strlen(str);
@@ -412,43 +417,98 @@ TrimTrailingZeros(char *str)
 		len--;
 		*(str + len) = '\0';
 	}
+
+	return str + len;
 }
+#endif /* HAVE_INT64_TIMESTAMP */
 
 /*
- * Append sections and fractional seconds (if any) at *cp.
+ * Append seconds and fractional seconds (if any) at *cp.
  * precision is the max number of fraction digits, fillzeros says to
  * pad to two integral-seconds digits.
  * Note that any sign is stripped from the input seconds values.
+ * Note 'precision' must not be a negative number.
  */
-static void
+static char *
 AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 {
+#ifdef HAVE_INT64_TIMESTAMP
+
+	if (fillzeros)
+		cp = pg_int2str_zeropad(cp, abs(sec), 2);
+	else
+		cp = pg_int2str(cp, abs(sec));
+
+	if (fsec != 0)
+	{
+		int			value = (int) Abs(fsec);
+		char	   *end = &cp[precision + 1];
+		bool		gotnonzero = false;
+
+		*cp++ = '.';
+
+		/*
+		 * Append the fractional seconds part. Note that we don't want any
+		 * trailing zeros here, so since we're building the number in reverse
+		 * we'll skip appending any zeros, unless we've seen a non-zero.
+		 */
+		while (precision--)
+		{
+			int		remainder;
+			int		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+
+			/* check if we got a non-zero */
+			if (remainder)
+				gotnonzero = true;
+
+			if (gotnonzero)
+				cp[precision] = '0' + remainder;
+			else
+				end = &cp[precision];
+		}
+
+		/*
+		 * If we have a non-zero value then precision must have not been enough
+		 * to print the number, we'd better have another go. There won't be any
+		 * zero padding, so we can just use pg_int2str().
+		 */
+		if (value > 0)
+			return pg_int2str(cp, fsec);
+
+		*end = '\0';
+
+		return end;
+	}
+	else
+		return cp;
+#else
+
 	if (fsec == 0)
 	{
 		if (fillzeros)
-			sprintf(cp, "%02d", abs(sec));
+			return pg_int2str_zeropad(cp, abs(sec), 2);
 		else
-			sprintf(cp, "%d", abs(sec));
+			return pg_int2str(cp, abs(sec));
 	}
 	else
 	{
-#ifdef HAVE_INT64_TIMESTAMP
-		if (fillzeros)
-			sprintf(cp, "%02d.%0*d", abs(sec), precision, (int) Abs(fsec));
-		else
-			sprintf(cp, "%d.%0*d", abs(sec), precision, (int) Abs(fsec));
-#else
 		if (fillzeros)
 			sprintf(cp, "%0*.*f", precision + 3, precision, fabs(sec + fsec));
 		else
 			sprintf(cp, "%.*f", precision, fabs(sec + fsec));
-#endif
-		TrimTrailingZeros(cp);
+
+		return TrimTrailingZeros(cp);
 	}
+
+#endif /* HAVE_INT64_TIMESTAMP */
 }
 
+
 /* Variant of above that's specialized to timestamp case */
-static void
+static char *
 AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 {
 	/*
@@ -459,7 +519,7 @@ AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 	if (tm->tm_year <= 0)
 		fsec = 0;
 #endif
-	AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
+	return AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
 }
 
 /*
@@ -3831,9 +3891,11 @@ datebsearch(const char *key, const datetkn *base, int nel)
 }
 
 /* EncodeTimezone()
- *		Append representation of a numeric timezone offset to str.
+ *		Copies representation of a numeric timezone offset into str.
+ *		Returns a pointer pointing to the NUL character at the end of the
+ *		string.
  */
-static void
+static char *
 EncodeTimezone(char *str, int tz, int style)
 {
 	int			hour,
@@ -3846,16 +3908,26 @@ EncodeTimezone(char *str, int tz, int style)
 	hour = min / MINS_PER_HOUR;
 	min -= hour * MINS_PER_HOUR;
 
-	str += strlen(str);
 	/* TZ is negated compared to sign we wish to display ... */
 	*str++ = (tz <= 0 ? '+' : '-');
 
 	if (sec != 0)
-		sprintf(str, "%02d:%02d:%02d", hour, min, sec);
+	{
+		str = pg_int2str_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_int2str_zeropad(str, min, 2);
+		*str++ = ':';
+		str = pg_int2str_zeropad(str, sec, 2);
+	}
 	else if (min != 0 || style == USE_XSD_DATES)
-		sprintf(str, "%02d:%02d", hour, min);
+	{
+		str = pg_int2str_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_int2str_zeropad(str, min, 2);
+	}
 	else
-		sprintf(str, "%02d", hour);
+		str = pg_int2str_zeropad(str, hour, 2);
+	return str;
 }
 
 /* EncodeDateOnly()
@@ -3871,46 +3943,74 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* compatible with ISO date formats */
-			if (tm->tm_year > 0)
-				sprintf(str, "%04d-%02d-%02d",
-						tm->tm_year, tm->tm_mon, tm->tm_mday);
-			else
-				sprintf(str, "%04d-%02d-%02d %s",
-						-(tm->tm_year - 1), tm->tm_mon, tm->tm_mday, "BC");
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+
+			if (tm->tm_year <= 0)
+				memcpy(str, " BC", 4);
 			break;
 
 		case USE_SQL_DATES:
 			/* compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "/%04d", tm->tm_year);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "/%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			}
+
+			*str++ = '/';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+				memcpy(str, " BC", 4);
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German-style date format */
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, ".%04d", tm->tm_year);
-			else
-				sprintf(str + 5, ".%04d %s", -(tm->tm_year - 1), "BC");
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+				memcpy(str, " BC", 4);
 			break;
 
 		case USE_POSTGRES_DATES:
 		default:
 			/* traditional date-only style for Postgres */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d-%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d-%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "-%04d", tm->tm_year);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+				*str++ = '-';
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "-%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+				*str++ = '-';
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			}
+			*str++ = '-';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+				memcpy(str, " BC", 4);
 			break;
 	}
 }
@@ -3927,10 +4027,11 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 void
 EncodeTimeOnly(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, int style, char *str)
 {
-	sprintf(str, "%02d:%02d:", tm->tm_hour, tm->tm_min);
-	str += strlen(str);
-
-	AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
+	str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+	*str++ = ':';
+	str = pg_int2str_zeropad(str, tm->tm_min, 2);
+	*str++ = ':';
+	str = AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
 
 	if (print_tz)
 		EncodeTimezone(str, tz, style);
@@ -3971,38 +4072,52 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
-
-			if (style == USE_ISO_DATES)
-				sprintf(str, "%04d-%02d-%02d %02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-			else
-				sprintf(str, "%04d-%02d-%02dT%02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			*str++ = (style == USE_ISO_DATES) ? ' ' : 'T';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
-				EncodeTimezone(str, tz, style);
+				str = EncodeTimezone(str, tz, style);
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+				memcpy(str, " BC", 4);
 			break;
 
 		case USE_SQL_DATES:
 			/* Compatible with Oracle/Ingres date formats */
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-
-			sprintf(str + 5, "/%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			}
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			*str++ = '/';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			/*
 			 * Note: the uses of %.*s in this function would be risky if the
@@ -4013,36 +4128,47 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+				memcpy(str, " BC", 4);
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German variant on European style */
 
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-
-			sprintf(str + 5, ".%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+				memcpy(str, " BC", 4);
 			break;
 
 		case USE_POSTGRES_DATES:
@@ -4053,24 +4179,33 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			tm->tm_wday = j2day(day);
 
 			memcpy(str, days[tm->tm_wday], 3);
-			strcpy(str + 3, " ");
+			str += 3;
+			*str++ = ' ';
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str + 4, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
+				sprintf(str, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
 			else
-				sprintf(str + 4, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
-
-			sprintf(str + 10, " %02d:%02d:", tm->tm_hour, tm->tm_min);
+				sprintf(str, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
+			str += 6;
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
-			sprintf(str + strlen(str), " %04d",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1));
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
 				{
 					/*
@@ -4079,13 +4214,13 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 					 * avoid formatting something which would be rejected by
 					 * the date/time parser later. - thomas 2001-10-19
 					 */
-					sprintf(str + strlen(str), " ");
-					EncodeTimezone(str, tz, style);
+					*str++ = ' ';
+					str = EncodeTimezone(str, tz, style);
 				}
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+				memcpy(str, " BC", 4);
 			break;
 	}
 }
@@ -4284,8 +4419,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 			{
 				if (sec < 0 || fsec < 0)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				*cp++ = 'S';
 				*cp++ = '\0';
 			}
@@ -4337,8 +4471,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 				}
 				else if (is_before)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				sprintf(cp, " sec%s",
 						(abs(sec) != 1 || fsec != 0) ? "s" : "");
 				is_zero = FALSE;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 1dadbd5..cb51b8f 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -227,3 +227,147 @@ pg_lltoa(int64 value, char *a)
 		*a-- = swap;
 	}
 }
+
+
+/*
+ * pg_int2str_zeropad
+ *		Converts 'value' into a decimal string representation of the number.
+ *		'padding' specifies the minimum width of the number. Any extra space
+ *		is filled up by prefixing the number with zeros. The return value is a
+ *		pointer to the NUL terminated end of the string.
+ *
+ * Note: Callers should ensure that 'padding' is above zero.
+ * Note: This function is optimized for the case where the number is not too
+ *		 big to fit inside of the specified padding.
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+		 result (at least 12 bytes, counting a leading sign and trailing NUL,
+		 or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_int2str_zeropad(char *str, int32 value, int32 padding)
+{
+	char	   *start = str;
+	char	   *end = &str[padding];
+	int 		num = value;
+
+	Assert(padding > 0);
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (num < 0)
+	{
+		*start++ = '-';
+		padding--;
+
+		/*
+		 * Build the number starting at the end. Here remainder will be a
+		 * negative number, we must reverse this sign on this before adding
+		 * '0' in order to get the correct ASCII digit
+		 */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + -remainder;
+		}
+	}
+	else
+	{
+		/* build the number starting at the end */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + remainder;
+		}
+	}
+
+	/*
+	 * If padding was not high enough to fit this number then num won't have
+	 * been divided down to zero. We'd better have another go, this time we
+	 * know there won't be any zero padding required so we can just enlist the
+	 * help of pg_int2str()
+	 */
+	if (num != 0)
+		return pg_int2str(str, value);
+
+	*end = '\0';
+	return end;
+}
+
+/*
+ * pg_int2str
+ *		Converts 'value' into a decimal string representation of the number.
+ *
+ * Caller must ensure that 'str' points to enough memory to hold the result
+ * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Return value is a pointer to the new NUL terminated end of string.
+ */
+char *
+pg_int2str(char *str, int32 value)
+{
+	char *start;
+	char *end;
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (value < 0)
+	{
+		*str++ = '-';
+
+		/* mark the position we must reverse the string from. */
+		start = str;
+
+		/* Compute the result string backwards. */
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + -remainder;
+		} while (value != 0);
+	}
+	else
+	{
+		/* mark the position we must reverse the string from. */
+		start = str;
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + remainder;
+		} while (value != 0);
+	}
+
+	/* Add trailing NUL byte, and back up 'str' to the last character. */
+	end = str;
+	*str-- = '\0';
+
+	/* Reverse string. */
+	while (start < str)
+	{
+		char		swap = *start;
+		*start++ = *str;
+		*str-- = swap;
+	}
+	return end;
+}
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index fc1679e..a8644b7 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -289,6 +289,8 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
+extern char *pg_int2str_zeropad(char *str, int32 value, int32 padding);
+extern char *pg_int2str(char *str, int32 value);
 
 /*
  *		Per-opclass comparison functions for new btrees.  These are
diff --git a/src/tools/msvc/build.pl b/src/tools/msvc/build.pl
index e107d41..1e2dd62 100644
--- a/src/tools/msvc/build.pl
+++ b/src/tools/msvc/build.pl
@@ -61,7 +61,7 @@ elsif ($buildwhat)
 }
 else
 {
-	system("msbuild pgsql.sln /verbosity:detailed /p:Configuration=$bconf");
+	system("msbuild pgsql.sln /verbosity:quiet /p:Configuration=$bconf");
 }
 
 # report status

#14

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: David Rowley (#13)

Re: WIP: Make timestamptz_out less slow.

On 2015-08-09 12:47:53 +1200, David Rowley wrote:

I took a bit of weekend time to finish this one off. Patch attached.

A quick test shows a pretty good performance increase:

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;

Master:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 20444.468 ms

Patched:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 10947.097 ms

Yes, that's pretty cool.

There's probably a bit more to squeeze out of this.
1. EncodeDateTime() could return the length of the string to allow callers
to use pnstrdup() instead of pstrdup(), which would save the strlen() call.
2. Have pg_int2str_zeropad() and pg_int2str() skip appending the NUL char
and leave this up to the calling function.
3. Make something to replace the sprintf(str, " %.*s", MAXTZLEN, tzn); call.

Perhaps 1 and 3 are worth looking into, but I think 2 is likely too error
prone for the small gain we'd get from it.

I'm inclined to first get the majority of the optimization - as in
somethign similar to the current patch. If we then feel a need to
optimize further we can do that. Otherwise we might end up not getting
the 95% performance improvement in 9.6 because we're playing with the
remaining 5 ;)

Also I was not too sure on if pg_int2str() was too similar to pg_ltoa().
Perhaps pg_ltoa() should just be modified to return the end of string?

I don't think the benefits are worth breaking pg_ltoa interface.

/*
- * Append sections and fractional seconds (if any) at *cp.
+ * Append seconds and fractional seconds (if any) at *cp.
* precision is the max number of fraction digits, fillzeros says to
* pad to two integral-seconds digits.
* Note that any sign is stripped from the input seconds values.
+ * Note 'precision' must not be a negative number.
*/
-static void
+static char *
AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
{
+#ifdef HAVE_INT64_TIMESTAMP

Wouldn't it be better to just normalize fsec to an integer in that case?
Afaics that's the only remaining reason for the alternative path?

+/*
+ * pg_int2str
+ *		Converts 'value' into a decimal string representation of the number.
+ *
+ * Caller must ensure that 'str' points to enough memory to hold the result
+ * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Return value is a pointer to the new NUL terminated end of string.
+ */
+char *
+pg_int2str(char *str, int32 value)
+{
+	char *start;
+	char *end;
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (value < 0)
+	{

We could alternatively handle this by special-casing INT_MIN, probably
resulting in a bit less duplicated code.

/*
*		Per-opclass comparison functions for new btrees.  These are
diff --git a/src/tools/msvc/build.pl b/src/tools/msvc/build.pl
index e107d41..1e2dd62 100644
--- a/src/tools/msvc/build.pl
+++ b/src/tools/msvc/build.pl
@@ -61,7 +61,7 @@ elsif ($buildwhat)
}
else
{
-	system("msbuild pgsql.sln /verbosity:detailed /p:Configuration=$bconf");
+	system("msbuild pgsql.sln /verbosity:quiet /p:Configuration=$bconf");
}

Uh? I assume that's an acciental change?

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#14)

2 attachment(s)

Re: WIP: Make timestamptz_out less slow.

On 3 September 2015 at 05:10, Andres Freund <andres@anarazel.de> wrote:

On 2015-08-09 12:47:53 +1200, David Rowley wrote:

I took a bit of weekend time to finish this one off. Patch attached.

A quick test shows a pretty good performance increase:

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;

Master:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 20444.468 ms

Patched:
david=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 10947.097 ms

Yes, that's pretty cool.

There's probably a bit more to squeeze out of this.
1. EncodeDateTime() could return the length of the string to allow

callers

to use pnstrdup() instead of pstrdup(), which would save the strlen()

call.

2. Have pg_int2str_zeropad() and pg_int2str() skip appending the NUL char
and leave this up to the calling function.
3. Make something to replace the sprintf(str, " %.*s", MAXTZLEN, tzn);

call.

Perhaps 1 and 3 are worth looking into, but I think 2 is likely too error
prone for the small gain we'd get from it.

I see the logic there, but a few weekends ago I modified the patch to
implement number 2. This is perhaps one optimization that would be hard to
change later since, if we suddenly change pg_int2str() to not append the
NUL, then it might break any 3rd party code that's started using them. The
more I think about it the more I think allowing those to skip appending the
NUL is ok. They're very useful functions for incrementally building strings.

Attached is a patch to implement this, which performs as follows
postgres=# copy ts to 'l:/ts.sql';
COPY 31536001
Time: 10665.297 ms

However, this version of the patch does change many existing functions in
datetime.c so that they no longer append the NUL char... The good news is
that these are all static functions, so nothing else should be using them.

I'm happy to leave 1 and 3 for another rainy weekend one day.

Also I was not too sure on if pg_int2str() was too similar to pg_ltoa().
Perhaps pg_ltoa() should just be modified to return the end of string?

I don't think the benefits are worth breaking pg_ltoa interface.

Ok let's leave pg_ltoa() alone

/*
- * Append sections and fractional seconds (if any) at *cp.
+ * Append seconds and fractional seconds (if any) at *cp.
* precision is the max number of fraction digits, fillzeros says to
* pad to two integral-seconds digits.
* Note that any sign is stripped from the input seconds values.
+ * Note 'precision' must not be a negative number.
*/
-static void
+static char *
AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool
fillzeros)

{
+#ifdef HAVE_INT64_TIMESTAMP

Wouldn't it be better to just normalize fsec to an integer in that case?
Afaics that's the only remaining reason for the alternative path?

A special case would need to exist somewhere, unless we just modified

timestamp2tm() to multiply fsec by 1 million when HAVE_INT64_TIMESTAMP is
not defined, but that changes the function signature.

+/*
+ * pg_int2str
+ *           Converts 'value' into a decimal string representation of
the number.
+ *
+ * Caller must ensure that 'str' points to enough memory to hold the
result
+ * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Return value is a pointer to the new NUL terminated end of string.
+ */
+char *
+pg_int2str(char *str, int32 value)
+{
+     char *start;
+     char *end;
+
+     /*
+      * Handle negative numbers in a special way. We can't just append
a '-'

+ * prefix and reverse the sign as on two's complement machines

negative

+ * numbers can be 1 further from 0 than positive numbers, we do it

this way
+      * so we properly handle the smallest possible value.
+      */
+     if (value < 0)
+     {
We could alternatively handle this by special-casing INT_MIN, probably
resulting in a bit less duplicated code.

Those special cases seem to do some weird stuff to the performance
characteristics of those functions. I find my method to handle negative
numbers to produce much faster code.

I experimented with finding the fastest way to convert an int to a string
at the weekend, I did happen to be testing with int64's but likely int32
will be the same.

This is the time taken to convert 30 into "30" 2 billion times.

The closest implementation I'm using in the patch is pg_int64tostr() one.
The pg_int64tostr_memcpy() only performs better with bigger numbers /
longer strings.

david@ubuntu:~/C$ gcc5.2 tostring.c -o tostring -O2
david@ubuntu:~/C$ ./tostring
pg_int64tostr_memcpy in 13.653252 seconds
pg_int64tostr in 2.042616 seconds
pg_int64tostr_new in 2.056688 seconds
pg_lltoa in 13.604653 seconds

david@ubuntu:~/C$ clang tostring.c -o tostring_clang -O2
david@ubuntu:~/C$ ./tostring_clang
pg_int64tostr_memcpy in 0.000004 seconds
pg_int64tostr in 2.063335 seconds
pg_int64tostr_new in 2.102068 seconds
pg_lltoa in 3.424630 seconds

clang obviously saw right through my pg_int64tostr_memcpy() and optimized
something I'd not intended it to.
Notice how badly pg_lltoa() performs in comparison, especially with gcc.

I'd actually be inclined to consider changing pg_lltoa() to remove the
special case and implement it the same as I've proposed we do in
pg_int2str(). That's not for this patch though, and would likely need
tested with some other compilers.

/*
*           Per-opclass comparison functions for new btrees.  These are
diff --git a/src/tools/msvc/build.pl b/src/tools/msvc/build.pl
index e107d41..1e2dd62 100644
--- a/src/tools/msvc/build.pl
+++ b/src/tools/msvc/build.pl
@@ -61,7 +61,7 @@ elsif ($buildwhat)
}
else
{
-     system("msbuild pgsql.sln /verbosity:detailed
/p:Configuration=$bconf");

+ system("msbuild pgsql.sln /verbosity:quiet

/p:Configuration=$bconf");

}

Uh? I assume that's an acciental change?

Oops, yes.

Attachments:

tostring.ctext/x-csrc; charset=US-ASCII; name=tostring.cDownload

timestamp_out_speedup_2015-09-03_f79f03b.patchapplication/octet-stream; name=timestamp_out_speedup_2015-09-03_f79f03b.patchDownload

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 2a44b6e..99ba6cd 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -43,8 +43,12 @@ static int DecodeTime(char *str, int fmask, int range,
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 		   struct pg_tm * tm);
-static void TrimTrailingZeros(char *str);
-static void AppendSeconds(char *cp, int sec, fsec_t fsec,
+
+#ifndef HAVE_INT64_TIMESTAMP
+static char *TrimTrailingZeros(char *str);
+#endif /* HAVE_INT64_TIMESTAMP */
+
+static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 			  int precision, bool fillzeros);
 static void AdjustFractSeconds(double frac, struct pg_tm * tm, fsec_t *fsec,
 				   int scale);
@@ -394,15 +398,16 @@ GetCurrentTimeUsec(struct pg_tm * tm, fsec_t *fsec, int *tzp)
 		*tzp = tz;
 }
 
-
+#ifndef HAVE_INT64_TIMESTAMP
 /* TrimTrailingZeros()
  * ... resulting from printing numbers with full precision.
  *
  * Before Postgres 8.4, this always left at least 2 fractional digits,
  * but conversations on the lists suggest this isn't desired
  * since showing '0.10' is misleading with values of precision(1).
+ * Returns a pointer to the new end of string.
  */
-static void
+static char *
 TrimTrailingZeros(char *str)
 {
 	int			len = strlen(str);
@@ -412,43 +417,100 @@ TrimTrailingZeros(char *str)
 		len--;
 		*(str + len) = '\0';
 	}
+
+	return str + len;
 }
+#endif /* HAVE_INT64_TIMESTAMP */
 
 /*
- * Append sections and fractional seconds (if any) at *cp.
+ * Append seconds and fractional seconds (if any) at *cp.
  * precision is the max number of fraction digits, fillzeros says to
  * pad to two integral-seconds digits.
  * Note that any sign is stripped from the input seconds values.
+ * Note 'precision' must not be a negative number.
+ * Note callers should assume cp will not be NUL terminated.
  */
-static void
+static char *
 AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 {
+#ifdef HAVE_INT64_TIMESTAMP
+
+	if (fillzeros)
+		cp = pg_int2str_zeropad(cp, abs(sec), 2);
+	else
+		cp = pg_int2str(cp, abs(sec));
+
+	if (fsec != 0)
+	{
+		int			value = (int) Abs(fsec);
+		char	   *end = &cp[precision + 1];
+		bool		gotnonzero = false;
+
+		*cp++ = '.';
+
+		/*
+		 * Append the fractional seconds part. Note that we don't want any
+		 * trailing zeros here, so since we're building the number in reverse
+		 * we'll skip appending any zeros, unless we've seen a non-zero.
+		 */
+		while (precision--)
+		{
+			int		remainder;
+			int		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+
+			/* check if we got a non-zero */
+			if (remainder)
+				gotnonzero = true;
+
+			if (gotnonzero)
+				cp[precision] = '0' + remainder;
+			else
+				end = &cp[precision];
+		}
+
+		/*
+		 * If we have a non-zero value then precision must have not been enough
+		 * to print the number, we'd better have another go. There won't be any
+		 * zero padding, so we can just use pg_int2str().
+		 */
+		if (value > 0)
+			return pg_int2str(cp, fsec);
+
+		return end;
+	}
+	else
+		return cp;
+#else
+
 	if (fsec == 0)
 	{
 		if (fillzeros)
-			sprintf(cp, "%02d", abs(sec));
+			return pg_int2str_zeropad(cp, abs(sec), 2);
 		else
-			sprintf(cp, "%d", abs(sec));
+			return pg_int2str(cp, abs(sec));
 	}
 	else
 	{
-#ifdef HAVE_INT64_TIMESTAMP
-		if (fillzeros)
-			sprintf(cp, "%02d.%0*d", abs(sec), precision, (int) Abs(fsec));
-		else
-			sprintf(cp, "%d.%0*d", abs(sec), precision, (int) Abs(fsec));
-#else
 		if (fillzeros)
 			sprintf(cp, "%0*.*f", precision + 3, precision, fabs(sec + fsec));
 		else
 			sprintf(cp, "%.*f", precision, fabs(sec + fsec));
-#endif
-		TrimTrailingZeros(cp);
+
+		return TrimTrailingZeros(cp);
 	}
+
+#endif /* HAVE_INT64_TIMESTAMP */
 }
 
-/* Variant of above that's specialized to timestamp case */
-static void
+
+/*
+ * Variant of above that's specialized to timestamp case
+ * Note callers should assume cp will not be NUL terminated.
+ */
+static char *
 AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 {
 	/*
@@ -459,7 +521,7 @@ AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 	if (tm->tm_year <= 0)
 		fsec = 0;
 #endif
-	AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
+	return AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
 }
 
 /*
@@ -3831,9 +3893,13 @@ datebsearch(const char *key, const datetkn *base, int nel)
 }
 
 /* EncodeTimezone()
- *		Append representation of a numeric timezone offset to str.
+ *		Copies representation of a numeric timezone offset into str.
+ *		Returns a pointer pointing to where the NUL character should be set.
+ *
+ * Note str is not NUL terminated, callers are responsible for NUL terminating
+ * str themselves.
  */
-static void
+static char *
 EncodeTimezone(char *str, int tz, int style)
 {
 	int			hour,
@@ -3846,16 +3912,26 @@ EncodeTimezone(char *str, int tz, int style)
 	hour = min / MINS_PER_HOUR;
 	min -= hour * MINS_PER_HOUR;
 
-	str += strlen(str);
 	/* TZ is negated compared to sign we wish to display ... */
 	*str++ = (tz <= 0 ? '+' : '-');
 
 	if (sec != 0)
-		sprintf(str, "%02d:%02d:%02d", hour, min, sec);
+	{
+		str = pg_int2str_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_int2str_zeropad(str, min, 2);
+		*str++ = ':';
+		str = pg_int2str_zeropad(str, sec, 2);
+	}
 	else if (min != 0 || style == USE_XSD_DATES)
-		sprintf(str, "%02d:%02d", hour, min);
+	{
+		str = pg_int2str_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_int2str_zeropad(str, min, 2);
+	}
 	else
-		sprintf(str, "%02d", hour);
+		str = pg_int2str_zeropad(str, hour, 2);
+	return str;
 }
 
 /* EncodeDateOnly()
@@ -3871,46 +3947,90 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* compatible with ISO date formats */
-			if (tm->tm_year > 0)
-				sprintf(str, "%04d-%02d-%02d",
-						tm->tm_year, tm->tm_mon, tm->tm_mday);
-			else
-				sprintf(str, "%04d-%02d-%02d %s",
-						-(tm->tm_year - 1), tm->tm_mon, tm->tm_mday, "BC");
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_SQL_DATES:
 			/* compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "/%04d", tm->tm_year);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "/%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			}
+
+			*str++ = '/';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German-style date format */
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, ".%04d", tm->tm_year);
-			else
-				sprintf(str + 5, ".%04d %s", -(tm->tm_year - 1), "BC");
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_POSTGRES_DATES:
 		default:
 			/* traditional date-only style for Postgres */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d-%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d-%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "-%04d", tm->tm_year);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+				*str++ = '-';
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "-%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+				*str++ = '-';
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			}
+			*str++ = '-';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 	}
 }
@@ -3927,13 +4047,15 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 void
 EncodeTimeOnly(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, int style, char *str)
 {
-	sprintf(str, "%02d:%02d:", tm->tm_hour, tm->tm_min);
-	str += strlen(str);
-
-	AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
+	str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+	*str++ = ':';
+	str = pg_int2str_zeropad(str, tm->tm_min, 2);
+	*str++ = ':';
+	str = AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
 
 	if (print_tz)
-		EncodeTimezone(str, tz, style);
+		str = EncodeTimezone(str, tz, style);
+	*str = '\0';
 }
 
 
@@ -3971,38 +4093,56 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
-
-			if (style == USE_ISO_DATES)
-				sprintf(str, "%04d-%02d-%02d %02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-			else
-				sprintf(str, "%04d-%02d-%02dT%02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			*str++ = (style == USE_ISO_DATES) ? ' ' : 'T';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
-				EncodeTimezone(str, tz, style);
+				str = EncodeTimezone(str, tz, style);
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_SQL_DATES:
 			/* Compatible with Oracle/Ingres date formats */
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-
-			sprintf(str + 5, "/%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
+			{
+				str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			}
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			*str++ = '/';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			/*
 			 * Note: the uses of %.*s in this function would be risky if the
@@ -4013,36 +4153,55 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German variant on European style */
 
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-
-			sprintf(str + 5, ".%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_int2str_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_POSTGRES_DATES:
@@ -4053,24 +4212,33 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			tm->tm_wday = j2day(day);
 
 			memcpy(str, days[tm->tm_wday], 3);
-			strcpy(str + 3, " ");
+			str += 3;
+			*str++ = ' ';
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str + 4, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
+				sprintf(str, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
 			else
-				sprintf(str + 4, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
+				sprintf(str, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
+			str += 6;
 
-			sprintf(str + 10, " %02d:%02d:", tm->tm_hour, tm->tm_min);
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_int2str_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
-
-			sprintf(str + strlen(str), " %04d",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1));
+			*str++ = ' ';
+			str = pg_int2str_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
 				{
 					/*
@@ -4079,13 +4247,17 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 					 * avoid formatting something which would be rejected by
 					 * the date/time parser later. - thomas 2001-10-19
 					 */
-					sprintf(str + strlen(str), " ");
-					EncodeTimezone(str, tz, style);
+					*str++ = ' ';
+					str = EncodeTimezone(str, tz, style);
 				}
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 	}
 }
@@ -4242,7 +4414,8 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 							day_sign, abs(mday),
 							sec_sign, abs(hour), abs(min));
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 				else if (has_year_month)
 				{
@@ -4252,13 +4425,15 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 				{
 					sprintf(cp, "%d %d:%02d:", mday, hour, min);
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 				else
 				{
 					sprintf(cp, "%d:%02d:", hour, min);
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 			}
 			break;
@@ -4284,8 +4459,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 			{
 				if (sec < 0 || fsec < 0)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				*cp++ = 'S';
 				*cp++ = '\0';
 			}
@@ -4311,7 +4485,8 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 						(minus ? "-" : (is_before ? "+" : "")),
 						abs(hour), abs(min));
 				cp += strlen(cp);
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+				*cp = '\0';
 			}
 			break;
 
@@ -4337,8 +4512,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 				}
 				else if (is_before)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				sprintf(cp, " sec%s",
 						(abs(sec) != 1 || fsec != 0) ? "s" : "");
 				is_zero = FALSE;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 1dadbd5..25623d5 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -227,3 +227,168 @@ pg_lltoa(int64 value, char *a)
 		*a-- = swap;
 	}
 }
+
+
+/*
+ * pg_int2str_zeropad
+ *		Converts 'value' into a decimal string representation of the number.
+ *		'padding' specifies the minimum width of the number. Any extra space
+ *		is filled up by prefixing the number with zeros. The return value is a
+ *		pointer to, would be end of string. Note that the NUL termination char
+ *		is not written.
+ *
+ * The intended use pattern for this function is to build strings which contain
+ * multiple individual numbers, such as:
+ *
+ *	str = pg_int2str_zeropad(str, hours, 2);
+ *	*str++ = ':';
+ *	str = pg_int2str_zeropad(str, mins, 2);
+ *	*str++ = ':';
+ *	str = pg_int2str_zeropad(str, secs, 2);
+ *	*str = '\0';
+ *
+ * Note: Callers should ensure that 'padding' is above zero.
+ * Note: This function is optimized for the case where the number is not too
+ *		 big to fit inside of the specified padding.
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+		 result (at least 12 bytes, counting a leading sign and trailing NUL,
+		 or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_int2str_zeropad(char *str, int32 value, int32 padding)
+{
+	char	   *start = str;
+	char	   *end = &str[padding];
+	int32		num = value;
+
+	Assert(padding > 0);
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (num < 0)
+	{
+		*start++ = '-';
+		padding--;
+
+		/*
+		 * Build the number starting at the end. Here remainder will be a
+		 * negative number, we must reverse this sign on this before adding
+		 * '0' in order to get the correct ASCII digit
+		 */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + -remainder;
+		}
+	}
+	else
+	{
+		/* build the number starting at the end */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + remainder;
+		}
+	}
+
+	/*
+	 * If padding was not high enough to fit this number then num won't have
+	 * been divided down to zero. We'd better have another go, this time we
+	 * know there won't be any zero padding required so we can just enlist the
+	 * help of pg_int2str()
+	 */
+	if (num != 0)
+		return pg_int2str(str, value);
+
+	return end; /* Not NUL terminated */
+}
+
+/*
+ * pg_int2str
+ *		Converts 'value' into a decimal string representation of the number.
+ *
+ * Caller must ensure that 'str' points to enough memory to hold the result
+ * (at least 12 bytes, counting a leading sign and trailing NUL). The return
+ * value is a pointer to, would be end of string. Note that the NUL termination
+ * char is not written.
+ *
+ * The intended use pattern for this function is to build strings which contain
+ * multiple individual numbers, such as:
+ *
+ *	str = pg_int2str(str, a);
+ *	*str++ = ' ';
+ *	str = pg_int2str(str, b);
+ *	*str = '\0';
+ *
+ * Note: Caller must ensure that 'a' points to enough memory to hold the result
+ * (at least 12 bytes, counting a leading sign and trailing NUL).
+ */
+char *
+pg_int2str(char *str, int32 value)
+{
+	char *start;
+	char *end;
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (value < 0)
+	{
+		*str++ = '-';
+
+		/* mark the position we must reverse the string from. */
+		start = str;
+
+		/* Compute the result string backwards. */
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + -remainder;
+		} while (value != 0);
+	}
+	else
+	{
+		/* mark the position we must reverse the string from. */
+		start = str;
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + remainder;
+		} while (value != 0);
+	}
+
+	/* Remember the end of string and back up 'str' to the last character. */
+	end = str--;
+
+	/* Reverse string. */
+	while (start < str)
+	{
+		char		swap = *start;
+		*start++ = *str;
+		*str-- = swap;
+	}
+	return end;
+}
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index fc1679e..a8644b7 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -289,6 +289,8 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
+extern char *pg_int2str_zeropad(char *str, int32 value, int32 padding);
+extern char *pg_int2str(char *str, int32 value);
 
 /*
  *		Per-opclass comparison functions for new btrees.  These are

#16

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: David Rowley (#15)

Re: WIP: Make timestamptz_out less slow.

On 2015-09-03 16:28:40 +1200, David Rowley wrote:

Wouldn't it be better to just normalize fsec to an integer in that case?
Afaics that's the only remaining reason for the alternative path?

A special case would need to exist somewhere, unless we just modified

timestamp2tm() to multiply fsec by 1 million when HAVE_INT64_TIMESTAMP is
not defined, but that changes the function signature.

Sure, but that's a one line #ifdef?

We could alternatively handle this by special-casing INT_MIN, probably
resulting in a bit less duplicated code.

Those special cases seem to do some weird stuff to the performance
characteristics of those functions. I find my method to handle negative
numbers to produce much faster code.

That's a bit odd.

I experimented with finding the fastest way to convert an int to a string
at the weekend, I did happen to be testing with int64's but likely int32
will be the same.

This is the time taken to convert 30 into "30" 2 billion times.

The closest implementation I'm using in the patch is pg_int64tostr() one.
The pg_int64tostr_memcpy() only performs better with bigger numbers /
longer strings.

david@ubuntu:~/C$ gcc5.2 tostring.c -o tostring -O2
david@ubuntu:~/C$ ./tostring
pg_int64tostr_memcpy in 13.653252 seconds
pg_int64tostr in 2.042616 seconds
pg_int64tostr_new in 2.056688 seconds
pg_lltoa in 13.604653 seconds

david@ubuntu:~/C$ clang tostring.c -o tostring_clang -O2
david@ubuntu:~/C$ ./tostring_clang
pg_int64tostr_memcpy in 0.000004 seconds
pg_int64tostr in 2.063335 seconds
pg_int64tostr_new in 2.102068 seconds
pg_lltoa in 3.424630 seconds

Are you sure this isn't an optimization artifact where the actual work
is optimized away? Doing 2 billion conversions in 2s kinda implies that
the string conversion is done in ~4 cycles. That seems unrealistically
fast, even for a pipelined cpu.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#16)

Re: WIP: Make timestamptz_out less slow.

On 3 September 2015 at 22:17, Andres Freund <andres@anarazel.de> wrote:

On 2015-09-03 16:28:40 +1200, David Rowley wrote:

I experimented with finding the fastest way to convert an int to a string
at the weekend, I did happen to be testing with int64's but likely int32
will be the same.

This is the time taken to convert 30 into "30" 2 billion times.

The closest implementation I'm using in the patch is pg_int64tostr() one.
The pg_int64tostr_memcpy() only performs better with bigger numbers /
longer strings.

david@ubuntu:~/C$ gcc5.2 tostring.c -o tostring -O2
david@ubuntu:~/C$ ./tostring
pg_int64tostr_memcpy in 13.653252 seconds
pg_int64tostr in 2.042616 seconds
pg_int64tostr_new in 2.056688 seconds
pg_lltoa in 13.604653 seconds

david@ubuntu:~/C$ clang tostring.c -o tostring_clang -O2
david@ubuntu:~/C$ ./tostring_clang
pg_int64tostr_memcpy in 0.000004 seconds
pg_int64tostr in 2.063335 seconds
pg_int64tostr_new in 2.102068 seconds
pg_lltoa in 3.424630 seconds

Are you sure this isn't an optimization artifact where the actual work
is optimized away? Doing 2 billion conversions in 2s kinda implies that
the string conversion is done in ~4 cycles. That seems unrealistically
fast, even for a pipelined cpu.

I think you're right.
If I change the NUM_TO_STRING in my tostring.c to
#define NUM_TO_PRINT i%100

I just wanted a way to keep the strings 2 digits long, so that it matches
what will happen when building timestamps.

It looks a bit different, but it's still a bit faster than pg_lltoa()

david@ubuntu:~/C$ ./tostring
pg_int64tostr_memcpy in 16.557578 seconds
pg_int64tostr in 8.070706 seconds
pg_int64tostr_new in 7.597320 seconds
pg_lltoa in 12.315339 seconds

david@ubuntu:~/C$ ./tostring_clang
pg_int64tostr_memcpy in 19.698807 seconds
pg_int64tostr in 12.800270 seconds
pg_int64tostr_new in 14.174052 seconds
pg_lltoa in 14.427803 seconds

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#18

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#16)

Re: WIP: Make timestamptz_out less slow.

On 3 September 2015 at 22:17, Andres Freund <andres@anarazel.de> wrote:

On 2015-09-03 16:28:40 +1200, David Rowley wrote:

Wouldn't it be better to just normalize fsec to an integer in that

case?

Afaics that's the only remaining reason for the alternative path?

A special case would need to exist somewhere, unless we just modified

timestamp2tm() to multiply fsec by 1 million when HAVE_INT64_TIMESTAMP is
not defined, but that changes the function signature.

Sure, but that's a one line #ifdef?

I had a look at this but I found that the precision is 10 with double
timestamps per:

#define MAX_TIME_PRECISION 10
#define TIME_PREC_INV 10000000000.0

So if I do something like:

int fseconds = fsec * TIME_PREC_INV;

In AppendSeconds(), it overflows fseconds.

I could of course make fseconds an int64, but then I'll need to include a
64bit version of pg_int2str(). That does not seem like an improvement.

I'm also starting to not like the name pg_int2str(), It may cause confusion
with 2 bytes, instead of convert "2".

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#19

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#14)

2 attachment(s)

Re: WIP: Make timestamptz_out less slow.

On 3 September 2015 at 05:10, Andres Freund <andres@anarazel.de> wrote:

+/*
+ * pg_int2str
+ *           Converts 'value' into a decimal string representation of
the number.
+ *
+ * Caller must ensure that 'str' points to enough memory to hold the
result
+ * (at least 12 bytes, counting a leading sign and trailing NUL).
+ * Return value is a pointer to the new NUL terminated end of string.
+ */
+char *
+pg_int2str(char *str, int32 value)
+{
+     char *start;
+     char *end;
+
+     /*
+      * Handle negative numbers in a special way. We can't just append
a '-'

+ * prefix and reverse the sign as on two's complement machines

negative

+ * numbers can be 1 further from 0 than positive numbers, we do it

this way
+      * so we properly handle the smallest possible value.
+      */
+     if (value < 0)
+     {
We could alternatively handle this by special-casing INT_MIN, probably
resulting in a bit less duplicated code.

Hi Andres,

I've made a few updates to the patch to cleanup a few badly written
comments, and also to fix a missing Abs() call.
I've also renamed the pg_int2str to become pg_ltostr() to bring it more
inline with pg_ltoa.

Attached is also timestamp_delta.patch which changes pg_ltostr() to use the
INT_MIN special case that pg_ltoa also uses. I didn't make a similar change
to pg_ltostr_zeropad() as I'd need to add even more special code to prepend
the correct number of zero before the 2147483648. This zero padding reason
was why I originally came up with the alternative way to handle the
negative numbers. I had just thought that it was best to keep both
functions as similar as possible.

I've not done anything yet to remove the #ifdef HAVE_INT64_TIMESTAMP from
AppendSeconds(). The only way I can think to handle this is to just
make fsec_t unconditionally an int64 (since with float datetimes the
precision is 10 decimal digits after the dec point, this does not fit in
int32). I'd go off and do this, but this means I need to write an int64
version of pg_ltostr(). Should I do it this way?

I also did some tests on server grade hardware, and the performance
increase is looking a bit better.

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;

Master

tpch=# copy ts to '/dev/null';
COPY 31536001
Time: 17071.255 ms

Patched
tpch=# copy ts to '/dev/null';
COPY 31536001
Time: 6402.835 ms

The times don't seem to vary any depending on the attached
timestamp_delta.patch

Regards

David Rowley

Attachments:

timestamp_delta.patchapplication/octet-stream; name=timestamp_delta.patchDownload

diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 6b64a85..408fd77 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -339,43 +339,31 @@ pg_ltostr(char *str, int32 value)
 	char *end;
 
 	/*
-	 * Handle negative numbers in a special way. We can't just append a '-'
-	 * prefix and reverse the sign as on two's complement machines negative
-	 * numbers can be 1 further from 0 than positive numbers, we do it this way
-	 * so we properly handle the smallest possible value.
+	 * Avoid problems with the most negative integer not being representable
+	 * as a positive integer.
 	 */
-	if (value < 0)
+	if (value == (-2147483647 - 1))
 	{
+		memcpy(str, "-2147483648", 11);
+		return str + 11;
+	}
+	else if (value < 0)
+	{
+		value = -value;
 		*str++ = '-';
-
-		/* mark the position we must reverse the string from. */
-		start = str;
-
-		/* Compute the result string backwards. */
-		do
-		{
-			int32		remainder;
-			int32		oldval = value;
-
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + -remainder;
-		} while (value != 0);
 	}
-	else
+	start = str;
+
+	/* Compute the result string backwards. */
+	do
 	{
-		/* mark the position we must reverse the string from. */
-		start = str;
-		do
-		{
-			int32		remainder;
-			int32		oldval = value;
+		int32		remainder;
+		int32		oldval = value;
 
-			value /= 10;
-			remainder = oldval - value * 10;
-			*str++ = '0' + remainder;
-		} while (value != 0);
-	}
+		value /= 10;
+		remainder = oldval - value * 10;
+		*str++ = '0' + remainder;
+	} while (value != 0);
 
 	/* Remember the end of string and back up 'str' to the last character. */
 	end = str--;

timestamp_out_speedup_2015-09-12.patchapplication/octet-stream; name=timestamp_out_speedup_2015-09-12.patchDownload

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 926358e..d0bb930 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -43,8 +43,12 @@ static int DecodeTime(char *str, int fmask, int range,
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 		   struct pg_tm * tm);
-static void TrimTrailingZeros(char *str);
-static void AppendSeconds(char *cp, int sec, fsec_t fsec,
+
+#ifndef HAVE_INT64_TIMESTAMP
+static char *TrimTrailingZeros(char *str);
+#endif /* HAVE_INT64_TIMESTAMP */
+
+static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 			  int precision, bool fillzeros);
 static void AdjustFractSeconds(double frac, struct pg_tm * tm, fsec_t *fsec,
 				   int scale);
@@ -394,15 +398,16 @@ GetCurrentTimeUsec(struct pg_tm * tm, fsec_t *fsec, int *tzp)
 		*tzp = tz;
 }
 
-
+#ifndef HAVE_INT64_TIMESTAMP
 /* TrimTrailingZeros()
  * ... resulting from printing numbers with full precision.
  *
  * Before Postgres 8.4, this always left at least 2 fractional digits,
  * but conversations on the lists suggest this isn't desired
  * since showing '0.10' is misleading with values of precision(1).
+ * Returns a pointer to the new end of string.
  */
-static void
+static char *
 TrimTrailingZeros(char *str)
 {
 	int			len = strlen(str);
@@ -412,43 +417,100 @@ TrimTrailingZeros(char *str)
 		len--;
 		*(str + len) = '\0';
 	}
+
+	return str + len;
 }
+#endif /* HAVE_INT64_TIMESTAMP */
 
 /*
- * Append sections and fractional seconds (if any) at *cp.
+ * Append seconds and fractional seconds (if any) at *cp.
  * precision is the max number of fraction digits, fillzeros says to
  * pad to two integral-seconds digits.
  * Note that any sign is stripped from the input seconds values.
+ * Note 'precision' must not be a negative number.
+ * Note callers should assume cp will not be NUL terminated.
  */
-static void
+static char *
 AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 {
+#ifdef HAVE_INT64_TIMESTAMP
+
+	if (fillzeros)
+		cp = pg_ltostr_zeropad(cp, abs(sec), 2);
+	else
+		cp = pg_ltostr(cp, abs(sec));
+
+	if (fsec != 0)
+	{
+		int			value = (int) Abs(fsec);
+		char	   *end = &cp[precision + 1];
+		bool		gotnonzero = false;
+
+		*cp++ = '.';
+
+		/*
+		 * Append the fractional seconds part. Note that we don't want any
+		 * trailing zeros here, so since we're building the number in reverse
+		 * we'll skip appending any zeros, unless we've seen a non-zero.
+		 */
+		while (precision--)
+		{
+			int		remainder;
+			int		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+
+			/* check if we got a non-zero */
+			if (remainder)
+				gotnonzero = true;
+
+			if (gotnonzero)
+				cp[precision] = '0' + remainder;
+			else
+				end = &cp[precision];
+		}
+
+		/*
+		 * If we have a non-zero value then precision must have not been enough
+		 * to print the number, we'd better have another go. There won't be any
+		 * zero padding, so we can just use pg_ltostr().
+		 */
+		if (value > 0)
+			return pg_ltostr(cp, (int) Abs(fsec));
+
+		return end;
+	}
+	else
+		return cp;
+#else
+
 	if (fsec == 0)
 	{
 		if (fillzeros)
-			sprintf(cp, "%02d", abs(sec));
+			return pg_ltostr_zeropad(cp, abs(sec), 2);
 		else
-			sprintf(cp, "%d", abs(sec));
+			return pg_ltostr(cp, abs(sec));
 	}
 	else
 	{
-#ifdef HAVE_INT64_TIMESTAMP
-		if (fillzeros)
-			sprintf(cp, "%02d.%0*d", abs(sec), precision, (int) Abs(fsec));
-		else
-			sprintf(cp, "%d.%0*d", abs(sec), precision, (int) Abs(fsec));
-#else
 		if (fillzeros)
 			sprintf(cp, "%0*.*f", precision + 3, precision, fabs(sec + fsec));
 		else
 			sprintf(cp, "%.*f", precision, fabs(sec + fsec));
-#endif
-		TrimTrailingZeros(cp);
+
+		return TrimTrailingZeros(cp);
 	}
+
+#endif /* HAVE_INT64_TIMESTAMP */
 }
 
-/* Variant of above that's specialized to timestamp case */
-static void
+
+/*
+ * Variant of above that's specialized to timestamp case
+ * Note callers should assume cp will not be NUL terminated.
+ */
+static char *
 AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 {
 	/*
@@ -459,7 +521,7 @@ AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 	if (tm->tm_year <= 0)
 		fsec = 0;
 #endif
-	AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
+	return AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
 }
 
 /*
@@ -3831,9 +3893,13 @@ datebsearch(const char *key, const datetkn *base, int nel)
 }
 
 /* EncodeTimezone()
- *		Append representation of a numeric timezone offset to str.
+ *		Copies representation of a numeric timezone offset into str.
+ *		Returns a pointer pointing to where the NUL character should be set.
+ *
+ * Note str is not NUL terminated, callers are responsible for NUL terminating
+ * str themselves.
  */
-static void
+static char *
 EncodeTimezone(char *str, int tz, int style)
 {
 	int			hour,
@@ -3846,16 +3912,26 @@ EncodeTimezone(char *str, int tz, int style)
 	hour = min / MINS_PER_HOUR;
 	min -= hour * MINS_PER_HOUR;
 
-	str += strlen(str);
 	/* TZ is negated compared to sign we wish to display ... */
 	*str++ = (tz <= 0 ? '+' : '-');
 
 	if (sec != 0)
-		sprintf(str, "%02d:%02d:%02d", hour, min, sec);
+	{
+		str = pg_ltostr_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_ltostr_zeropad(str, min, 2);
+		*str++ = ':';
+		str = pg_ltostr_zeropad(str, sec, 2);
+	}
 	else if (min != 0 || style == USE_XSD_DATES)
-		sprintf(str, "%02d:%02d", hour, min);
+	{
+		str = pg_ltostr_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_ltostr_zeropad(str, min, 2);
+	}
 	else
-		sprintf(str, "%02d", hour);
+		str = pg_ltostr_zeropad(str, hour, 2);
+	return str;
 }
 
 /* EncodeDateOnly()
@@ -3871,46 +3947,90 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* compatible with ISO date formats */
-			if (tm->tm_year > 0)
-				sprintf(str, "%04d-%02d-%02d",
-						tm->tm_year, tm->tm_mon, tm->tm_mday);
-			else
-				sprintf(str, "%04d-%02d-%02d %s",
-						-(tm->tm_year - 1), tm->tm_mon, tm->tm_mday, "BC");
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_SQL_DATES:
 			/* compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "/%04d", tm->tm_year);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "/%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			}
+
+			*str++ = '/';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German-style date format */
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, ".%04d", tm->tm_year);
-			else
-				sprintf(str + 5, ".%04d %s", -(tm->tm_year - 1), "BC");
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_POSTGRES_DATES:
 		default:
 			/* traditional date-only style for Postgres */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d-%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d-%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "-%04d", tm->tm_year);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				*str++ = '-';
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "-%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				*str++ = '-';
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			}
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 	}
 }
@@ -3927,13 +4047,15 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 void
 EncodeTimeOnly(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, int style, char *str)
 {
-	sprintf(str, "%02d:%02d:", tm->tm_hour, tm->tm_min);
-	str += strlen(str);
-
-	AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
+	str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+	*str++ = ':';
+	str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+	*str++ = ':';
+	str = AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
 
 	if (print_tz)
-		EncodeTimezone(str, tz, style);
+		str = EncodeTimezone(str, tz, style);
+	*str = '\0';
 }
 
 
@@ -3971,38 +4093,56 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
-
-			if (style == USE_ISO_DATES)
-				sprintf(str, "%04d-%02d-%02d %02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-			else
-				sprintf(str, "%04d-%02d-%02dT%02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			*str++ = (style == USE_ISO_DATES) ? ' ' : 'T';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
-				EncodeTimezone(str, tz, style);
+				str = EncodeTimezone(str, tz, style);
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_SQL_DATES:
 			/* Compatible with Oracle/Ingres date formats */
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-
-			sprintf(str + 5, "/%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			}
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			*str++ = '/';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			/*
 			 * Note: the uses of %.*s in this function would be risky if the
@@ -4013,36 +4153,55 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German variant on European style */
 
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-
-			sprintf(str + 5, ".%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_POSTGRES_DATES:
@@ -4053,24 +4212,33 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			tm->tm_wday = j2day(day);
 
 			memcpy(str, days[tm->tm_wday], 3);
-			strcpy(str + 3, " ");
+			str += 3;
+			*str++ = ' ';
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str + 4, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
+				sprintf(str, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
 			else
-				sprintf(str + 4, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
+				sprintf(str, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
+			str += 6;
 
-			sprintf(str + 10, " %02d:%02d:", tm->tm_hour, tm->tm_min);
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
-
-			sprintf(str + strlen(str), " %04d",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1));
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
 				{
 					/*
@@ -4079,13 +4247,17 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 					 * avoid formatting something which would be rejected by
 					 * the date/time parser later. - thomas 2001-10-19
 					 */
-					sprintf(str + strlen(str), " ");
-					EncodeTimezone(str, tz, style);
+					*str++ = ' ';
+					str = EncodeTimezone(str, tz, style);
 				}
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 	}
 }
@@ -4242,7 +4414,8 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 							day_sign, abs(mday),
 							sec_sign, abs(hour), abs(min));
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 				else if (has_year_month)
 				{
@@ -4252,13 +4425,15 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 				{
 					sprintf(cp, "%d %d:%02d:", mday, hour, min);
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 				else
 				{
 					sprintf(cp, "%d:%02d:", hour, min);
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 			}
 			break;
@@ -4284,8 +4459,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 			{
 				if (sec < 0 || fsec < 0)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				*cp++ = 'S';
 				*cp++ = '\0';
 			}
@@ -4311,7 +4485,8 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 						(minus ? "-" : (is_before ? "+" : "")),
 						abs(hour), abs(min));
 				cp += strlen(cp);
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+				*cp = '\0';
 			}
 			break;
 
@@ -4337,8 +4512,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 				}
 				else if (is_before)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				sprintf(cp, " sec%s",
 						(abs(sec) != 1 || fsec != 0) ? "s" : "");
 				is_zero = FALSE;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 1dadbd5..6b64a85 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -227,3 +227,165 @@ pg_lltoa(int64 value, char *a)
 		*a-- = swap;
 	}
 }
+
+
+/*
+ * pg_ltostr_zeropad
+ *		Converts 'value' into a decimal string representation of the number.
+ *		'padding' specifies the minimum width of the number. Any extra space
+ *		is filled up by prefixing the number with zeros. The return value is a
+ *		pointer to, the would be end of string. Note that the NUL termination
+ *		char is not written.
+ *
+ * The intended use pattern for this function is to build strings which contain
+ * multiple individual numbers, such as:
+ *
+ *	str = pg_ltostr_zeropad(str, hours, 2);
+ *	*str++ = ':';
+ *	str = pg_ltostr_zeropad(str, mins, 2);
+ *	*str++ = ':';
+ *	str = pg_ltostr_zeropad(str, secs, 2);
+ *	*str = '\0';
+ *
+ * Note: Callers should ensure that 'padding' is above zero.
+ * Note: This function is optimized for the case where the number is not too
+ *		 big to fit inside of the specified padding.
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+		 result (at least 12 bytes, counting a leading sign and trailing NUL,
+		 or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_ltostr_zeropad(char *str, int32 value, int32 padding)
+{
+	char	   *start = str;
+	char	   *end = &str[padding];
+	int32		num = value;
+
+	Assert(padding > 0);
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (num < 0)
+	{
+		*start++ = '-';
+		padding--;
+
+		/*
+		 * Build the number starting at the end. Here remainder will be a
+		 * negative number, we must reverse this sign on this before adding
+		 * '0' in order to get the correct ASCII digit
+		 */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + -remainder;
+		}
+	}
+	else
+	{
+		/* build the number starting at the end */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + remainder;
+		}
+	}
+
+	/*
+	 * If padding was not high enough to fit this number then num won't have
+	 * been divided down to zero. We'd better have another go, this time we
+	 * know there won't be any zero padding required so we can just enlist the
+	 * help of pg_ltostr()
+	 */
+	if (num != 0)
+		return pg_ltostr(str, value);
+
+	return end; /* Not NUL terminated */
+}
+
+/*
+ * pg_ltostr
+ *		Converts 'value' into a decimal string representation of the number.
+ *
+ * Caller must ensure that 'str' points to enough memory to hold the result
+ * (at least 12 bytes, counting a leading sign and trailing NUL). The return
+ * value is a pointer to, the would be end of string. The NUL termination char
+ * is NOT written.
+ *
+ * The intended use pattern for this function is to build strings which contain
+ * multiple individual numbers, such as:
+ *
+ *	str = pg_ltostr(str, a);
+ *	*str++ = ' ';
+ *	str = pg_ltostr(str, b);
+ *	*str = '\0';
+ */
+char *
+pg_ltostr(char *str, int32 value)
+{
+	char *start;
+	char *end;
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (value < 0)
+	{
+		*str++ = '-';
+
+		/* mark the position we must reverse the string from. */
+		start = str;
+
+		/* Compute the result string backwards. */
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + -remainder;
+		} while (value != 0);
+	}
+	else
+	{
+		/* mark the position we must reverse the string from. */
+		start = str;
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + remainder;
+		} while (value != 0);
+	}
+
+	/* Remember the end of string and back up 'str' to the last character. */
+	end = str--;
+
+	/* Reverse string. */
+	while (start < str)
+	{
+		char		swap = *start;
+		*start++ = *str;
+		*str-- = swap;
+	}
+	return end;
+}
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index fc1679e..0f284ae 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -289,6 +289,8 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
+extern char *pg_ltostr_zeropad(char *str, int32 value, int32 padding);
+extern char *pg_ltostr(char *str, int32 value);
 
 /*
  *		Per-opclass comparison functions for new btrees.  These are

#20

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: David Rowley (#19)

Re: WIP: Make timestamptz_out less slow.

On 2015-09-12 04:00:26 +1200, David Rowley wrote:

I've not done anything yet to remove the #ifdef HAVE_INT64_TIMESTAMP from
AppendSeconds(). The only way I can think to handle this is to just
make fsec_t unconditionally an int64 (since with float datetimes the
precision is 10 decimal digits after the dec point, this does not fit in
int32). I'd go off and do this, but this means I need to write an int64
version of pg_ltostr(). Should I do it this way?

I don't have time to look into this in detail right now. But isn't that
the wrong view?

The precision with float timestamps is still microseconds which you can
easily express in an integer. That the value can be higher is just
because fsec_t for floating points also includes seconds, right? It
shouldn't be too hard to separate those?

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Andres Freund (#20)

Re: WIP: Make timestamptz_out less slow.

On 12 September 2015 at 04:24, Andres Freund <andres@anarazel.de> wrote:

On 2015-09-12 04:00:26 +1200, David Rowley wrote:

I've not done anything yet to remove the #ifdef HAVE_INT64_TIMESTAMP from
AppendSeconds(). The only way I can think to handle this is to just
make fsec_t unconditionally an int64 (since with float datetimes the
precision is 10 decimal digits after the dec point, this does not fit in
int32). I'd go off and do this, but this means I need to write an int64
version of pg_ltostr(). Should I do it this way?

I don't have time to look into this in detail right now. But isn't that
the wrong view?

I understand you have a lot on. I've attempted to explain my reasoning
below.

The precision with float timestamps is still microseconds which you can
easily express in an integer. That the value can be higher is just
because fsec_t for floating points also includes seconds, right? It
shouldn't be too hard to separate those?

I don't think it's possible to claim that "The precision with float
timestamps is still microseconds". I'd say it's more limited to double
itself.

As a demo (compiled with float datetime)

postgres=# select '11:59:59.0000000002 PM'::time;
time
---------------------
23:59:59.0000000002
(1 row)

Which is 23:59:59 and 0.0002 microseconds, or 0.2 nano seconds, or 200 pico
seconds.

This formatted this way by AppendSeconds as MAX_TIME_PRECISION is defined
as 10 when compiled with HAVE_INT64_TIMESTAMP undefined.

On the other end of the spectrum:

postgres=# select '999999-01-01 11:59:59.0005'::timestamp;
timestamp
-----------------------
999999-01-01 11:59:59
(1 row)

There was no precision left for the fractional seconds here. So I'd say
claims of microsecond precision to be incorrect in many regards.

I could change AppendSeconds() to only format to 6 decimal places, but that
would be a behavioral change that I'm certainly not going to argue for.
Quite possibly this extra precision with the time type is about the only
reason to bother keeping the float time code at all, else what's it good
for?

There's just not enough bits in an int32 to do 0.99 * 10000000000.0. Which
is why I asked about int64. It all just seems more hassle than it's worth
to get rid of a small special case in AppendSeconds()

The other problem with changing timestamp2tm() to output microseconds is
that this would require changes in all functions which call timestamp2tm(),
and likely many functions which call those. Many of these would break 3rd
party code. I found myself in contrib code when looking at changing it.
(Specifically a DecodeDateTime() call in adminpack.c)

I do agree that it's quite ugly that I've only kept TrimTrailingZeros() in
float timestamp mode, but at least it is static.

What's the best way to proceed with this:
1. Change AppendSeconds() to always format to 6 decimal places? (Breaks
applications which rely on more precision with TIME)
2. Change AppendSeconds() to multiply f_sec by 10000000000.0 (this requires
an int64 version of pg_ltostr, but still requires a special case in
AppendSeconds())
3. Just keep things as they are with my proposed patch.
4. ?

Are you worried about this because I've not focused on optimising float
timestamps as much as int64 timestamps? Are there many people compiling
with float timestamps in the real world?

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#22

Jim Nasby

Jim.Nasby@BlueTreble.com

over 10 years ago

In reply to: David Rowley (#21)

Re: WIP: Make timestamptz_out less slow.

On 9/13/15 2:43 AM, David Rowley wrote:

Are you worried about this because I've not focused on optimising float
timestamps as much as int64 timestamps? Are there many people compiling
with float timestamps in the real world?

My $0.02: the default was changed some 5 years ago so FP time is
probably pretty rare now. I don't think it's worth a bunch of extra work
to speed them up.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23

Andres Freund

andres@anarazel.de

over 10 years ago

In reply to: Jim Nasby (#22)

Re: WIP: Make timestamptz_out less slow.

On 2015-09-14 12:03:31 -0500, Jim Nasby wrote:

On 9/13/15 2:43 AM, David Rowley wrote:

Are you worried about this because I've not focused on optimising float
timestamps as much as int64 timestamps? Are there many people compiling
with float timestamps in the real world?

My $0.02: the default was changed some 5 years ago so FP time is probably
pretty rare now. I don't think it's worth a bunch of extra work to speed
them up.

Agreed. That's not why I'm arguing for it. It's about having kind of duplicate
code that's not exercised at al. by any common setup.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24

Alvaro Herrera

alvherre@2ndquadrant.com

over 10 years ago

In reply to: Jim Nasby (#22)

Re: WIP: Make timestamptz_out less slow.

Jim Nasby wrote:

On 9/13/15 2:43 AM, David Rowley wrote:

Are you worried about this because I've not focused on optimising float
timestamps as much as int64 timestamps? Are there many people compiling
with float timestamps in the real world?

My $0.02: the default was changed some 5 years ago so FP time is probably
pretty rare now.

The default was FP for 8.3 and was changed before 8.4. Anybody who was
running with the default back then and who has pg_upgraded all the way
to current releases is still using floating-point date/times.

I don't think it's worth a bunch of extra work to speed them up.

Not sure about that.

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25

David Rowley

david.rowley@2ndquadrant.com

over 10 years ago

In reply to: Alvaro Herrera (#24)

Re: WIP: Make timestamptz_out less slow.

On 15 September 2015 at 05:52, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

Jim Nasby wrote:

On 9/13/15 2:43 AM, David Rowley wrote:

Are you worried about this because I've not focused on optimising float
timestamps as much as int64 timestamps? Are there many people compiling
with float timestamps in the real world?

My $0.02: the default was changed some 5 years ago so FP time is probably
pretty rare now.

The default was FP for 8.3 and was changed before 8.4. Anybody who was
running with the default back then and who has pg_upgraded all the way
to current releases is still using floating-point date/times.

I don't think it's worth a bunch of extra work to speed them up.

Not sure about that.

It's not like nothing is improved in float timestamps with this patch, it's
only appending the non-zero fractional seconds that I've left alone. Every
other portion of the timestamp has been improved.

I made a quick test as a demo. This is compiled with float timestamps.

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;
create table ts2 (ts timestamp not null);
insert into ts2 select generate_series('2010-01-01 00:00:00', '2010-01-01
00:00:31.536001', '1 microsecond'::interval);
vacuum freeze ts2;

Master:
copy ts to '/dev/null'; -- Test 1
Time: 18252.128 ms
Time: 18230.650 ms
Time: 18247.622 ms

copy ts2 to '/dev/null'; -- Test 2
Time: 30928.999 ms
Time: 30973.576 ms
Time: 30935.489 ms

Patched

copy ts to '/dev/null'; -- Test 1 (247%)
Time: 7378.243 ms
Time: 7383.486 ms
Time: 7385.675 ms

copy ts2 to '/dev/null'; -- Test 2 (142%)
Time: 21761.047 ms
Time: 21757.026 ms
Time: 21759.621 ms

The patched difference between ts and ts2 can be accounted for by the fact
that I didn't find a better way to do:

if (fillzeros)
sprintf(cp, "%0*.*f", precision + 3, precision, fabs(sec + fsec));
else
sprintf(cp, "%.*f", precision, fabs(sec + fsec));

A fast path exists when the fractional seconds is zero, which is why
there's such a variation between the 2 tests.

I did, however spend some time a few weekends ago writing a function to
improve this, which is more aimed at improving poor performance of
float4out and float8out. It's quite likely if I can get finished one day,
it can help improve this case too.

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#26

Alvaro Herrera

alvherre@2ndquadrant.com

over 10 years ago

In reply to: David Rowley (#25)

Re: WIP: Make timestamptz_out less slow.

David Rowley wrote:

It's not like nothing is improved in float timestamps with this patch, it's
only appending the non-zero fractional seconds that I've left alone. Every
other portion of the timestamp has been improved.

OK, sounds good enough to me.

I did, however spend some time a few weekends ago writing a function to
improve this, which is more aimed at improving poor performance of
float4out and float8out. It's quite likely if I can get finished one day,
it can help improve this case too.

Even better, prolly not a blocker for this patch.

--
ï¿½lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27

Peter Geoghegan

pg@heroku.com

about 10 years ago

In reply to: David Rowley (#19)

Re: WIP: Make timestamptz_out less slow.

On Fri, Sep 11, 2015 at 9:00 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Attached is also timestamp_delta.patch which changes pg_ltostr() to use the
INT_MIN special case that pg_ltoa also uses. I didn't make a similar change
to pg_ltostr_zeropad() as I'd need to add even more special code to prepend
the correct number of zero before the 2147483648. This zero padding reason
was why I originally came up with the alternative way to handle the negative
numbers. I had just thought that it was best to keep both functions as
similar as possible.

I've started looking at this -- "timestamp_out_speedup_2015-09-12.patch".

I've not done anything yet to remove the #ifdef HAVE_INT64_TIMESTAMP from
AppendSeconds(). The only way I can think to handle this is to just make
fsec_t unconditionally an int64 (since with float datetimes the precision is
10 decimal digits after the dec point, this does not fit in int32). I'd go
off and do this, but this means I need to write an int64 version of
pg_ltostr(). Should I do it this way?

Have you thought about *just* having an int64 pg_ltostr()? Are you
aware of an appreciable overhead from having int32 callers just rely
on a promotion to int64?

In general, years are 1900-based in Postgres:

struct pg_tm
{
...
int tm_mday;
int tm_mon; /* origin 0, not 1 */
int tm_year; /* relative to 1900 */
...
};

While it's not your patch's fault, it is not noted by EncodeDateTime()
and EncodeDateOnly() and others that there pg_tm structs are not
1900-based. This is in contrast to multiple functions within
formatting.c, nabstime.c, and timestamp.c (some of which are clients
or clients of clients for this new code). I think that the rule has
been undermined so much that it doesn't make sense to indicate
exceptions directly, though. I think pg_tm should have comments saying
that there are many cases where tm_year is not relative to 1900.

I have a few minor nitpicks about the patch.

No need for "negative number" comment here -- just use
"Assert(precision >= 0)" code:

+ * Note 'precision' must not be a negative number.
+ * Note callers should assume cp will not be NUL terminated.
  */
-static void
+static char *
 AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 {
+#ifdef HAVE_INT64_TIMESTAMP
+

I would just use a period/full stop here:

+ * Note str is not NUL terminated, callers are responsible for NUL terminating
+ * str themselves.

Don't think you need so many "Note:" specifiers here:

+ * Note: Callers should ensure that 'padding' is above zero.
+ * Note: This function is optimized for the case where the number is not too
+ *      big to fit inside of the specified padding.
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+        result (at least 12 bytes, counting a leading sign and trailing NUL,
+        or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_ltostr_zeropad(char *str, int32 value, int32 padding)

Not so sure about this, within the new pg_ltostr_zeropad() function:

+   /*
+    * Handle negative numbers in a special way. We can't just append a '-'
+    * prefix and reverse the sign as on two's complement machines negative
+    * numbers can be 1 further from 0 than positive numbers, we do it this way
+    * so we properly handle the smallest possible value.
+    */
+   if (num < 0)
+   {
      ...
+   }
+   else
+   {
      ...
+   }

Maybe it's better to just special case INT_MIN and the do an Abs()?
Actually, would any likely caller of pg_ltostr_zeropad() actually care
if INT_MIN was a case that you cannot handle, and assert against? You
could perhaps reasonably make it the caller's problem. Haven't made up
my mind if your timestamp_delta.patch is better than that; cannot
immediately see a problem with putting it on the caller.

More generally, maybe routines like EncodeDateTime() ought now to use
the Abs() macro.

Those are all of my nitpicks. :-)

I also did some tests on server grade hardware, and the performance increase
is looking a bit better.

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;

On my laptop, I saw a 2.4X - 2.6X speedup for this case. That seems
pretty nice to me.

Will make another pass over this soon.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28

David Rowley

david.rowley@2ndquadrant.com

about 10 years ago

In reply to: Peter Geoghegan (#27)

1 attachment(s)

Re: WIP: Make timestamptz_out less slow.

On 5 November 2015 at 13:20, Peter Geoghegan <pg@heroku.com> wrote:

On Fri, Sep 11, 2015 at 9:00 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Attached is also timestamp_delta.patch which changes pg_ltostr() to use

the

INT_MIN special case that pg_ltoa also uses. I didn't make a similar

change

to pg_ltostr_zeropad() as I'd need to add even more special code to

prepend

the correct number of zero before the 2147483648. This zero padding

reason

was why I originally came up with the alternative way to handle the

negative

numbers. I had just thought that it was best to keep both functions as
similar as possible.

I've started looking at this -- "timestamp_out_speedup_2015-09-12.patch".

I've not done anything yet to remove the #ifdef HAVE_INT64_TIMESTAMP from
AppendSeconds(). The only way I can think to handle this is to just make
fsec_t unconditionally an int64 (since with float datetimes the

precision is

10 decimal digits after the dec point, this does not fit in int32). I'd

go

off and do this, but this means I need to write an int64 version of
pg_ltostr(). Should I do it this way?

Have you thought about *just* having an int64 pg_ltostr()? Are you
aware of an appreciable overhead from having int32 callers just rely
on a promotion to int64?

I'd not thought of that. It would certainly add unnecessary overhead on
32bit machines.
To be honest, I don't really like the idea of making fsec_t an int64, there
are just to many places around the code base that need to be updated. Plus
with float datetimes, we'd need to multiple the fractional seconds by
1000000000, which is based on the MAX_TIME_PRECISION setting as defined
when HAVE_INT64_TIMESTAMP is undefined.

I have a few minor nitpicks about the patch.

No need for "negative number" comment here -- just use
"Assert(precision >= 0)" code:
+ * Note 'precision' must not be a negative number.
+ * Note callers should assume cp will not be NUL terminated.
*/
-static void
+static char *
AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool
fillzeros)
{
+#ifdef HAVE_INT64_TIMESTAMP
+
I would just use a period/full stop here:
+ * Note str is not NUL terminated, callers are responsible for NUL
terminating
+ * str themselves.

Do you mean change the comment to "Note str is not NUL terminated. Callers
are responsible for NUL terminating str themselves." ?

Don't think you need so many "Note:" specifiers here:

+ * Note: Callers should ensure that 'padding' is above zero.
+ * Note: This function is optimized for the case where the number is not
too
+ *      big to fit inside of the specified padding.
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+        result (at least 12 bytes, counting a leading sign and trailing
NUL,
+        or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_ltostr_zeropad(char *str, int32 value, int32 padding)

Yes, that's a bit ugly. How about just Assert(padding > 0) just like above?
That gets rid of one Note:
The optimized part is probably not that important anyway. I just mentioned
it to try and reduce the changes of someone using it when the padding is
always too small.

Not so sure about this, within the new pg_ltostr_zeropad() function:
+   /*
+    * Handle negative numbers in a special way. We can't just append a '-'
+    * prefix and reverse the sign as on two's complement machines negative
+    * numbers can be 1 further from 0 than positive numbers, we do it
this way
+    * so we properly handle the smallest possible value.
+    */
+   if (num < 0)
+   {
...
+   }
+   else
+   {
...
+   }
Maybe it's better to just special case INT_MIN and the do an Abs()?
Actually, would any likely caller of pg_ltostr_zeropad() actually care
if INT_MIN was a case that you cannot handle, and assert against? You
could perhaps reasonably make it the caller's problem. Haven't made up
my mind if your timestamp_delta.patch is better than that; cannot
immediately see a problem with putting it on the caller.

I can make that case work the same as pg_ltoa() for pg_ltostr(), that's not
a problem, I did that in the delta patch. With pg_ltostr_zeropad() I'd need
to add some corner case code to prepend the correct number of zeros onto
-2147483648. This is likely not going to look very nice, and also it's
probably going to end up about the same number of lines of code to what's
in the patch already. If you think it's better for performance, then I've
already done tests to show that it's not better. I really don't think it'll
look very nice either. I quite like my negative number handling, since it's
actually code that would be exercised 50% of the time ran than 1/ 2^32 of
the time, assuming all numbers have an equal chance of being passed.

More generally, maybe routines like EncodeDateTime() ought now to use
the Abs() macro.

You might be right about that. Then we can just have pg_ltostr() do
unsigned only. The reason I didn't do that is because I was trying to
minimise what I was changing in the patch, and I thought it was less likely
to make it if I started adding calls to Abs() around the code base.

Perhaps it's good to have pg_ltostr() there so that we have a good fast way
to build generic strings in Postgres incrementally. I'd say we'd have a
much less reusable function if we only supported unsigned int, as we don't
have a database type which is based on unsigned int.

Those are all of my nitpicks. :-)

Many thanks for looking at this. Nitpicks are fine :)

I also did some tests on server grade hardware, and the performance

increase

is looking a bit better.

create table ts (ts timestamp not null);
insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
00:00:00', '1 sec'::interval);
vacuum freeze ts;

On my laptop, I saw a 2.4X - 2.6X speedup for this case. That seems
pretty nice to me.

Will make another pass over this soon.

Thanks

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

timestamp_out_speedup_2015-11-05.patchapplication/octet-stream; name=timestamp_out_speedup_2015-11-05.patchDownload

diff --git a/src/backend/utils/adt/datetime.c b/src/backend/utils/adt/datetime.c
index 926358e..ac8934e 100644
--- a/src/backend/utils/adt/datetime.c
+++ b/src/backend/utils/adt/datetime.c
@@ -43,8 +43,12 @@ static int DecodeTime(char *str, int fmask, int range,
 static const datetkn *datebsearch(const char *key, const datetkn *base, int nel);
 static int DecodeDate(char *str, int fmask, int *tmask, bool *is2digits,
 		   struct pg_tm * tm);
-static void TrimTrailingZeros(char *str);
-static void AppendSeconds(char *cp, int sec, fsec_t fsec,
+
+#ifndef HAVE_INT64_TIMESTAMP
+static char *TrimTrailingZeros(char *str);
+#endif /* HAVE_INT64_TIMESTAMP */
+
+static char *AppendSeconds(char *cp, int sec, fsec_t fsec,
 			  int precision, bool fillzeros);
 static void AdjustFractSeconds(double frac, struct pg_tm * tm, fsec_t *fsec,
 				   int scale);
@@ -394,15 +398,16 @@ GetCurrentTimeUsec(struct pg_tm * tm, fsec_t *fsec, int *tzp)
 		*tzp = tz;
 }
 
-
+#ifndef HAVE_INT64_TIMESTAMP
 /* TrimTrailingZeros()
  * ... resulting from printing numbers with full precision.
  *
  * Before Postgres 8.4, this always left at least 2 fractional digits,
  * but conversations on the lists suggest this isn't desired
  * since showing '0.10' is misleading with values of precision(1).
+ * Returns a pointer to the new end of string.
  */
-static void
+static char *
 TrimTrailingZeros(char *str)
 {
 	int			len = strlen(str);
@@ -412,43 +417,103 @@ TrimTrailingZeros(char *str)
 		len--;
 		*(str + len) = '\0';
 	}
+
+	return str + len;
 }
+#endif /* HAVE_INT64_TIMESTAMP */
 
 /*
- * Append sections and fractional seconds (if any) at *cp.
+ * Append seconds and fractional seconds (if any) at *cp.
  * precision is the max number of fraction digits, fillzeros says to
- * pad to two integral-seconds digits.
+ * pad to two integral-seconds digits. Returns a pointer to the new end of
+ * string.
  * Note that any sign is stripped from the input seconds values.
+ * Note str is not NUL terminated, callers are responsible for NUL terminating
+ * str themselves.
  */
-static void
+static char *
 AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros)
 {
+	Assert(precision >= 0);
+
+#ifdef HAVE_INT64_TIMESTAMP
+
+	if (fillzeros)
+		cp = pg_ltostr_zeropad(cp, abs(sec), 2);
+	else
+		cp = pg_ltostr(cp, abs(sec));
+
+	if (fsec != 0)
+	{
+		int			value = (int) Abs(fsec);
+		char	   *end = &cp[precision + 1];
+		bool		gotnonzero = false;
+
+		*cp++ = '.';
+
+		/*
+		 * Append the fractional seconds part. Note that we don't want any
+		 * trailing zeros here, so since we're building the number in reverse
+		 * we'll skip appending any zeros, unless we've seen a non-zero.
+		 */
+		while (precision--)
+		{
+			int		remainder;
+			int		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+
+			/* check if we got a non-zero */
+			if (remainder)
+				gotnonzero = true;
+
+			if (gotnonzero)
+				cp[precision] = '0' + remainder;
+			else
+				end = &cp[precision];
+		}
+
+		/*
+		 * If we have a non-zero value then precision must have not been enough
+		 * to print the number, we'd better have another go. There won't be any
+		 * zero padding, so we can just use pg_ltostr().
+		 */
+		if (value > 0)
+			return pg_ltostr(cp, (int) Abs(fsec));
+
+		return end;
+	}
+	else
+		return cp;
+#else
+
 	if (fsec == 0)
 	{
 		if (fillzeros)
-			sprintf(cp, "%02d", abs(sec));
+			return pg_ltostr_zeropad(cp, abs(sec), 2);
 		else
-			sprintf(cp, "%d", abs(sec));
+			return pg_ltostr(cp, abs(sec));
 	}
 	else
 	{
-#ifdef HAVE_INT64_TIMESTAMP
-		if (fillzeros)
-			sprintf(cp, "%02d.%0*d", abs(sec), precision, (int) Abs(fsec));
-		else
-			sprintf(cp, "%d.%0*d", abs(sec), precision, (int) Abs(fsec));
-#else
 		if (fillzeros)
 			sprintf(cp, "%0*.*f", precision + 3, precision, fabs(sec + fsec));
 		else
 			sprintf(cp, "%.*f", precision, fabs(sec + fsec));
-#endif
-		TrimTrailingZeros(cp);
+
+		return TrimTrailingZeros(cp);
 	}
+
+#endif /* HAVE_INT64_TIMESTAMP */
 }
 
-/* Variant of above that's specialized to timestamp case */
-static void
+
+/*
+ * Variant of above that's specialized to timestamp case
+ * Note callers should assume cp will not be NUL terminated.
+ */
+static char *
 AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 {
 	/*
@@ -459,7 +524,7 @@ AppendTimestampSeconds(char *cp, struct pg_tm * tm, fsec_t fsec)
 	if (tm->tm_year <= 0)
 		fsec = 0;
 #endif
-	AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
+	return AppendSeconds(cp, tm->tm_sec, fsec, MAX_TIMESTAMP_PRECISION, true);
 }
 
 /*
@@ -3831,9 +3896,13 @@ datebsearch(const char *key, const datetkn *base, int nel)
 }
 
 /* EncodeTimezone()
- *		Append representation of a numeric timezone offset to str.
+ *		Copies representation of a numeric timezone offset into str.
+ *		Returns a pointer pointing to where the NUL character should be set.
+ *
+ * Note str is not NUL terminated, callers are responsible for NUL terminating
+ * str themselves.
  */
-static void
+static char *
 EncodeTimezone(char *str, int tz, int style)
 {
 	int			hour,
@@ -3846,16 +3915,26 @@ EncodeTimezone(char *str, int tz, int style)
 	hour = min / MINS_PER_HOUR;
 	min -= hour * MINS_PER_HOUR;
 
-	str += strlen(str);
 	/* TZ is negated compared to sign we wish to display ... */
 	*str++ = (tz <= 0 ? '+' : '-');
 
 	if (sec != 0)
-		sprintf(str, "%02d:%02d:%02d", hour, min, sec);
+	{
+		str = pg_ltostr_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_ltostr_zeropad(str, min, 2);
+		*str++ = ':';
+		str = pg_ltostr_zeropad(str, sec, 2);
+	}
 	else if (min != 0 || style == USE_XSD_DATES)
-		sprintf(str, "%02d:%02d", hour, min);
+	{
+		str = pg_ltostr_zeropad(str, hour, 2);
+		*str++ = ':';
+		str = pg_ltostr_zeropad(str, min, 2);
+	}
 	else
-		sprintf(str, "%02d", hour);
+		str = pg_ltostr_zeropad(str, hour, 2);
+	return str;
 }
 
 /* EncodeDateOnly()
@@ -3871,46 +3950,90 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* compatible with ISO date formats */
-			if (tm->tm_year > 0)
-				sprintf(str, "%04d-%02d-%02d",
-						tm->tm_year, tm->tm_mon, tm->tm_mday);
-			else
-				sprintf(str, "%04d-%02d-%02d %s",
-						-(tm->tm_year - 1), tm->tm_mon, tm->tm_mday, "BC");
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_SQL_DATES:
 			/* compatible with Oracle/Ingres date formats */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "/%04d", tm->tm_year);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "/%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			}
+
+			*str++ = '/';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German-style date format */
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, ".%04d", tm->tm_year);
-			else
-				sprintf(str + 5, ".%04d %s", -(tm->tm_year - 1), "BC");
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_POSTGRES_DATES:
 		default:
 			/* traditional date-only style for Postgres */
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d-%02d", tm->tm_mday, tm->tm_mon);
-			else
-				sprintf(str, "%02d-%02d", tm->tm_mon, tm->tm_mday);
-			if (tm->tm_year > 0)
-				sprintf(str + 5, "-%04d", tm->tm_year);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				*str++ = '-';
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str + 5, "-%04d %s", -(tm->tm_year - 1), "BC");
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				*str++ = '-';
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			}
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+
+			if (tm->tm_year <= 0)
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 	}
 }
@@ -3927,13 +4050,15 @@ EncodeDateOnly(struct pg_tm * tm, int style, char *str)
 void
 EncodeTimeOnly(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, int style, char *str)
 {
-	sprintf(str, "%02d:%02d:", tm->tm_hour, tm->tm_min);
-	str += strlen(str);
-
-	AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
+	str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+	*str++ = ':';
+	str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+	*str++ = ':';
+	str = AppendSeconds(str, tm->tm_sec, fsec, MAX_TIME_PRECISION, true);
 
 	if (print_tz)
-		EncodeTimezone(str, tz, style);
+		str = EncodeTimezone(str, tz, style);
+	*str = '\0';
 }
 
 
@@ -3971,38 +4096,56 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 		case USE_ISO_DATES:
 		case USE_XSD_DATES:
 			/* Compatible with ISO-8601 date formats */
-
-			if (style == USE_ISO_DATES)
-				sprintf(str, "%04d-%02d-%02d %02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-			else
-				sprintf(str, "%04d-%02d-%02dT%02d:%02d:",
-						(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-						tm->tm_mon, tm->tm_mday, tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '-';
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			*str++ = (style == USE_ISO_DATES) ? ' ' : 'T';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
-				EncodeTimezone(str, tz, style);
+				str = EncodeTimezone(str, tz, style);
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_SQL_DATES:
 			/* Compatible with Oracle/Ingres date formats */
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str, "%02d/%02d", tm->tm_mday, tm->tm_mon);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			}
 			else
-				sprintf(str, "%02d/%02d", tm->tm_mon, tm->tm_mday);
-
-			sprintf(str + 5, "/%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
+			{
+				str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+				*str++ = '/';
+				str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			}
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			*str++ = '/';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			/*
 			 * Note: the uses of %.*s in this function would be risky if the
@@ -4013,36 +4156,55 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_GERMAN_DATES:
 			/* German variant on European style */
 
-			sprintf(str, "%02d.%02d", tm->tm_mday, tm->tm_mon);
-
-			sprintf(str + 5, ".%04d %02d:%02d:",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1),
-					tm->tm_hour, tm->tm_min);
-
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
+			str = pg_ltostr_zeropad(str, tm->tm_mday, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str, tm->tm_mon, 2);
+			*str++ = '.';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
-					EncodeTimezone(str, tz, style);
+					str = EncodeTimezone(str, tz, style);
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 
 		case USE_POSTGRES_DATES:
@@ -4053,24 +4215,33 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 			tm->tm_wday = j2day(day);
 
 			memcpy(str, days[tm->tm_wday], 3);
-			strcpy(str + 3, " ");
+			str += 3;
+			*str++ = ' ';
 
 			if (DateOrder == DATEORDER_DMY)
-				sprintf(str + 4, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
+				sprintf(str, "%02d %3s", tm->tm_mday, months[tm->tm_mon - 1]);
 			else
-				sprintf(str + 4, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
+				sprintf(str, "%3s %02d", months[tm->tm_mon - 1], tm->tm_mday);
+			str += 6;
 
-			sprintf(str + 10, " %02d:%02d:", tm->tm_hour, tm->tm_min);
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str, tm->tm_hour, 2);
+			*str++ = ':';
+			str = pg_ltostr_zeropad(str, tm->tm_min, 2);
+			*str++ = ':';
+			str = AppendTimestampSeconds(str, tm, fsec);
 
-			AppendTimestampSeconds(str + strlen(str), tm, fsec);
-
-			sprintf(str + strlen(str), " %04d",
-					(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1));
+			*str++ = ' ';
+			str = pg_ltostr_zeropad(str,
+				(tm->tm_year > 0) ? tm->tm_year : -(tm->tm_year - 1), 4);
 
 			if (print_tz)
 			{
 				if (tzn)
-					sprintf(str + strlen(str), " %.*s", MAXTZLEN, tzn);
+				{
+					sprintf(str, " %.*s", MAXTZLEN, tzn);
+					str += strlen(str);
+				}
 				else
 				{
 					/*
@@ -4079,13 +4250,17 @@ EncodeDateTime(struct pg_tm * tm, fsec_t fsec, bool print_tz, int tz, const char
 					 * avoid formatting something which would be rejected by
 					 * the date/time parser later. - thomas 2001-10-19
 					 */
-					sprintf(str + strlen(str), " ");
-					EncodeTimezone(str, tz, style);
+					*str++ = ' ';
+					str = EncodeTimezone(str, tz, style);
 				}
 			}
 
 			if (tm->tm_year <= 0)
-				sprintf(str + strlen(str), " BC");
+			{
+				memcpy(str, " BC", 3); /* Don't copy NUL */
+				str += 3;
+			}
+			*str = '\0';
 			break;
 	}
 }
@@ -4242,7 +4417,8 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 							day_sign, abs(mday),
 							sec_sign, abs(hour), abs(min));
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 				else if (has_year_month)
 				{
@@ -4252,13 +4428,15 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 				{
 					sprintf(cp, "%d %d:%02d:", mday, hour, min);
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 				else
 				{
 					sprintf(cp, "%d:%02d:", hour, min);
 					cp += strlen(cp);
-					AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+					*cp = '\0';
 				}
 			}
 			break;
@@ -4284,8 +4462,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 			{
 				if (sec < 0 || fsec < 0)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				*cp++ = 'S';
 				*cp++ = '\0';
 			}
@@ -4311,7 +4488,8 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 						(minus ? "-" : (is_before ? "+" : "")),
 						abs(hour), abs(min));
 				cp += strlen(cp);
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, true);
+				*cp = '\0';
 			}
 			break;
 
@@ -4337,8 +4515,7 @@ EncodeInterval(struct pg_tm * tm, fsec_t fsec, int style, char *str)
 				}
 				else if (is_before)
 					*cp++ = '-';
-				AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
-				cp += strlen(cp);
+				cp = AppendSeconds(cp, sec, fsec, MAX_INTERVAL_PRECISION, false);
 				sprintf(cp, " sec%s",
 						(abs(sec) != 1 || fsec != 0) ? "s" : "");
 				is_zero = FALSE;
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 1dadbd5..d667a93 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -227,3 +227,162 @@ pg_lltoa(int64 value, char *a)
 		*a-- = swap;
 	}
 }
+
+
+/*
+ * pg_ltostr_zeropad
+ *		Converts 'value' into a decimal string representation of the number.
+ *		'padding' specifies the minimum width of the number. Any extra space
+ *		is filled up by prefixing the number with zeros. The return value is a
+ *		pointer to, the would be end of string. Note that the NUL termination
+ *		char is not written.
+ *
+ * The intended use pattern for this function is to build strings which contain
+ * multiple individual numbers, such as:
+ *
+ *	str = pg_ltostr_zeropad(str, hours, 2);
+ *	*str++ = ':';
+ *	str = pg_ltostr_zeropad(str, mins, 2);
+ *	*str++ = ':';
+ *	str = pg_ltostr_zeropad(str, secs, 2);
+ *	*str = '\0';
+ *
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+		 result (at least 12 bytes, counting a leading sign and trailing NUL,
+		 or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_ltostr_zeropad(char *str, int32 value, int32 padding)
+{
+	char	   *start = str;
+	char	   *end = &str[padding];
+	int32		num = value;
+
+	Assert(padding > 0);
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (num < 0)
+	{
+		*start++ = '-';
+		padding--;
+
+		/*
+		 * Build the number starting at the end. Here remainder will be a
+		 * negative number, we must reverse this sign on this before adding
+		 * '0' in order to get the correct ASCII digit
+		 */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + -remainder;
+		}
+	}
+	else
+	{
+		/* build the number starting at the end */
+		while (padding--)
+		{
+			int32		remainder;
+			int32		oldval = num;
+
+			num /= 10;
+			remainder = oldval - num * 10;
+			start[padding] = '0' + remainder;
+		}
+	}
+
+	/*
+	 * If padding was not high enough to fit this number then num won't have
+	 * been divided down to zero. We'd better have another go, this time we
+	 * know there won't be any zero padding required so we can just enlist the
+	 * help of pg_ltostr()
+	 */
+	if (num != 0)
+		return pg_ltostr(str, value);
+
+	return end; /* Not NUL terminated */
+}
+
+/*
+ * pg_ltostr
+ *		Converts 'value' into a decimal string representation of the number.
+ *
+ * Caller must ensure that 'str' points to enough memory to hold the result
+ * (at least 12 bytes, counting a leading sign and trailing NUL). The return
+ * value is a pointer to, the would be end of string. The NUL termination char
+ * is NOT written.
+ *
+ * The intended use pattern for this function is to build strings which contain
+ * multiple individual numbers, such as:
+ *
+ *	str = pg_ltostr(str, a);
+ *	*str++ = ' ';
+ *	str = pg_ltostr(str, b);
+ *	*str = '\0';
+ */
+char *
+pg_ltostr(char *str, int32 value)
+{
+	char *start;
+	char *end;
+
+	/*
+	 * Handle negative numbers in a special way. We can't just append a '-'
+	 * prefix and reverse the sign as on two's complement machines negative
+	 * numbers can be 1 further from 0 than positive numbers, we do it this way
+	 * so we properly handle the smallest possible value.
+	 */
+	if (value < 0)
+	{
+		*str++ = '-';
+
+		/* mark the position we must reverse the string from. */
+		start = str;
+
+		/* Compute the result string backwards. */
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + -remainder;
+		} while (value != 0);
+	}
+	else
+	{
+		/* mark the position we must reverse the string from. */
+		start = str;
+		do
+		{
+			int32		remainder;
+			int32		oldval = value;
+
+			value /= 10;
+			remainder = oldval - value * 10;
+			*str++ = '0' + remainder;
+		} while (value != 0);
+	}
+
+	/* Remember the end of string and back up 'str' to the last character. */
+	end = str--;
+
+	/* Reverse string. */
+	while (start < str)
+	{
+		char		swap = *start;
+		*start++ = *str;
+		*str-- = swap;
+	}
+	return end;
+}
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index c193e44..c405673 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -289,6 +289,8 @@ extern int32 pg_atoi(const char *s, int size, int c);
 extern void pg_itoa(int16 i, char *a);
 extern void pg_ltoa(int32 l, char *a);
 extern void pg_lltoa(int64 ll, char *a);
+extern char *pg_ltostr_zeropad(char *str, int32 value, int32 padding);
+extern char *pg_ltostr(char *str, int32 value);
 
 /*
  *		Per-opclass comparison functions for new btrees.  These are

#29

Peter Geoghegan

pg@heroku.com

about 10 years ago

In reply to: David Rowley (#28)

Re: WIP: Make timestamptz_out less slow.

On Wed, Nov 4, 2015 at 6:30 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Have you thought about *just* having an int64 pg_ltostr()? Are you
aware of an appreciable overhead from having int32 callers just rely
on a promotion to int64?

I'd not thought of that. It would certainly add unnecessary overhead on
32bit machines.
To be honest, I don't really like the idea of making fsec_t an int64, there
are just to many places around the code base that need to be updated. Plus
with float datetimes, we'd need to multiple the fractional seconds by
1000000000, which is based on the MAX_TIME_PRECISION setting as defined when
HAVE_INT64_TIMESTAMP is undefined.

OK.

I would just use a period/full stop here:
+ * Note str is not NUL terminated, callers are responsible for NUL
terminating
+ * str themselves.
Do you mean change the comment to "Note str is not NUL terminated. Callers
are responsible for NUL terminating str themselves." ?

Yes.

Yes, that's a bit ugly. How about just Assert(padding > 0) just like above?
That gets rid of one Note:
The optimized part is probably not that important anyway. I just mentioned
it to try and reduce the changes of someone using it when the padding is
always too small.

Cool.

Maybe it's better to just special case INT_MIN and the do an Abs()?
Actually, would any likely caller of pg_ltostr_zeropad() actually care
if INT_MIN was a case that you cannot handle, and assert against? You
could perhaps reasonably make it the caller's problem. Haven't made up
my mind if your timestamp_delta.patch is better than that; cannot
immediately see a problem with putting it on the caller.

I can make that case work the same as pg_ltoa() for pg_ltostr(), that's not
a problem, I did that in the delta patch. With pg_ltostr_zeropad() I'd need
to add some corner case code to prepend the correct number of zeros onto
-2147483648. This is likely not going to look very nice, and also it's
probably going to end up about the same number of lines of code to what's in
the patch already. If you think it's better for performance, then I've
already done tests to show that it's not better. I really don't think it'll
look very nice either. I quite like my negative number handling, since it's
actually code that would be exercised 50% of the time ran than 1/ 2^32 of
the time, assuming all numbers have an equal chance of being passed.

Fair enough.

More generally, maybe routines like EncodeDateTime() ought now to use
the Abs() macro.

You might be right about that. Then we can just have pg_ltostr() do unsigned
only. The reason I didn't do that is because I was trying to minimise what I
was changing in the patch, and I thought it was less likely to make it if I
started adding calls to Abs() around the code base.

I am very familiar with being conflicted about changing things beyond
what is actually strictly necessary. It's still something I deal with
a lot. One case that I think I have right in recommending commenting
on is the need to centrally document that there are many exceptions to
the 1900-based behavior of struct pg_tm. As I said, we should not
continue to inconsistently locally note the exceptions, even if your
patch doesn't make it any worse.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30

David Rowley

david.rowley@2ndquadrant.com

about 10 years ago

In reply to: Peter Geoghegan (#29)

Re: WIP: Make timestamptz_out less slow.

On 12 November 2015 at 13:44, Peter Geoghegan <pg@heroku.com> wrote:

On Wed, Nov 4, 2015 at 6:30 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Have you thought about *just* having an int64 pg_ltostr()? Are you
aware of an appreciable overhead from having int32 callers just rely
on a promotion to int64?

I'd not thought of that. It would certainly add unnecessary overhead on
32bit machines.
To be honest, I don't really like the idea of making fsec_t an int64,

there

are just to many places around the code base that need to be updated.

Plus

with float datetimes, we'd need to multiple the fractional seconds by
1000000000, which is based on the MAX_TIME_PRECISION setting as defined

when

HAVE_INT64_TIMESTAMP is undefined.

OK.
I would just use a period/full stop here:
+ * Note str is not NUL terminated, callers are responsible for NUL
terminating
+ * str themselves.
Do you mean change the comment to "Note str is not NUL terminated.
Callers

are responsible for NUL terminating str themselves." ?

Yes.

Yes, that's a bit ugly. How about just Assert(padding > 0) just like

above?

That gets rid of one Note:
The optimized part is probably not that important anyway. I just

mentioned

it to try and reduce the changes of someone using it when the padding is
always too small.

Cool.

Maybe it's better to just special case INT_MIN and the do an Abs()?
Actually, would any likely caller of pg_ltostr_zeropad() actually care
if INT_MIN was a case that you cannot handle, and assert against? You
could perhaps reasonably make it the caller's problem. Haven't made up
my mind if your timestamp_delta.patch is better than that; cannot
immediately see a problem with putting it on the caller.

I can make that case work the same as pg_ltoa() for pg_ltostr(), that's

not

a problem, I did that in the delta patch. With pg_ltostr_zeropad() I'd

need

to add some corner case code to prepend the correct number of zeros onto
-2147483648. This is likely not going to look very nice, and also it's
probably going to end up about the same number of lines of code to

what's in

the patch already. If you think it's better for performance, then I've
already done tests to show that it's not better. I really don't think

it'll

look very nice either. I quite like my negative number handling, since

it's

actually code that would be exercised 50% of the time ran than 1/ 2^32 of
the time, assuming all numbers have an equal chance of being passed.

Fair enough.

Many thanks for spending time on this Peter.

More generally, maybe routines like EncodeDateTime() ought now to use
the Abs() macro.

You might be right about that. Then we can just have pg_ltostr() do

unsigned

only. The reason I didn't do that is because I was trying to minimise

what I

was changing in the patch, and I thought it was less likely to make it

if I

started adding calls to Abs() around the code base.

I am very familiar with being conflicted about changing things beyond
what is actually strictly necessary. It's still something I deal with
a lot. One case that I think I have right in recommending commenting
on is the need to centrally document that there are many exceptions to
the 1900-based behavior of struct pg_tm. As I said, we should not
continue to inconsistently locally note the exceptions, even if your
patch doesn't make it any worse.

Just to confirm, you mean this comment?

int tm_year; /* relative to 1900 */

Please let me know if you disagree, but I'm not sure it's the business of
this patch to fix that. If it's wrong now, then it was wrong before my
patch, so it should be a separate patch which fixes it.

At this stage I don't quite know what the fix should be, weather it's doing
tm->tm_year -= 1900; in timestamp2tm() after the j2date() call, or if it's
just removing the misleading comment.

I also don't quite understand why we bother having it relative to 1900 and
not just base it on 0.

Is there anything else you see that's pending before it can be marked as
ready for committer?

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#31

Peter Geoghegan

pg@heroku.com

about 10 years ago

In reply to: David Rowley (#30)

Re: WIP: Make timestamptz_out less slow.

On Sun, Nov 22, 2015 at 2:20 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Just to confirm, you mean this comment?

int tm_year; /* relative to 1900 */

Please let me know if you disagree, but I'm not sure it's the business of
this patch to fix that. If it's wrong now, then it was wrong before my
patch, so it should be a separate patch which fixes it.

At this stage I don't quite know what the fix should be, weather it's doing
tm->tm_year -= 1900; in timestamp2tm() after the j2date() call, or if it's
just removing the misleading comment.

I also don't quite understand why we bother having it relative to 1900 and
not just base it on 0.

That's fair. I defer to the judgement of the committer here.

Is there anything else you see that's pending before it can be marked as
ready for committer?

Can't think of any reason not to. It's been marked "ready for committer".

Thanks

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#32

David Rowley

david.rowley@2ndquadrant.com

about 10 years ago

In reply to: Peter Geoghegan (#31)

Re: WIP: Make timestamptz_out less slow.

On 29 November 2015 at 14:00, Peter Geoghegan <pg@heroku.com> wrote:

On Sun, Nov 22, 2015 at 2:20 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

Just to confirm, you mean this comment?

int tm_year; /* relative to 1900 */

Please let me know if you disagree, but I'm not sure it's the business of
this patch to fix that. If it's wrong now, then it was wrong before my
patch, so it should be a separate patch which fixes it.

At this stage I don't quite know what the fix should be, weather it's

doing

tm->tm_year -= 1900; in timestamp2tm() after the j2date() call, or if

it's

just removing the misleading comment.

I also don't quite understand why we bother having it relative to 1900

and

not just base it on 0.

That's fair. I defer to the judgement of the committer here.

Is there anything else you see that's pending before it can be marked as
ready for committer?

Can't think of any reason not to. It's been marked "ready for committer".

Many thanks for reviewing this Peter.

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

#33

Tom Lane

tgl@sss.pgh.pa.us

almost 10 years ago

In reply to: David Rowley (#28)

Re: WIP: Make timestamptz_out less slow.

David Rowley <david.rowley@2ndquadrant.com> writes:
[ timestamp_out_speedup_2015-11-05.patch ]

Pushed with a bunch of cosmetic tweaks.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#34

David Rowley

david.rowley@2ndquadrant.com

almost 10 years ago

In reply to: Tom Lane (#33)

Re: WIP: Make timestamptz_out less slow.

On 7/02/2016 4:14 am, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:
[ timestamp_out_speedup_2015-11-05.patch ]

Pushed with a bunch of cosmetic tweaks.

Great. Thanks for pushing this.

David