xlog location arithmetic

Started by Euler Taveira de Oliveiraabout 14 years ago93 messages
1 attachment(s)

Hi,

A while ago when blogging about WAL [1]http://eulerto.blogspot.com/2011/11/understanding-wal-nomenclature.html, I noticed a function to deal with
xlog location arithmetic is wanted. I remembered Depez [2]http://www.depesz.com/index.php/2011/01/24/waiting-for-9-1-pg_stat_replication/ mentioning it and
after some questions during trainings and conferences I decided to translate
my shell script function in C.

The attached patch implements the function pg_xlog_location_diff (bikeshed
colors are welcome). It calculates the difference between two given
transaction log locations. Now that we have pg_stat_replication view, it will
be easy to get the lag just passing columns as parameters. Also, the
monitoring tools could take advantage of it instead of relying on a fragile
routine to get the lag.

I noticed that pg_xlogfile_name* functions does not sanity check the xrecoff
boundaries but that is material for another patch.

[1]: http://eulerto.blogspot.com/2011/11/understanding-wal-nomenclature.html
[2]: http://www.depesz.com/index.php/2011/01/24/waiting-for-9-1-pg_stat_replication/
http://www.depesz.com/index.php/2011/01/24/waiting-for-9-1-pg_stat_replication/

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Attachments:

xlogdiff.patchtext/x-patch; name=xlogdiff.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ddfb29a..cce218a 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14342,6 +14342,9 @@ SELECT set_config('log_statement_stats', 'off', false);
    <indexterm>
     <primary>pg_xlogfile_name_offset</primary>
    </indexterm>
+   <indexterm>
+    <primary>pg_xlog_location_diff</primary>
+   </indexterm>
 
    <para>
     The functions shown in <xref
@@ -14414,6 +14417,13 @@ SELECT set_config('log_statement_stats', 'off', false);
        <entry><type>text</>, <type>integer</></entry>
        <entry>Convert transaction log location string to file name and decimal byte offset within file</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>pg_xlog_location_diff(<parameter>location</> <type>text</>, <parameter>location</> <type>text</>)</function></literal>
+        </entry>
+       <entry><type>bigint</></entry>
+       <entry>Calculate the difference between two transaction log locations</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -14507,6 +14517,13 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
    </para>
 
    <para>
+	<function>pg_xlog_location_diff</> calculates the difference in bytes
+	between two transaction log locations. It can be used with
+	<structname>pg_stat_replication</structname> or some functions shown in
+	<xref linkend="functions-admin-backup-table"> to get the replication lag.
+   </para>
+
+   <para>
     For details about proper usage of these functions, see
     <xref linkend="continuous-archiving">.
    </para>
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 22c6ca0..09e8369 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "replication/walreceiver.h"
 #include "storage/smgr.h"
 #include "utils/builtins.h"
+#include "utils/int8.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -465,3 +466,57 @@ pg_is_in_recovery(PG_FUNCTION_ARGS)
 {
 	PG_RETURN_BOOL(RecoveryInProgress());
 }
+
+/*
+ * Compute the difference in bytes between two WAL locations.
+ */
+Datum
+pg_xlog_location_diff(PG_FUNCTION_ARGS)
+{
+	text	*location1 = PG_GETARG_TEXT_P(0);
+	text	*location2 = PG_GETARG_TEXT_P(1);
+	char	*str1, *str2;
+	uint32	xlogid1, xrecoff1;
+	uint32	xlogid2, xrecoff2;
+	int64	tmp;
+	int64	result;
+
+	/*
+	 * Read and parse input
+	 */
+	str1 = text_to_cstring(location1);
+	str2 = text_to_cstring(location2);
+
+	if (sscanf(str1, "%8X/%8X", &xlogid1, &xrecoff1) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str1)));
+	if (sscanf(str2, "%8X/%8X", &xlogid2, &xrecoff2) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str2)));
+
+	/*
+	 * Sanity check
+	 */
+	if (xrecoff1 > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%X\" is out of valid range, 0..%X", xrecoff1, XLogFileSize)));
+	if (xrecoff2 > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%X\" is out of valid range, 0..%X", xrecoff2, XLogFileSize)));
+
+	/*
+	 * Use the int8 functions mainly for overflow detection
+	 *
+	 * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2
+	 */
+	tmp = DirectFunctionCall2(int8mi, xlogid1, xlogid2);
+	tmp = DirectFunctionCall2(int8mul, XLogFileSize, tmp);
+	tmp = DirectFunctionCall2(int8pl, tmp, xrecoff1);
+	result = DirectFunctionCall2(int8mi, tmp, xrecoff2);
+
+	PG_RETURN_INT64(result);
+}
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index cb43879..3e7340b 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -279,5 +279,6 @@ extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
 extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
 
 #endif   /* XLOG_INTERNAL_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 28e53b7..036d6ca 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2869,6 +2869,8 @@ DATA(insert OID = 2850 ( pg_xlogfile_name_offset	PGNSP PGUID 12 1 0 0 0 f f f t
 DESCR("xlog filename and byte offset, given an xlog location");
 DATA(insert OID = 2851 ( pg_xlogfile_name			PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ pg_xlogfile_name _null_ _null_ _null_ ));
 DESCR("xlog filename, given an xlog location");
+DATA(insert OID = 3129 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 20 "25 25" _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
+DESCR("difference in bytes, given two xlog locations");
 
 DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
#2Magnus Hagander
magnus@hagander.net
In reply to: Euler Taveira de Oliveira (#1)
Re: xlog location arithmetic

On Tue, Dec 6, 2011 at 05:19, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

Hi,

A while ago when blogging about WAL [1], I noticed a function to deal with
xlog location arithmetic is wanted. I remembered Depez [2] mentioning it and
after some questions during trainings and conferences I decided to translate
my shell script function in C.

The attached patch implements the function pg_xlog_location_diff (bikeshed
colors are welcome). It calculates the difference between two given
transaction log locations. Now that we have pg_stat_replication view, it will
be easy to get the lag just passing columns as parameters. Also, the
monitoring tools could take advantage of it instead of relying on a fragile
routine to get the lag.

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

Also, why do you use DirectFunctionCall to do the simple math, and not
just do the math right there in the function?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#3Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#2)
1 attachment(s)
Re: xlog location arithmetic

On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

I went so far as to put together an lsn data type. I didn't actually
get all that far with it, which is why I haven't posted it sooner, but
here's what I came up with. It's missing indexing support and stuff,
but that could be added if people like the approach. It solves this
problem by implementing -(lsn,lsn) => numeric (not int8, that can
overflow since it is not unsigned), which allows an lsn => numeric
conversion by just subtracting '0/0'::lsn.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

lsn.patchapplication/octet-stream; name=lsn.patchDownload
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 5f968b0..e79aeef 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -18,7 +18,7 @@ endif
 OBJS = acl.o arrayfuncs.o array_userfuncs.o arrayutils.o bool.o \
 	cash.o char.o date.o datetime.o datum.o domains.o \
 	enum.o float.o format_type.o \
-	geo_ops.o geo_selfuncs.o int.o int8.o like.o lockfuncs.o \
+	geo_ops.o geo_selfuncs.o int.o int8.o like.o lockfuncs.o lsn.o \
 	misc.o nabstime.o name.o numeric.o numutils.o \
 	oid.o oracle_compat.o pseudotypes.o rangetypes.o rangetypes_gist.o \
 	rowtypes.o regexp.o regproc.o ruleutils.o selfuncs.o \
diff --git a/src/backend/utils/adt/lsn.c b/src/backend/utils/adt/lsn.c
new file mode 100644
index 0000000..839ff37
--- /dev/null
+++ b/src/backend/utils/adt/lsn.c
@@ -0,0 +1,189 @@
+/*-------------------------------------------------------------------------
+ *
+ * lsn.c
+ *	  Internal LSN operations
+ *
+ * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * IDENTIFICATION
+ *	  src/backend/utils/adt/lsn.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "libpq/pqformat.h"
+#include "utils/builtins.h"
+#include "utils/lsn.h"
+
+#define MAXINT4LEN		12
+#define MAXINT8LEN		25
+#define MAXLSNLEN		17
+#define MAXLSNCOMPONENT	8
+
+/*----------------------------------------------------------
+ * Formatting and conversion routines.
+ *---------------------------------------------------------*/
+
+Datum
+lsn_in(PG_FUNCTION_ARGS)
+{
+	char	   *str = PG_GETARG_CSTRING(0);
+	int			len1,
+				len2;
+	XLogRecPtr *result;
+
+	/* Sanity check input format. */
+	len1 = strspn(str, "0123456789abcdefABCDEF");
+	if (len1 < 1 || len1 > MAXLSNCOMPONENT || str[len1] != '/')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for lsn: \"%s\"", str)));
+	len2 = strspn(str + len1 + 1, "0123456789abcdefABCDEF");
+	if (len2 < 1 || len2 > MAXLSNCOMPONENT || str[len1 + 1 + len2] != '\0')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for lsn: \"%s\"", str)));
+
+	/* Decode result. */
+	result = palloc(sizeof(XLogRecPtr));
+	result->xlogid = (uint32) strtoul(str, NULL, 16);
+	result->xrecoff = (uint32) strtoul(str + len1 + 1, NULL, 16);
+
+	PG_RETURN_POINTER(result);
+}
+
+Datum
+lsn_out(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	char		buf[MAXLSNLEN + 1];
+	char	   *result;
+
+	sprintf(buf, "%X/%X", lsn->xlogid, lsn->xrecoff);
+	result = pstrdup(buf);
+	PG_RETURN_CSTRING(result);
+}
+
+Datum
+lsn_recv(PG_FUNCTION_ARGS)
+{
+	StringInfo	buf = (StringInfo) PG_GETARG_POINTER(0);
+	XLogRecPtr *result;
+
+	result = palloc(sizeof(XLogRecPtr));
+	result->xlogid = pq_getmsgint(buf, 4);
+	result->xrecoff = pq_getmsgint(buf, 4);
+
+	PG_RETURN_POINTER(result);
+}
+
+Datum
+lsn_send(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	StringInfoData buf;
+
+	pq_begintypsend(&buf);
+	pq_sendint(&buf, lsn->xlogid, 4);
+	pq_sendint(&buf, lsn->xrecoff, 4);
+	PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
+}
+
+
+/*----------------------------------------------------------
+ *	Relational operators for LSNs
+ *---------------------------------------------------------*/
+
+Datum
+lsn_eq(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn1 = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	XLogRecPtr *lsn2 = (XLogRecPtr *) PG_GETARG_POINTER(1);
+
+	PG_RETURN_BOOL(XLByteEQ(*lsn1, *lsn2));
+}
+
+Datum
+lsn_ne(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn1 = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	XLogRecPtr *lsn2 = (XLogRecPtr *) PG_GETARG_POINTER(1);
+
+	PG_RETURN_BOOL(!XLByteEQ(*lsn1, *lsn2));
+}
+
+Datum
+lsn_lt(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn1 = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	XLogRecPtr *lsn2 = (XLogRecPtr *) PG_GETARG_POINTER(1);
+
+	PG_RETURN_BOOL(XLByteLT(*lsn1, *lsn2));
+}
+
+Datum
+lsn_gt(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn1 = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	XLogRecPtr *lsn2 = (XLogRecPtr *) PG_GETARG_POINTER(1);
+
+	/* note order of arguments */
+	PG_RETURN_BOOL(XLByteLT(*lsn2, *lsn1));
+}
+
+Datum
+lsn_le(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn1 = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	XLogRecPtr *lsn2 = (XLogRecPtr *) PG_GETARG_POINTER(1);
+
+	PG_RETURN_BOOL(XLByteLE(*lsn1, *lsn2));
+}
+
+Datum
+lsn_ge(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn1 = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	XLogRecPtr *lsn2 = (XLogRecPtr *) PG_GETARG_POINTER(1);
+
+	/* note order of arguments */
+	PG_RETURN_BOOL(XLByteLE(*lsn2, *lsn1));
+}
+
+
+/*----------------------------------------------------------
+ *	Arithmetic operators on LSNs.
+ *---------------------------------------------------------*/
+
+Datum
+lsn_mi(PG_FUNCTION_ARGS)
+{
+	XLogRecPtr *lsn1 = (XLogRecPtr *) PG_GETARG_POINTER(0);
+	XLogRecPtr *lsn2 = (XLogRecPtr *) PG_GETARG_POINTER(1);
+	uint64		v1;
+	uint64		v2;
+	char		buf[256];
+	Datum		result;
+
+	/* Flatten values to 64-bit unsigned integers. */
+	v1 = (((uint64) lsn1->xlogid) << 32) | ((uint64) lsn1->xrecoff);
+	v2 = (((uint64) lsn2->xlogid) << 32) | ((uint64) lsn2->xrecoff);
+
+	/* Negative results are not allowed. */
+	if (v1 < v2)
+		ereport(ERROR,
+				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+				 errmsg("LSN out of range")));
+
+	/* Convert to numeric. */
+	sprintf(buf, UINT64_FORMAT, v1 - v2);
+	result = DirectFunctionCall3(numeric_in,
+								 CStringGetDatum(buf),
+								 ObjectIdGetDatum(0),
+								 Int32GetDatum(-1));
+
+	return result;
+}
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index eac5cb9..07b8f63 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1578,6 +1578,22 @@ DESCR("less than or equal");
 DATA(insert OID = 2977 (  ">="	   PGNSP PGUID b f f 2950 2950 16 2976 2974 uuid_ge scalargtsel scalargtjoinsel ));
 DESCR("greater than or equal");
 
+/* lsn operators */
+DATA(insert OID = 3788 (  "="	   PGNSP PGUID b f f 3813 3813 16 3788 3789 lsn_eq eqsel eqjoinsel ));
+DESCR("equal");
+DATA(insert OID = 3789 (  "<>"	   PGNSP PGUID b f f 3813 3813 16 3789 3788 lsn_ne neqsel neqjoinsel ));
+DESCR("not equal");
+DATA(insert OID = 3790 (  "<"	   PGNSP PGUID b f f 3813 3813 16 3791 3793 lsn_lt scalarltsel scalarltjoinsel ));
+DESCR("less than");
+DATA(insert OID = 3791 (  ">"	   PGNSP PGUID b f f 3813 3813 16 3790 3792 lsn_gt scalargtsel scalargtjoinsel ));
+DESCR("greater than");
+DATA(insert OID = 3792 (  "<="	   PGNSP PGUID b f f 3813 3813 16 3793 3791 lsn_le scalarltsel scalarltjoinsel ));
+DESCR("less than or equal");
+DATA(insert OID = 3793 (  ">="	   PGNSP PGUID b f f 3813 3813 16 3792 3790 lsn_ge scalargtsel scalargtjoinsel ));
+DESCR("greater than or equal");
+DATA(insert OID = 3795 (  "-"	   PGNSP PGUID b f f 3813 3813 1700    0	0 lsn_mi - - ));
+DESCR("minus");
+
 /* enum operators */
 DATA(insert OID = 3516 (  "="	   PGNSP PGUID b t t 3500 3500 16 3516 3517 enum_eq eqsel eqjoinsel ));
 DESCR("equal");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 28e53b7..0ddc956 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3983,6 +3983,23 @@ DESCR("I/O");
 DATA(insert OID = 2963 (  uuid_hash		   PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 23 "2950" _null_ _null_ _null_ _null_ uuid_hash _null_ _null_ _null_ ));
 DESCR("hash");
 
+/* lsn */
+DATA(insert OID = 3778 (  lsn_in		   PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 3813 "2275" _null_ _null_ _null_ _null_ lsn_in _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 3779 (  lsn_out		   PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 2275 "3813" _null_ _null_ _null_ _null_ lsn_out _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 3780 (  lsn_lt		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "3813 3813" _null_ _null_ _null_ _null_ lsn_lt _null_ _null_ _null_ ));
+DATA(insert OID = 3781 (  lsn_le		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "3813 3813" _null_ _null_ _null_ _null_ lsn_le _null_ _null_ _null_ ));
+DATA(insert OID = 3782 (  lsn_eq		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "3813 3813" _null_ _null_ _null_ _null_ lsn_eq _null_ _null_ _null_ ));
+DATA(insert OID = 3783 (  lsn_ge		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "3813 3813" _null_ _null_ _null_ _null_ lsn_ge _null_ _null_ _null_ ));
+DATA(insert OID = 3784 (  lsn_gt		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "3813 3813" _null_ _null_ _null_ _null_ lsn_gt _null_ _null_ _null_ ));
+DATA(insert OID = 3785 (  lsn_ne		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 16 "3813 3813" _null_ _null_ _null_ _null_ lsn_ne _null_ _null_ _null_ ));
+DATA(insert OID = 3794 (  lsn_mi		   PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 1700 "3813 3813" _null_ _null_ _null_ _null_ lsn_mi _null_ _null_ _null_ ));
+DATA(insert OID = 3786 (  lsn_recv		   PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 3813 "2281" _null_ _null_ _null_ _null_ lsn_recv _null_ _null_ _null_ ));
+DESCR("I/O");
+DATA(insert OID = 3787 (  lsn_send		   PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 17 "3813" _null_ _null_ _null_ _null_ lsn_send _null_ _null_ _null_ ));
+DESCR("I/O");
+
 /* enum related procs */
 DATA(insert OID = 3504 (  anyenum_in	PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 3500 "2275" _null_ _null_ _null_ _null_ anyenum_in _null_ _null_ _null_ ));
 DESCR("I/O");
diff --git a/src/include/catalog/pg_type.h b/src/include/catalog/pg_type.h
index 406241a..aec9a24 100644
--- a/src/include/catalog/pg_type.h
+++ b/src/include/catalog/pg_type.h
@@ -565,6 +565,11 @@ DATA(insert OID = 2950 ( uuid			PGNSP PGUID 16 f b U f t \054 0 0 2951 uuid_in u
 DESCR("UUID datatype");
 DATA(insert OID = 2951 ( _uuid			PGNSP PGUID -1 f b A f t \054 0 2950 0 array_in array_out array_recv array_send - - - i x f 0 -1 0 0 _null_ _null_ ));
 
+/* lsn */
+DATA(insert OID = 3813 ( lsn			PGNSP PGUID 16 f b U f t \054 0 0 3814 lsn_in lsn_out lsn_recv lsn_send - - - i p f 0 -1 0 0 _null_ _null_ ));
+DESCR("LSN datatype");
+DATA(insert OID = 3814 ( _lsn			PGNSP PGUID -1 f b A f t \054 0 3813 0 array_in array_out array_recv array_send - - - i x f 0 -1 0 0 _null_ _null_ ));
+
 /* text search */
 DATA(insert OID = 3614 ( tsvector		PGNSP PGUID -1 f b U f t \054 0 0 3643 tsvectorin tsvectorout tsvectorrecv tsvectorsend - - ts_typanalyze i x f 0 -1 0 0 _null_ _null_ ));
 DESCR("text representation for text search");
diff --git a/src/include/utils/lsn.h b/src/include/utils/lsn.h
new file mode 100644
index 0000000..7e4be7b
--- /dev/null
+++ b/src/include/utils/lsn.h
@@ -0,0 +1,33 @@
+/*-------------------------------------------------------------------------
+ *
+ * lsn.h
+ *	  Declarations for operations on log sequence numbers (LSNs).
+ *
+ *
+ * Portions Copyright (c) 1996-2011, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/lsn.h
+ *
+ *-------------------------------------------------------------------------
+ */
+#ifndef LSN_H
+#define LSN_H
+
+#include "fmgr.h"
+
+extern Datum lsn_in(PG_FUNCTION_ARGS);
+extern Datum lsn_out(PG_FUNCTION_ARGS);
+extern Datum lsn_recv(PG_FUNCTION_ARGS);
+extern Datum lsn_send(PG_FUNCTION_ARGS);
+
+extern Datum lsn_eq(PG_FUNCTION_ARGS);
+extern Datum lsn_ne(PG_FUNCTION_ARGS);
+extern Datum lsn_lt(PG_FUNCTION_ARGS);
+extern Datum lsn_gt(PG_FUNCTION_ARGS);
+extern Datum lsn_le(PG_FUNCTION_ARGS);
+extern Datum lsn_ge(PG_FUNCTION_ARGS);
+
+extern Datum lsn_mi(PG_FUNCTION_ARGS);
+
+#endif   /* LSN_H */
In reply to: Magnus Hagander (#2)
Re: xlog location arithmetic

On 06-12-2011 07:14, Magnus Hagander wrote:

On Tue, Dec 6, 2011 at 05:19, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

Hi,

A while ago when blogging about WAL [1], I noticed a function to deal with
xlog location arithmetic is wanted. I remembered Depez [2] mentioning it and
after some questions during trainings and conferences I decided to translate
my shell script function in C.

The attached patch implements the function pg_xlog_location_diff (bikeshed
colors are welcome). It calculates the difference between two given
transaction log locations. Now that we have pg_stat_replication view, it will
be easy to get the lag just passing columns as parameters. Also, the
monitoring tools could take advantage of it instead of relying on a fragile
routine to get the lag.

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

The only advantage is that you don't expose the arithmetic, e.g., user doesn't
need to know the xlog internals (like I described in a recent blog post). If
one day we consider changes in xlog arithmetic (for example, XLogFileSize), we
don't need to worry too much about external tools.

Also, why do you use DirectFunctionCall to do the simple math, and not
just do the math right there in the function?

I use it because I don't want to duplicate the overflow code.

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

In reply to: Robert Haas (#3)
Re: xlog location arithmetic

On 06-12-2011 13:11, Robert Haas wrote:

On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

I went so far as to put together an lsn data type. I didn't actually
get all that far with it, which is why I haven't posted it sooner, but
here's what I came up with. It's missing indexing support and stuff,
but that could be added if people like the approach. It solves this
problem by implementing -(lsn,lsn) => numeric (not int8, that can
overflow since it is not unsigned), which allows an lsn => numeric
conversion by just subtracting '0/0'::lsn.

Interesting approach. I don't want to go that far. If so, you want to change
all of those functions that deal with LSNs and add some implicit conversion
between text and lsn data types (for backward compatibility). As of int8, I'm
not aware of any modern plataform that int8 is not 64 bits. I'm not against
numeric use; I'm just saying that int8 is sufficient.

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#6Robert Haas
robertmhaas@gmail.com
In reply to: Euler Taveira de Oliveira (#5)
Re: xlog location arithmetic

On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 06-12-2011 13:11, Robert Haas wrote:

On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

I went so far as to put together an lsn data type.  I didn't actually
get all that far with it, which is why I haven't posted it sooner, but
here's what I came up with.  It's missing indexing support and stuff,
but that could be added if people like the approach.  It solves this
problem by implementing -(lsn,lsn) => numeric (not int8, that can
overflow since it is not unsigned), which allows an lsn => numeric
conversion by just subtracting '0/0'::lsn.

Interesting approach. I don't want to go that far. If so, you want to change
all of those functions that deal with LSNs and add some implicit conversion
between text and lsn data types (for backward compatibility). As of int8, I'm
not aware of any modern plataform that int8 is not 64 bits. I'm not against
numeric use; I'm just saying that int8 is sufficient.

The point isn't that int8 might not be 64 bits - of course it has to
be 64 bits; that's why it's called int8 i.e. 8 bytes. The point is
that a large enough LSN, represented as an int8, will come out as a
negative values. int8 can only represent 2^63 *non-negative* values,
because one bit is reserved for sign.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7Jim Nasby
jim@nasby.net
In reply to: Robert Haas (#6)
Re: xlog location arithmetic

On Dec 6, 2011, at 12:06 PM, Robert Haas wrote:

On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 06-12-2011 13:11, Robert Haas wrote:

On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

I went so far as to put together an lsn data type. I didn't actually
get all that far with it, which is why I haven't posted it sooner, but
here's what I came up with. It's missing indexing support and stuff,
but that could be added if people like the approach. It solves this
problem by implementing -(lsn,lsn) => numeric (not int8, that can
overflow since it is not unsigned), which allows an lsn => numeric
conversion by just subtracting '0/0'::lsn.

Interesting approach. I don't want to go that far. If so, you want to change
all of those functions that deal with LSNs and add some implicit conversion
between text and lsn data types (for backward compatibility). As of int8, I'm
not aware of any modern plataform that int8 is not 64 bits. I'm not against
numeric use; I'm just saying that int8 is sufficient.

The point isn't that int8 might not be 64 bits - of course it has to
be 64 bits; that's why it's called int8 i.e. 8 bytes. The point is
that a large enough LSN, represented as an int8, will come out as a
negative values. int8 can only represent 2^63 *non-negative* values,
because one bit is reserved for sign.

I've often wondered about adding uint2/4/8... I suspect it's actually pretty uncommon for people to put negative numbers into int fields, since one of their biggest uses seems to be surrogate keys.

I realize that this opens a can of worms with casting, but perhaps that can be kept under control by not doing any implicit casting between int and uint... that just means that we'd have to be smart about casting from unknown, but hopefully that's doable since we already have a similar concern with casting unknown to int2/4/8 vs numeric?
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

#8Robert Haas
robertmhaas@gmail.com
In reply to: Jim Nasby (#7)
Re: xlog location arithmetic

On Tue, Dec 13, 2011 at 12:48 PM, Jim Nasby <jim@nasby.net> wrote:

I've often wondered about adding uint2/4/8... I suspect it's actually pretty uncommon for people to put negative numbers into int fields, since one of their biggest uses seems to be surrogate keys.

I realize that this opens a can of worms with casting, but perhaps that can be kept under control by not doing any implicit casting between int and uint... that just means that we'd have to be smart about casting from unknown, but hopefully that's doable since we already have a similar concern with casting unknown to int2/4/8 vs numeric?

I've wondered about it too, but it seems like too large a can of worms
to open just to address this case. Returning the value as numeric is
hardly a disaster; the user can always downcast to int8 if they really
want, and as long as it's < 2^63 (which in practice it virtually
always will be) it will work. It's not clear what the point of this
is since for typical values numeric is going to take up less storage
anyway (e.g. 1000001 is 7 bytes on disk as a numeric), not to mention
that it only requires 4-byte alignment rather than 8-byte alignment,
and probably no one does enough arithmetic with LSN values for any
speed penalty to matter even slightly, but it should work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#6)
Re: xlog location arithmetic

On Tue, Dec 6, 2011 at 19:06, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 06-12-2011 13:11, Robert Haas wrote:

On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

I went so far as to put together an lsn data type.  I didn't actually
get all that far with it, which is why I haven't posted it sooner, but
here's what I came up with.  It's missing indexing support and stuff,
but that could be added if people like the approach.  It solves this
problem by implementing -(lsn,lsn) => numeric (not int8, that can
overflow since it is not unsigned), which allows an lsn => numeric
conversion by just subtracting '0/0'::lsn.

Interesting approach. I don't want to go that far. If so, you want to change
all of those functions that deal with LSNs and add some implicit conversion
between text and lsn data types (for backward compatibility). As of int8, I'm

As long as you have the conversion, you don't really need to change
them, do you? It might be nice in some ways, but this is still a
pretty internal operation, so I don't see it as critical.

not aware of any modern plataform that int8 is not 64 bits. I'm not against
numeric use; I'm just saying that int8 is sufficient.

The point isn't that int8 might not be 64 bits - of course it has to
be 64 bits; that's why it's called int8 i.e. 8 bytes.  The point is
that a large enough LSN, represented as an int8, will come out as a
negative values.  int8 can only represent 2^63 *non-negative* values,
because one bit is reserved for sign.

Doing it in numeric should be perfectly fine. The only real reason to
pick int8 over in this context would be performance, but it's not like
this is something that's going to be called in really performance
critical paths...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

In reply to: Magnus Hagander (#9)
Re: xlog location arithmetic

On 20-12-2011 07:27, Magnus Hagander wrote:

On Tue, Dec 6, 2011 at 19:06, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 06-12-2011 13:11, Robert Haas wrote:

On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

I went so far as to put together an lsn data type. I didn't actually
get all that far with it, which is why I haven't posted it sooner, but
here's what I came up with. It's missing indexing support and stuff,
but that could be added if people like the approach. It solves this
problem by implementing -(lsn,lsn) => numeric (not int8, that can
overflow since it is not unsigned), which allows an lsn => numeric
conversion by just subtracting '0/0'::lsn.

Interesting approach. I don't want to go that far. If so, you want to change
all of those functions that deal with LSNs and add some implicit conversion
between text and lsn data types (for backward compatibility). As of int8, I'm

As long as you have the conversion, you don't really need to change
them, do you? It might be nice in some ways, but this is still a
pretty internal operation, so I don't see it as critical.

For correctness, yes.

At this point, my question is: do we want to support the lsn data type idea or
a basic function that implements the difference between LSNs?

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#11Magnus Hagander
magnus@hagander.net
In reply to: Euler Taveira de Oliveira (#10)
Re: xlog location arithmetic

On Tue, Dec 20, 2011 at 14:08, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 20-12-2011 07:27, Magnus Hagander wrote:

On Tue, Dec 6, 2011 at 19:06, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Dec 6, 2011 at 1:00 PM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 06-12-2011 13:11, Robert Haas wrote:

On Tue, Dec 6, 2011 at 5:14 AM, Magnus Hagander <magnus@hagander.net> wrote:

I've been considering similar things, as you can find in the archives,
but what I was thinking of was converting the number to just a plain
bigint, then letting the user apply whatever arithmetic wanted at the
SQL level. I never got around to acutally coding it, though. It could
easily be extracted from your patch of course - and I think that's a
more flexible approach. Is there some advantage to your method that
I'm missing?

I went so far as to put together an lsn data type.  I didn't actually
get all that far with it, which is why I haven't posted it sooner, but
here's what I came up with.  It's missing indexing support and stuff,
but that could be added if people like the approach.  It solves this
problem by implementing -(lsn,lsn) => numeric (not int8, that can
overflow since it is not unsigned), which allows an lsn => numeric
conversion by just subtracting '0/0'::lsn.

Interesting approach. I don't want to go that far. If so, you want to change
all of those functions that deal with LSNs and add some implicit conversion
between text and lsn data types (for backward compatibility). As of int8, I'm

As long as you have the conversion, you don't really need to change
them, do you? It might be nice in some ways, but this is still a
pretty internal operation, so I don't see it as critical.

For correctness, yes.

At this point, my question is: do we want to support the lsn data type idea or
a basic function that implements the difference between LSNs?

Personally I think a function is enough - it solves the only case that
I've actually seen. But a datatype would be a more complete solution,
of course - but it seems a bit of an overkill to me. Not really sure
which way we should go - I was hoping somebody else would comment as
well..

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#11)
Re: xlog location arithmetic

Magnus Hagander <magnus@hagander.net> writes:

At this point, my question is: do we want to support the lsn data type idea or
a basic function that implements the difference between LSNs?

Personally I think a function is enough - it solves the only case that
I've actually seen. But a datatype would be a more complete solution,
of course - but it seems a bit of an overkill to me. Not really sure
which way we should go - I was hoping somebody else would comment as
well..

I too think a datatype is overkill, if we're only planning on providing
one function. Just emit the values as numeric and have done.

regards, tom lane

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: xlog location arithmetic

On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I too think a datatype is overkill, if we're only planning on providing
one function.

Are there any other functions we ought to provide?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#12)
Re: xlog location arithmetic

On 12/23/2011 10:05 AM, Tom Lane wrote:

Magnus Hagander<magnus@hagander.net> writes:

At this point, my question is: do we want to support the lsn data type idea or
a basic function that implements the difference between LSNs?

Personally I think a function is enough - it solves the only case that
I've actually seen. But a datatype would be a more complete solution,
of course - but it seems a bit of an overkill to me. Not really sure
which way we should go - I was hoping somebody else would comment as
well..

I too think a datatype is overkill, if we're only planning on providing
one function. Just emit the values as numeric and have done.

+1.

cheers

andrew

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: xlog location arithmetic

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I too think a datatype is overkill, if we're only planning on providing
one function.

Are there any other functions we ought to provide?

Even if there are several, what exact advantage does a datatype offer
over representing LSN values as numerics? It seems to me to be adding
complication and extra code (I/O converters at least) for very little
gain.

regards, tom lane

#16Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#15)
Re: xlog location arithmetic

On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Dec 23, 2011 at 10:05 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I too think a datatype is overkill, if we're only planning on providing
one function.

Are there any other functions we ought to provide?

Even if there are several, what exact advantage does a datatype offer
over representing LSN values as numerics?  It seems to me to be adding
complication and extra code (I/O converters at least) for very little
gain.

I guess I'm just constitutionally averse to labeling things as "text"
when they really aren't. I do it all the time in Perl, of course, but
in PostgreSQL we have strong data typing, and it seems like we might
as well use it.

Also, we've occasionally talked (in the light of Pavan's single-pass
vacuum patch, for example) about needing to store LSNs in system
catalogs; and we're certainly not going to want to do that as text.
I'll admit that it's not 100% clear that anything like this will ever
happen, though, so maybe it's premature to worry about it.

I can see I'm in the minority on this one, though...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#16)
Re: xlog location arithmetic

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Even if there are several, what exact advantage does a datatype offer
over representing LSN values as numerics? It seems to me to be adding
complication and extra code (I/O converters at least) for very little
gain.

I guess I'm just constitutionally averse to labeling things as "text"
when they really aren't.

Er ... text? I thought the proposal was to use numeric.

regards, tom lane

#18Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#17)
Re: xlog location arithmetic

On Fri, Dec 23, 2011 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Dec 23, 2011 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Even if there are several, what exact advantage does a datatype offer
over representing LSN values as numerics?  It seems to me to be adding
complication and extra code (I/O converters at least) for very little
gain.

I guess I'm just constitutionally averse to labeling things as "text"
when they really aren't.

Er ... text?  I thought the proposal was to use numeric.

The proposal is to make a function that takes a text argument (which
is really an LSN, but we choose to represent it as text) and returns
numeric.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19Fujii Masao
masao.fujii@gmail.com
In reply to: Euler Taveira de Oliveira (#1)
Re: xlog location arithmetic

On Tue, Dec 6, 2011 at 1:19 PM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

Hi,

A while ago when blogging about WAL [1], I noticed a function to deal with
xlog location arithmetic is wanted. I remembered Depez [2] mentioning it and
after some questions during trainings and conferences I decided to translate
my shell script function in C.

The attached patch implements the function pg_xlog_location_diff (bikeshed
colors are welcome). It calculates the difference between two given
transaction log locations. Now that we have pg_stat_replication view, it will
be easy to get the lag just passing columns as parameters. Also, the
monitoring tools could take advantage of it instead of relying on a fragile
routine to get the lag.

I noticed that pg_xlogfile_name* functions does not sanity check the xrecoff
boundaries but that is material for another patch.

[1] http://eulerto.blogspot.com/2011/11/understanding-wal-nomenclature.html
[2]
http://www.depesz.com/index.php/2011/01/24/waiting-for-9-1-pg_stat_replication/

I think that this function is very useful. Can you add the patch into
CommitFest 2012-1 ?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In reply to: Fujii Masao (#19)
Re: xlog location arithmetic

On 14-01-2012 11:06, Fujii Masao wrote:

I think that this function is very useful. Can you add the patch into
CommitFest 2012-1 ?

Sure. But I must adjust the patch based on the thread comments (basically,
numeric output). I have a new patch but need to test it before submitting it.
I'll post this weekend.

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#21Greg Smith
greg@2ndQuadrant.com
In reply to: Euler Taveira de Oliveira (#20)
Re: xlog location arithmetic

On 01/14/2012 09:12 AM, Euler Taveira de Oliveira wrote:

But I must adjust the patch based on the thread comments (basically,
numeric output). I have a new patch but need to test it before submitting it.
I'll post this weekend.

It's now at https://commitfest.postgresql.org/action/patch_view?id=776
listed as waiting on you right now. It's good to put patches into the
CF application early. Helps planning, and gives a safety net against
all sorts of things. We wouldn't want something this obviously useful
to get kicked out if, for example, you lost your Internet connection
over the weekend and then didn't technically qualify as having submitted
it there before the deadline. As someone who sweated today for two
hours when my power at home was turned off to install a new circuit
breaker, I'm feeling particularly paranoid right now about that sort of
thing here.

The fact that you got some review feedback before the official CF start
doesn't mean you can't be listed there right now. In fact, those are
things I like to see tracked. Having links to all of the e-mail
messages that were important turning points for a feature that changed
during review is very helpful to reviewers and committers. And the
easiest way to keep up with that is to start as early as possible: add
it to the app right after the first patch submission.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

#22Gurjeet Singh
singh.gurjeet@gmail.com
In reply to: Greg Smith (#21)
Re: xlog location arithmetic

On Sat, Jan 14, 2012 at 8:18 PM, Greg Smith <greg@2ndquadrant.com> wrote:

On 01/14/2012 09:12 AM, Euler Taveira de Oliveira wrote:

But I must adjust the patch based on the thread comments (basically,
numeric output). I have a new patch but need to test it before submitting
it.
I'll post this weekend.

It's now at https://commitfest.postgresql.**org/action/patch_view?id=776&lt;https://commitfest.postgresql.org/action/patch_view?id=776&gt;listed as waiting on you right now. It's good to put patches into the CF
application early. Helps planning, and gives a safety net against all
sorts of things. We wouldn't want something this obviously useful to get
kicked out if, for example, you lost your Internet connection over the
weekend and then didn't technically qualify as having submitted it there
before the deadline. As someone who sweated today for two hours when my
power at home was turned off to install a new circuit breaker, I'm feeling
particularly paranoid right now about that sort of thing here.
he patch
The fact that you got some review feedback before the official CF start
doesn't mean you can't be listed there right now. In fact, those are
things I like to see tracked. Having links to all of the e-mail messages
that were important turning points for a feature that changed during review
is very helpful to reviewers and committers. And the easiest way to keep
up with that is to start as early as possible: add it to the app right
after the first patch submission.

I agree.

So lets make it easy for the patch submitter to start the process. I
propose that we have a page in the CF application where people can
upload/attach the patch, and the app posts the patch to -hackers and uses
the post URL to create the CF entry.

Regards,
--
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

#23Greg Smith
greg@2ndQuadrant.com
In reply to: Gurjeet Singh (#22)
Re: automating CF submissions (was xlog location arithmetic)

On 01/14/2012 10:49 PM, Gurjeet Singh wrote:

So lets make it easy for the patch submitter to start the process. I
propose that we have a page in the CF application where people can
upload/attach the patch, and the app posts the patch to -hackers and
uses the post URL to create the CF entry.

That would be nice, but there's at least two serious problems with it,
which I would guess are both unsolvable without adding an unsupportable
amount of work to the current PostgreSQL web team. First, it is
technically risky for a web application hosted on postgresql.org to be
e-mailing this list. There are some things in the infrastructure that
do that already--I believe the pgsql-commiters list being driven from
commits is the busiest such bot. But all of the ones that currently
exist are either moderated, have a limited number of approved
submitters, or both.

If it were possible for a bot to create a postgresql.org community
account, then trigger an e-mail to pgsql-hackers just by filling out a
web form, I'd give it maybe six months before it has to be turned off
for a bit--because there are thousands messages queued up once the first
bored spammer figures that out. Securing web to e-mail gateways is a
giant headache, and everyone working on the PostgreSQL infrastructure
who might work on that is already overloaded with community volunteer
work. There's an element of zero-sum game here: while this would
provide some assistance to new contributors, the time to build and
maintain the thing would be coming mainly out of senior contributors. I
see the gain+risk vs. reward here skewed the wrong way.

Second, e-mail provides some level of validation that patches being
submitted are coming from the person they claim. We currently reject
patches that are only shared with the community on the web, via places
like github. The process around this mailing list tries to make it
clear sending patches to here is a code submission under the PostgreSQL
license. And e-mail nowadays keeps increasing the number of checks that
confirm it's coming from the person it claims sent it. I can go check
into the DKIM credentials your Gmail message to the list contained if
I'd like, to help confirm it really came from your account. E-mail
headers are certainly not perfectly traceable and audit-able, but they
are far better than what you'd get from a web submission. Little audit
trail there beyond "came from this IP address".

One unicorn I would like to have here would give the CF app a database
of recent e-mails to pgsql-hackers. I login to the CF app, click on
"Add recent submission", and anything matching my e-mail address appears
with a checkbox next to it. Click on the patch submissions, and then
something like you described would happen. That would save me the
annoying work around looking up message IDs so much.

The role CF manager would benefit even more from infrastructure like
that too. Something that listed all the recent e-mail messages for an
existing submission, such that you could just click on the ones that you
wanted added to the patch's e-mail history, would save me personally
enough time that I could probably even justify writing it.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

#24Magnus Hagander
magnus@hagander.net
In reply to: Greg Smith (#23)
Re: automating CF submissions (was xlog location arithmetic)

On Sun, Jan 15, 2012 at 05:44, Greg Smith <greg@2ndquadrant.com> wrote:

On 01/14/2012 10:49 PM, Gurjeet Singh wrote:

So lets make it easy for the patch submitter to start the process. I
propose that we have a page in the CF application where people can
upload/attach the patch, and the app posts the patch to -hackers and uses
the post URL to create the CF entry.

That would be nice, but there's at least two serious problems with it, which
I would guess are both unsolvable without adding an unsupportable amount of
work to the current PostgreSQL web team.  First, it is technically risky for
a web application hosted on postgresql.org to be e-mailing this list.  There
are some things in the infrastructure that do that already--I believe the
pgsql-commiters list being driven from commits is the busiest such bot.  But
all of the ones that currently exist are either moderated, have a limited
number of approved submitters, or both.

It's not really a problem from that perspective, as long as it
requires the user to be logged in. The mail would be sent from the
users account, with that one as a sender, and thus be exposed to the
same moderation rules as the rest of the list posts.

If it were possible for a bot to create a postgresql.org community account,
then trigger an e-mail to pgsql-hackers just by filling out a web form, I'd
give it maybe six months before it has to be turned off for a bit--because
there are thousands messages queued up once the first bored spammer figures

Said bot can already use the bug report form *without* having to sign
up for an account.

Or said bot could submit news or events, which trigger an email to at
least some lists, which hasn't bene done.

It's supposedly not easy for a bot to sign up for a community account,
since it requires you to have access to the email address it's
registered on. If that doesn't work, it's a bug and needs to be fixed
regardless.

that out.  Securing web to e-mail gateways is a giant headache, and everyone
working on the PostgreSQL infrastructure who might work on that is already
overloaded with community volunteer work.  There's an element of zero-sum

We've already solved that problem for other situtations, and given how
the infrastructure is built, that's fairly easy to replicate to
another node.

I think the bigger problem is "who'll write it". AFAIK, the CF app
*itself* is even more person- and time-constrained to senior
developers (Robert Haas only) than the infrastructure, and that's a
bigger problem. There are already a bunch of things that are a lot
simpler than this that has been pending on that one for well over half
a year.

Second, e-mail provides some level of validation that patches being
submitted are coming from the person they claim.  We currently reject
patches that are only shared with the community on the web, via places like
github.  The process around this mailing list tries to make it clear sending
patches to here is a code submission under the PostgreSQL license.  And
e-mail nowadays keeps increasing the number of checks that confirm it's
coming from the person it claims sent it.  I can go check into the DKIM
credentials your Gmail message to the list contained if I'd like, to help
confirm it really came from your account.  E-mail headers are certainly not

I think DKIM was a bad example, because AFAIK our lists mangle DKIM
and thus actually show them as *invalid* for at least the majority of
messages...

One unicorn I would like to have here would give the CF app a database of
recent e-mails to pgsql-hackers.  I login to the CF app, click on "Add
recent submission", and anything matching my e-mail address appears with a
checkbox next to it.  Click on the patch submissions, and then something
like you described would happen.  That would save me the annoying work
around looking up message IDs so much.

That would be neat.

And FWIW, I'd find it a lot more useful for the CF app to have the
ability to post *reviews* in it, that would end up being properly
threaded. The way it is now, half the reviewers create a *new* thread
to post their reviews on, making it a PITA to keep track of those
patches on the list at all, which somewhat takes away the whole idea
of "mail being the primary way to track it". Not saying it's critical,
but I'd put it a lot higher on the list than being able to post the
initial patch.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#25Greg Smith
greg@2ndQuadrant.com
In reply to: Magnus Hagander (#24)
Re: automating CF submissions (was xlog location arithmetic)

On 01/15/2012 03:17 AM, Magnus Hagander wrote:

And FWIW, I'd find it a lot more useful for the CF app to have the
ability to post *reviews* in it, that would end up being properly
threaded.

Next you'll be saying we should have some sort of web application to
help with the whole review process, show the work on an integrated sort
of Review Board or something. What crazy talk.

My contribution toward patch review ease for this week is that peg is
quite a bit smarter about referring to the correct part of the origin
git repo now, when you've been working on a branch for a while then
create a new one: https://github.com/gregs1104/peg

Last week's was that I confirmed that on a Mac using Homebrew for
package management, after "brew install postgresql" to get the
dependencies in, you can then use peg to setup a PostgreSQL in your home
directory for patch testing or development. Works fine out of the box,
you just won't have things like all the PLs installed.

Yes, I am aware I'm going at this bottom-up.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

#26Magnus Hagander
magnus@hagander.net
In reply to: Greg Smith (#25)
Re: automating CF submissions (was xlog location arithmetic)

On Sun, Jan 15, 2012 at 09:37, Greg Smith <greg@2ndquadrant.com> wrote:

On 01/15/2012 03:17 AM, Magnus Hagander wrote:

And FWIW, I'd find it a lot more useful for the CF app to have the
ability to post *reviews* in it, that would end up being properly
threaded.

Next you'll be saying we should have some sort of web application to help
with the whole review process, show the work on an integrated sort of Review
Board or something.  What crazy talk.

Well, it's early in the morning for being sunday, I blame that.

My contribution toward patch review ease for this week is that peg is quite
a bit smarter about referring to the correct part of the origin git repo
now, when you've been working on a branch for a while then create a new one:
 https://github.com/gregs1104/peg

Being able to refer to a git branch is one of those things that have
been on the todo list for the cf app since pgcon last year...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#27Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#26)
Re: automating CF submissions (was xlog location arithmetic)

On Sun, Jan 15, 2012 at 3:45 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Sun, Jan 15, 2012 at 09:37, Greg Smith <greg@2ndquadrant.com> wrote:

On 01/15/2012 03:17 AM, Magnus Hagander wrote:

And FWIW, I'd find it a lot more useful for the CF app to have the
ability to post *reviews* in it, that would end up being properly
threaded.

Next you'll be saying we should have some sort of web application to help
with the whole review process, show the work on an integrated sort of Review
Board or something.  What crazy talk.

Well, it's early in the morning for being sunday, I blame that.

My contribution toward patch review ease for this week is that peg is quite
a bit smarter about referring to the correct part of the origin git repo
now, when you've been working on a branch for a while then create a new one:
 https://github.com/gregs1104/peg

Being able to refer to a git branch is one of those things that have
been on the todo list for the cf app since pgcon last year...

Do we have an actual written TODO list for the cf app somewhere? If
not, I think creating one would be a good idea. I realize I've been
remiss in addressing some of the things people want, but the lack of
any centralized place where such items are collected doesn't make it
simpler.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#28Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#27)
Re: automating CF submissions (was xlog location arithmetic)

On Mon, Jan 16, 2012 at 18:57, Robert Haas <robertmhaas@gmail.com> wrote:

On Sun, Jan 15, 2012 at 3:45 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Sun, Jan 15, 2012 at 09:37, Greg Smith <greg@2ndquadrant.com> wrote:

On 01/15/2012 03:17 AM, Magnus Hagander wrote:

And FWIW, I'd find it a lot more useful for the CF app to have the
ability to post *reviews* in it, that would end up being properly
threaded.

Next you'll be saying we should have some sort of web application to help
with the whole review process, show the work on an integrated sort of Review
Board or something.  What crazy talk.

Well, it's early in the morning for being sunday, I blame that.

My contribution toward patch review ease for this week is that peg is quite
a bit smarter about referring to the correct part of the origin git repo
now, when you've been working on a branch for a while then create a new one:
 https://github.com/gregs1104/peg

Being able to refer to a git branch is one of those things that have
been on the todo list for the cf app since pgcon last year...

Do we have an actual written TODO list for the cf app somewhere?  If
not, I think creating one would be a good idea.  I realize I've been
remiss in addressing some of the things people want, but the lack of
any centralized place where such items are collected doesn't make it
simpler.

I don't think so - I've been keeping mine in your mailbox ;)

A simple wiki page is probably enough - going for an actual tracker or
anything seems vastly overkill...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#29Josh Berkus
josh@agliodbs.com
In reply to: Greg Smith (#23)
Re: automating CF submissions (was xlog location arithmetic)

On 1/14/12 8:44 PM, Greg Smith wrote:

Second, e-mail provides some level of validation that patches being
submitted are coming from the person they claim. We currently reject
patches that are only shared with the community on the web, via places
like github. The process around this mailing list tries to make it
clear sending patches to here is a code submission under the PostgreSQL
license. And e-mail nowadays keeps increasing the number of checks that
confirm it's coming from the person it claims sent it. I can go check
into the DKIM credentials your Gmail message to the list contained if
I'd like, to help confirm it really came from your account. E-mail
headers are certainly not perfectly traceable and audit-able, but they
are far better than what you'd get from a web submission. Little audit
trail there beyond "came from this IP address".

Putting submitters aside, I have to say based on teaching people how to
use the CF stuff on Thursday night that the process of submitting a
review of a patch is VERY unintuitive, or in the words of one reviewer
"astonishingly arcane". Summing up:

1. Log into CF. Claim the patch by editing it.

2. Write a review and email it to pgsql-hackers.

3. Dig the messageID out of your sent mail.

4. Add a comment to the patch, type "Review" with the messageID, and
ideally a short summary comment of the review.

5. Edit the patch to change its status as well as to remove yourself as
reviewer if you plan to do no further review.

There are so many things wrong with this workflow I wouldn't know where
to start. The end result, though, is that it strongly discourages the
occasional reviewer by making the review process cumbersome and confusing.

I'll also point out that the process for *applying* a patch, if you
don't subscribe to hackers and keep archives around on your personal
machine for months, is also very cumbersome and error-prone. Copy and
paste from a web page? Really?

Certainly we could spend the next 6 years incrementally improving the CF
app "in our spare time". But maybe it might be a better thing to look
at the code development tools which are already available?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#30Alvaro Herrera
alvherre@commandprompt.com
In reply to: Josh Berkus (#29)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Josh Berkus's message of lun ene 16 17:48:41 -0300 2012:

Putting submitters aside, I have to say based on teaching people how to
use the CF stuff on Thursday night that the process of submitting a
review of a patch is VERY unintuitive, or in the words of one reviewer
"astonishingly arcane". Summing up:

1. Log into CF. Claim the patch by editing it.

2. Write a review and email it to pgsql-hackers.

3. Dig the messageID out of your sent mail.

4. Add a comment to the patch, type "Review" with the messageID, and
ideally a short summary comment of the review.

5. Edit the patch to change its status as well as to remove yourself as
reviewer if you plan to do no further review.

There are so many things wrong with this workflow I wouldn't know where
to start.

Other than having to figure out Message-Ids, which most MUAs seem to
hide as much as possible, is there anything here of substance? I mean,
if getting a message-id from Gmail is all that complicated, please
complain to Google.

I mean, is email arcane? Surely not. Are summary lines arcane? Give
me a break. So the only real complain point here is message-id, which
normally people don't care about and don't even know they exist. So
they have to learn about it.

Let's keep in mind that pgsql-hackers email is our preferred form of
communication.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#31Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#30)
Re: automating CF submissions (was xlog location arithmetic)

I mean, is email arcane? Surely not. Are summary lines arcane? Give
me a break. So the only real complain point here is message-id, which
normally people don't care about and don't even know they exist. So
they have to learn about it.

The complaint is that the reviewer is expected to use two different and
wholly incompatible methods of communication, each of which requires a
separate registration, to post the review.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#32Jeff Janes
jeff.janes@gmail.com
In reply to: Alvaro Herrera (#30)
Re: automating CF submissions (was xlog location arithmetic)

On Mon, Jan 16, 2012 at 1:10 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Josh Berkus's message of lun ene 16 17:48:41 -0300 2012:

Putting submitters aside, I have to say based on teaching people how to
use the CF stuff on Thursday night that the process of submitting a
review of a patch is VERY unintuitive, or in the words of one reviewer
"astonishingly arcane".  Summing up:

1. Log into CF.  Claim the patch by editing it.

2. Write a review and email it to pgsql-hackers.

3. Dig the messageID out of your sent mail.

4. Add a comment to the patch, type "Review" with the messageID, and
ideally a short summary comment of the review.

5. Edit the patch to change its status as well as to remove yourself as
reviewer if you plan to do no further review.

There are so many things wrong with this workflow I wouldn't know where
to start.

Other than having to figure out Message-Ids, which most MUAs seem to
hide as much as possible, is there anything here of substance?

I find it an annoyance, but can't get too worked up over it.

I mean,
if getting a message-id from Gmail is all that complicated, please
complain to Google.

But after digging the message-id out of gmail and entering it into the
commitfest app, the resulting link is broken because the email has not
yet shown up in the archives. So now I have to wonder if I did
something wrong, and keep coming back every few hours to see if will
start working.

I mean, is email arcane?  Surely not.  Are summary lines arcane?

The way you have to set them is pretty arcane. Again, I can't get too
worked over it, but if it were made simpler I'd be happier.

Cheers,

Jeff

#33Alvaro Herrera
alvherre@commandprompt.com
In reply to: Jeff Janes (#32)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Jeff Janes's message of lun ene 16 18:37:59 -0300 2012:

I mean,
if getting a message-id from Gmail is all that complicated, please
complain to Google.

But after digging the message-id out of gmail and entering it into the
commitfest app, the resulting link is broken because the email has not
yet shown up in the archives. So now I have to wonder if I did
something wrong, and keep coming back every few hours to see if will
start working.

Hours? Unless a message is delayed for moderation, it should show up in
archives within tem minutes. If you have problems finding emails after
that period, by all means complain.

Now that we've moved archives to a new host, perhaps we could rerun the
archive script every two minutes instead of ten.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#34Greg Smith
greg@2ndQuadrant.com
In reply to: Josh Berkus (#29)
Re: automating CF submissions (was xlog location arithmetic)

On 01/16/2012 03:48 PM, Josh Berkus wrote:

3. Dig the messageID out of your sent mail.

4. Add a comment to the patch, type "Review" with the messageID, and
ideally a short summary comment of the review.

This is the time consuming part that would benefit the most from some
automation. The message-id digging is an obvious sore spot, which is
why I focused on improvements to eliminate so much of that first in my
suggestions. The problem is that we don't actually want every message
sent to the list on a thread to appear on the CF summary, and writing
that short summary content is an important step.

Archived messages deemed notable enough that someone linked the two are
the only ones that appear in the patch history. That makes it possible
to come up to speed on the most interesting history points of a patch in
a reasonable period of time--even if you missed the earlier discussion.
I think any of the other alternatives we might adopt would end up
associating all of the e-mail history around a patch. That's the
firehose, and spraying the CF app with it makes the whole thing a lot
less useful.

I don't think this is an unsolvable area to improve. It's been stuck
behind the major postgresql.org site overhaul, which is done now.
Adding some web service style APIs to probe the archives for message IDs
by a) ancestor and b) author would make it possible to sand off a whole
lot of rough edges here. While it's annoying in its current form, doing
all my work based on message IDs has been a huge improvement over the
old approach, where URLs into the archives were date based and not
always permanent.

I'll also point out that the process for *applying* a patch, if you
don't subscribe to hackers and keep archives around on your personal
machine for months, is also very cumbersome and error-prone. Copy and
paste from a web page? Really?

The most reasonable answer to this is for people to publish a git repo
URL in addition to the "official" submission of changes to the list in
patch form. And momentum toward doing that just keeps going up, even
among longer term contributors who weren't git advocates at all a year
during the transition. I nudged Simon that way and he's pushing
branches for major patches but not small ones yet, it looks like Andrew
fully embraced bitbucket recently, etc.

We're 16 months into git adoption. I'm pretty happy with how well
that's going. We don't need to add infrastructure to enable people to
push code to github and link to their branch comparison repo viewer as a
way to view the patch; that's already available to anyone who wants is.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

#35Alvaro Herrera
alvherre@commandprompt.com
In reply to: Greg Smith (#34)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Greg Smith's message of lun ene 16 19:25:50 -0300 2012:

On 01/16/2012 03:48 PM, Josh Berkus wrote:

I'll also point out that the process for *applying* a patch, if you
don't subscribe to hackers and keep archives around on your personal
machine for months, is also very cumbersome and error-prone. Copy and
paste from a web page? Really?

The most reasonable answer to this is for people to publish a git repo
URL in addition to the "official" submission of changes to the list in
patch form.

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#36Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Smith (#34)
Re: automating CF submissions (was xlog location arithmetic)

On 01/16/2012 05:25 PM, Greg Smith wrote:

The most reasonable answer to this is for people to publish a git repo
URL in addition to the "official" submission of changes to the list in
patch form. And momentum toward doing that just keeps going up, even
among longer term contributors who weren't git advocates at all a year
during the transition. I nudged Simon that way and he's pushing
branches for major patches but not small ones yet, it looks like
Andrew fully embraced bitbucket recently, etc.

If we're going to do that, the refspec to be pulled needs to be a tag, I
think, not just a branch, and people would have to get into the habit of
tagging commits and explicitly pushing tags.

I probably should be doing that, and it is now built into the buildfarm
client release mechanism, but I usually don't when just publishing dev
work. Guess I need to start. I'll probably use tag names like
branch-YYYYMMDDHHMM.

I certainly like the idea of just being able to pull in a tag from a
remote instead of applying a patch.

(BTW, I use both bitbucket and github. They both have advantages.)

cheers

andrew

#37Peter Geoghegan
peter@2ndquadrant.com
In reply to: Magnus Hagander (#9)
Re: xlog location arithmetic

On 20 December 2011 10:27, Magnus Hagander <magnus@hagander.net> wrote:

Doing it in numeric should be perfectly fine. The only real reason to
pick int8 over in this context would be performance, but it's not like
this is something that's going to be called in really performance
critical paths...

FYI, my group commit patch has a little macro, in the spirit of
XLByteAdvance, to get the delta between two LSNs in bytes as an
uint64.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#38Matteo Beccati
php@beccati.com
In reply to: Alvaro Herrera (#35)
Re: automating CF submissions (was xlog location arithmetic)

On 16/01/2012 23:40, Alvaro Herrera wrote:

Excerpts from Greg Smith's message of lun ene 16 19:25:50 -0300 2012:

On 01/16/2012 03:48 PM, Josh Berkus wrote:

I'll also point out that the process for *applying* a patch, if you
don't subscribe to hackers and keep archives around on your personal
machine for months, is also very cumbersome and error-prone. Copy and
paste from a web page? Really?

The most reasonable answer to this is for people to publish a git repo
URL in addition to the "official" submission of changes to the list in
patch form.

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

My proof of concept archive for the hackers ML site is still online, in
case anyone has trouble downloading the patches or just wants to have
the full thread handy.

All you need to do is to swap postgresql.org with beccati.org in the
"message-id" link:

http://archives.postgresql.org/message-id/1320343602-sup-2290@alvh.no-ip.org

->

http://archives.beccati.org/message-id/1320343602-sup-2290@alvh.no-ip.org

Cheers
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/

#39Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#35)
Re: automating CF submissions (was xlog location arithmetic)

On 01/16/2012 05:40 PM, Alvaro Herrera wrote:

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

Are you really sure about that? My recent JSON patch is at
<http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net&gt;.
I don't see any download link for the patch there, yet my mailer set the
attachment type to text/x-patch, not text/plain.

cheers

andrew

#40Peter Eisentraut
peter_e@gmx.net
In reply to: Greg Smith (#34)
Re: automating CF submissions (was xlog location arithmetic)

On mån, 2012-01-16 at 17:25 -0500, Greg Smith wrote:

The most reasonable answer to this is for people to publish a git repo
URL in addition to the "official" submission of changes to the list in
patch form.

Note that the original complaint was that for the occasional reviewer,
the current system takes at least 5 partially redundant steps in two
different systems. I doubt that adding a third system and more
partially redundant steps it going to help that.

I don't have anything against the general idea, but it won't address the
original point.

#41Alvaro Herrera
alvherre@commandprompt.com
In reply to: Matteo Beccati (#38)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Matteo Beccati's message of mar ene 17 12:33:27 -0300 2012:

On 16/01/2012 23:40, Alvaro Herrera wrote:

Excerpts from Greg Smith's message of lun ene 16 19:25:50 -0300 2012:

On 01/16/2012 03:48 PM, Josh Berkus wrote:

I'll also point out that the process for *applying* a patch, if you
don't subscribe to hackers and keep archives around on your personal
machine for months, is also very cumbersome and error-prone. Copy and
paste from a web page? Really?

The most reasonable answer to this is for people to publish a git repo
URL in addition to the "official" submission of changes to the list in
patch form.

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

My proof of concept archive for the hackers ML site is still online, in
case anyone has trouble downloading the patches or just wants to have
the full thread handy.

I was going to ping you about this, because I tried it when I wrote this
message and it got stuck waiting for response.

Now that we've migrated the website, it's time to get back to our
conversations about migrating archives to your stuff too. How confident
with Django are you?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#42Matteo Beccati
php@beccati.com
In reply to: Alvaro Herrera (#41)
Re: automating CF submissions (was xlog location arithmetic)

On 17/01/2012 17:50, Alvaro Herrera wrote:

Excerpts from Matteo Beccati's message of mar ene 17 12:33:27 -0300 2012:

My proof of concept archive for the hackers ML site is still online, in
case anyone has trouble downloading the patches or just wants to have
the full thread handy.

I was going to ping you about this, because I tried it when I wrote this
message and it got stuck waiting for response.

Hmm, works for me, e.g. the recently cited message:

http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net

Now that we've migrated the website, it's time to get back to our
conversations about migrating archives to your stuff too. How confident
with Django are you?

I've never wrote a line of Python in my life, so someone else should
work on porting the web part, I'm afraid...

Cheers
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/

#43Matteo Beccati
php@beccati.com
In reply to: Matteo Beccati (#42)
Re: automating CF submissions (was xlog location arithmetic)

On 17/01/2012 18:10, Matteo Beccati wrote:

On 17/01/2012 17:50, Alvaro Herrera wrote:

Excerpts from Matteo Beccati's message of mar ene 17 12:33:27 -0300 2012:

My proof of concept archive for the hackers ML site is still online, in
case anyone has trouble downloading the patches or just wants to have
the full thread handy.

I was going to ping you about this, because I tried it when I wrote this
message and it got stuck waiting for response.

Hmm, works for me, e.g. the recently cited message:

http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net

Erm... I meant

http://archives.beccati.org/message-id/4F12F9E5.3090801@dunslane.net

which redirects to:

http://archives.beccati.org/pgsql-hackers/message/305925

for me.

Cheers
--
Matteo Beccati

Development & Consulting - http://www.beccati.com/

#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#39)
Re: automating CF submissions (was xlog location arithmetic)

Andrew Dunstan <andrew@dunslane.net> writes:

On 01/16/2012 05:40 PM, Alvaro Herrera wrote:

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

Are you really sure about that? My recent JSON patch is at
<http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net&gt;.
I don't see any download link for the patch there, yet my mailer set the
attachment type to text/x-patch, not text/plain.

Yeah, AFAICT the archives treat text/x-patch the same as text/plain.
I tend to send stuff that way if I mean it primarily to be read in the
email. If I'm thinking people will download and apply it, it's better
to gzip the patch and pick a mime type appropriate to that, because that
makes it much easier to pull the patch off the archives at need, at the
cost that you can't just eyeball it in your mail reader.

Anyway, I agree with the general tenor of this thread that it'd be nice
to reduce the impedance mismatches a bit. Don't have any great ideas
about specific ways to do that.

regards, tom lane

#45Greg Smith
greg@2ndQuadrant.com
In reply to: Peter Eisentraut (#40)
Re: automating CF submissions (was xlog location arithmetic)

On 01/17/2012 11:50 AM, Peter Eisentraut wrote:

On mån, 2012-01-16 at 17:25 -0500, Greg Smith wrote:

The most reasonable answer to this is for people to publish a git repo
URL in addition to the "official" submission of changes to the list in
patch form.

Note that the original complaint was that for the occasional reviewer,
the current system takes at least 5 partially redundant steps in two
different systems. I doubt that adding a third system and more
partially redundant steps it going to help that.

Publishing the submission via git is an extra step for the patch
submitter. If that happens, the reviewer can test just be cloning that,
instead of first closing the PostgreSQL one then applying the patch. It
removes the "how do I fish the patch out of the archives?" problem from
the reviewer's side of things.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

#46Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#44)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Tom Lane's message of mar ene 17 14:24:05 -0300 2012:

Andrew Dunstan <andrew@dunslane.net> writes:

On 01/16/2012 05:40 PM, Alvaro Herrera wrote:

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

Are you really sure about that? My recent JSON patch is at
<http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net&gt;.
I don't see any download link for the patch there, yet my mailer set the
attachment type to text/x-patch, not text/plain.

Yeah, AFAICT the archives treat text/x-patch the same as text/plain.

Right, maybe it's text/* or something like that.

I tend to send stuff that way if I mean it primarily to be read in the
email. If I'm thinking people will download and apply it, it's better
to gzip the patch and pick a mime type appropriate to that, because that
makes it much easier to pull the patch off the archives at need, at the
cost that you can't just eyeball it in your mail reader.

Maybe we could find a way to convince Mhonarc to present links to
download all mime parts separately, not only those that are
undisplayable.

Anyway, I agree with the general tenor of this thread that it'd be nice
to reduce the impedance mismatches a bit. Don't have any great ideas
about specific ways to do that.

I'm hopeful that the migration to the Archivopteryx stuff by Matteo will
improve things a bit.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#47Alvaro Herrera
alvherre@commandprompt.com
In reply to: Andrew Dunstan (#39)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Andrew Dunstan's message of mar ene 17 13:50:20 -0300 2012:

On 01/16/2012 05:40 PM, Alvaro Herrera wrote:

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

Are you really sure about that? My recent JSON patch is at
<http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net&gt;.
I don't see any download link for the patch there, yet my mailer set the
attachment type to text/x-patch, not text/plain.

I tweaked the Mhonarc config and now this attachment (as well as many
others) is shown as a downloadable link. Please give it a look.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#48Alvaro Herrera
alvherre@commandprompt.com
In reply to: Alvaro Herrera (#47)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Alvaro Herrera's message of mar ene 17 22:23:13 -0300 2012:

Excerpts from Andrew Dunstan's message of mar ene 17 13:50:20 -0300 2012:

On 01/16/2012 05:40 PM, Alvaro Herrera wrote:

It's expected that we'll get a more reasonable interface to attachments,
one that will allow you to download patches separately. (Currently,
attachments that have mime types other than text/plain are already
downloadable separately).

Are you really sure about that? My recent JSON patch is at
<http://archives.postgresql.org/message-id/4F12F9E5.3090801@dunslane.net&gt;.
I don't see any download link for the patch there, yet my mailer set the
attachment type to text/x-patch, not text/plain.

I tweaked the Mhonarc config and now this attachment (as well as many
others) is shown as a downloadable link. Please give it a look.

Hm, I notice it works almost every patch I've checked, except the ones
from Tom such as this one:
http://archives.postgresql.org/message-id/4643.1326776814@sss.pgh.pa.us

The problem is that this one doesn't have the
Content-Disposition: attachment
line in the MIME header. I don't know what we can do about it.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#49Alex Shulgin
ash@commandprompt.com
In reply to: Greg Smith (#23)
Re: automating CF submissions (was xlog location arithmetic)

Greg Smith <greg@2ndQuadrant.com> writes:

One unicorn I would like to have here would give the CF app a database
of recent e-mails to pgsql-hackers. I login to the CF app, click on
"Add recent submission", and anything matching my e-mail address
appears with a checkbox next to it. Click on the patch submissions,
and then something like you described would happen. That would save
me the annoying work around looking up message IDs so much.

Another idea: introduce some simple tag system in mails sent to -hackers
to be treated specially, e.g:

@fest add-to-current

to add new patch to the commit fest currently in progress, or

@fest add-to-next

to add it to the next scheduled fest.

Attribute your mail with

@fest comment COMMENT TEXT

or

@fest comment <<EOF
...
EOF

to add a (long) comment, ditto for "patch" and "review".

How does that sound?

--
Alex

#50Andrew Dunstan
andrew@dunslane.net
In reply to: Euler Taveira de Oliveira (#1)
Re: automating CF submissions (was xlog location arithmetic)

On 01/19/2012 12:59 PM, Alex Shulgin wrote:

Greg Smith<greg@2ndQuadrant.com> writes:

One unicorn I would like to have here would give the CF app a database
of recent e-mails to pgsql-hackers. I login to the CF app, click on
"Add recent submission", and anything matching my e-mail address
appears with a checkbox next to it. Click on the patch submissions,
and then something like you described would happen. That would save
me the annoying work around looking up message IDs so much.

Another idea: introduce some simple tag system in mails sent to -hackers
to be treated specially, e.g:

@fest add-to-current

to add new patch to the commit fest currently in progress, or

@fest add-to-next

to add it to the next scheduled fest.

Attribute your mail with

@fest comment COMMENT TEXT

or

@fest comment<<EOF
...
EOF

to add a (long) comment, ditto for "patch" and "review".

How does that sound?

Like a recipe for something that requires constant fixups, to be honest.

Seriously, adding something to the CF isn't *that* hard. I like Greg's
idea of a list of recent emails that you could choose from.

cheers

andrew

#51Alex Shulgin
ash@commandprompt.com
In reply to: Andrew Dunstan (#50)
Re: automating CF submissions (was xlog location arithmetic)

Andrew Dunstan <andrew@dunslane.net> writes:

On 01/19/2012 12:59 PM, Alex Shulgin wrote:

Another idea: introduce some simple tag system in mails sent to -hackers
to be treated specially, e.g:

@fest add-to-current

to add new patch to the commit fest currently in progress, or

@fest add-to-next

to add it to the next scheduled fest.

Attribute your mail with

@fest comment COMMENT TEXT

or

@fest comment<<EOF
...
EOF

to add a (long) comment, ditto for "patch" and "review".

How does that sound?

Like a recipe for something that requires constant fixups, to be honest.

Seriously, adding something to the CF isn't *that* hard. I like Greg's
idea of a list of recent emails that you could choose from.

I've just added a comment about a patch and it took me to:

a. Login to commitfest app
b. Locate the patch and review I was replying to
c. Fetch archives thread index, refresh the index page for ~10 minutes
to see my reply appear
d. Copy message id and finally register comment in the commitfest app

(IIRC, something close to that was already described in this thread)

With the proposed approach it would only take me to include

@fest comment "Patch applies cleanly"

and possibly

@fest status Needs Review

to update the patch status and that'd be it.

--
Alex

PS: yes, I could just copy message id from the sent mail in my MUA, but
I like to make sure links I post aren't broke, so still I'll need to
wait until archives catches up to double check.

#52Alvaro Herrera
alvherre@commandprompt.com
In reply to: Alex Shulgin (#51)
Re: automating CF submissions (was xlog location arithmetic)

Excerpts from Alex Shulgin's message of jue ene 19 15:41:54 -0300 2012:

PS: yes, I could just copy message id from the sent mail in my MUA, but
I like to make sure links I post aren't broke, so still I'll need to
wait until archives catches up to double check.

I find this a bad excuse. If you're a pgsql-hackers regular, then you
already know your posts are going to show up with the correct
message-id. The links might be broken for the next 10 minutes, but
links that stay broken for a longer period than that should be rare.
Surely you don't change your MUA once a month or anything.

I know I don't waste time waiting for my posts to show up in the
archives before adding links to the CF app.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#53Andrew Dunstan
andrew@dunslane.net
In reply to: Euler Taveira de Oliveira (#1)
Re: automating CF submissions (was xlog location arithmetic)

On 01/19/2012 01:41 PM, Alex Shulgin wrote:

With the proposed approach it would only take me to include

@fest comment "Patch applies cleanly"

and possibly

@fest status Needs Review

to update the patch status and that'd be it.

It will be easy if you get it right. My point was that it's way too easy
to get it wrong.

cheers

andrew

In reply to: Tom Lane (#12)
2 attachment(s)
Re: xlog location arithmetic

On 23-12-2011 12:05, Tom Lane wrote:

I too think a datatype is overkill, if we're only planning on providing
one function. Just emit the values as numeric and have done.

Here it is. Output changed to numeric. While the output was int8 I could use
pg_size_pretty but now I couldn't. I attached another patch that implements
pg_size_pretty(numeric).

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Attachments:

xlogdiff.patchtext/x-patch; name=xlogdiff.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 48631cc..04bc24d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14378,6 +14378,9 @@ SELECT set_config('log_statement_stats', 'off', false);
    <indexterm>
     <primary>pg_xlogfile_name_offset</primary>
    </indexterm>
+   <indexterm>
+    <primary>pg_xlog_location_diff</primary>
+   </indexterm>
 
    <para>
     The functions shown in <xref
@@ -14450,6 +14453,13 @@ SELECT set_config('log_statement_stats', 'off', false);
        <entry><type>text</>, <type>integer</></entry>
        <entry>Convert transaction log location string to file name and decimal byte offset within file</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>pg_xlog_location_diff(<parameter>location</> <type>text</>, <parameter>location</> <type>text</>)</function></literal>
+        </entry>
+       <entry><type>numeric</></entry>
+       <entry>Calculate the difference between two transaction log locations</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -14543,6 +14553,13 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
    </para>
 
    <para>
+	<function>pg_xlog_location_diff</> calculates the difference in bytes
+	between two transaction log locations. It can be used with
+	<structname>pg_stat_replication</structname> or some functions shown in
+	<xref linkend="functions-admin-backup-table"> to get the replication lag.
+   </para>
+
+   <para>
     For details about proper usage of these functions, see
     <xref linkend="continuous-archiving">.
    </para>
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2e10d4d..e03c5e8 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "replication/walreceiver.h"
 #include "storage/smgr.h"
 #include "utils/builtins.h"
+#include "utils/numeric.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -465,3 +466,84 @@ pg_is_in_recovery(PG_FUNCTION_ARGS)
 {
 	PG_RETURN_BOOL(RecoveryInProgress());
 }
+
+static void
+validate_xlog_location(char *str)
+{
+#define	MAXLSNCOMPONENT		8
+
+	int	len1, len2;
+
+	len1 = strspn(str, "0123456789abcdefABCDEF");
+	if (len1 < 1 || len1 > MAXLSNCOMPONENT || str[len1] != '/')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+			 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+	len2 = strspn(str + len1 + 1, "0123456789abcdefABCDEF");
+	if (len2 < 1 || len2 > MAXLSNCOMPONENT || str[len1 + 1 + len2] != '\0')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+			 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+}
+
+/*
+ * Compute the difference in bytes between two WAL locations.
+ */
+Datum
+pg_xlog_location_diff(PG_FUNCTION_ARGS)
+{
+	text	*location1 = PG_GETARG_TEXT_P(0);
+	text	*location2 = PG_GETARG_TEXT_P(1);
+	char	*str1, *str2;
+	uint64	xlogid1, xrecoff1;
+	uint64	xlogid2, xrecoff2;
+	Numeric	result;
+
+	/*
+	 * Read and parse input
+	 */
+	str1 = text_to_cstring(location1);
+	str2 = text_to_cstring(location2);
+
+	validate_xlog_location(str1);
+	validate_xlog_location(str2);
+
+	if (sscanf(str1, "%8lX/%8lX", &xlogid1, &xrecoff1) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str1)));
+	if (sscanf(str2, "%8lX/%8lX", &xlogid2, &xrecoff2) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str2)));
+
+	/*
+	 * Sanity check
+	 */
+	if (xrecoff1 > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%lX\" is out of valid range, 0..%X", xrecoff1, XLogFileSize)));
+	if (xrecoff2 > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%lX\" is out of valid range, 0..%X", xrecoff2, XLogFileSize)));
+
+	/*
+	 * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2
+	 */
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+							DirectFunctionCall1(int8_numeric, Int64GetDatum(xlogid1)),
+							DirectFunctionCall1(int8_numeric, Int64GetDatum(xlogid2))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+							DirectFunctionCall1(int8_numeric, Int64GetDatum(XLogFileSize)),
+							NumericGetDatum(result)));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_add,
+							NumericGetDatum(result),
+							DirectFunctionCall1(int8_numeric, Int64GetDatum(xrecoff1))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+							NumericGetDatum(result),
+							DirectFunctionCall1(int8_numeric, Int64GetDatum(xrecoff2))));
+
+	PG_RETURN_NUMERIC(result);
+}
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index db6380f..fa45aa1 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -281,5 +281,6 @@ extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
 extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
 
 #endif   /* XLOG_INTERNAL_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b6ac195..9940658 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2902,6 +2902,8 @@ DATA(insert OID = 2850 ( pg_xlogfile_name_offset	PGNSP PGUID 12 1 0 0 0 f f f t
 DESCR("xlog filename and byte offset, given an xlog location");
 DATA(insert OID = 2851 ( pg_xlogfile_name			PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ pg_xlogfile_name _null_ _null_ _null_ ));
 DESCR("xlog filename, given an xlog location");
+DATA(insert OID = 3150 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 1700 "25 25" _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
+DESCR("difference in bytes, given two xlog locations");
 
 DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
sizepretty.patchtext/x-patch; name=sizepretty.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 48631cc..eff2b0e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14881,6 +14881,13 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
       </row>
       <row>
        <entry>
+        <literal><function>pg_size_pretty(<type>numeric</type>)</function></literal>
+        </entry>
+       <entry><type>text</type></entry>
+       <entry>Converts a size in bytes into a human-readable format with size units</entry>
+      </row>
+      <row>
+       <entry>
         <literal><function>pg_table_size(<type>regclass</type>)</function></literal>
         </entry>
        <entry><type>bigint</type></entry>
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 26a8c01..494ce27 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -24,6 +24,7 @@
 #include "storage/fd.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/numeric.h"
 #include "utils/rel.h"
 #include "utils/relmapper.h"
 #include "utils/syscache.h"
@@ -550,6 +551,93 @@ pg_size_pretty(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(buf));
 }
 
+Datum
+pg_size_pretty_num(PG_FUNCTION_ARGS)
+{
+	Numeric		size = PG_GETARG_NUMERIC(0);
+	Numeric		limit, limit2;
+
+	char		*buf, *result;
+
+	limit = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) (10 * 1024))));
+	limit2 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) (10 * 1024 * 2 - 1))));
+
+	if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit))))
+	{
+		buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+		result = strcat(buf, " bytes");
+	}
+	else
+	{
+		Numeric		arg2;
+
+		/* keep one extra bit for rounding */
+		/* size >>= 9 */
+		arg2 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) pow(2, 9))));
+		size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg2)));
+
+		if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+		{
+			/* size = (size + 1) / 2 */
+			size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+													DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+			size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+													DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+			buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+			result = strcat(buf, " kB");
+		}
+		else
+		{
+			Numeric		arg3;
+
+			/* size >>= 10 */
+			arg3 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) pow(2, 10))));
+			size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+
+			if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+			{
+				/* size = (size + 1) / 2 */
+				size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+														DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+				size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+														DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+				buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+				result = strcat(buf, " MB");
+			}
+			else
+			{
+				/* size >>= 10 */
+				size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+
+				if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+				{
+					/* size = (size + 1) / 2 */
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+					buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+					result = strcat(buf, " GB");
+				}
+				else
+				{
+					/* size = (size + 1) / 2 */
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+
+					buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+					result = strcat(buf, " TB");
+				}
+			}
+		}
+	}
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
 /*
  * Get the filenode of a relation
  *
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index b6ac195..0d8d486 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3364,6 +3364,7 @@ DESCR("disk space usage for the specified fork of a table or index");
 DATA(insert OID = 2286 ( pg_total_relation_size PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 20 "2205" _null_ _null_ _null_ _null_ pg_total_relation_size _null_ _null_ _null_ ));
 DESCR("total disk space usage for the specified table and associated indexes");
 DATA(insert OID = 2288 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 25 "20" _null_ _null_ _null_ _null_ pg_size_pretty _null_ _null_ _null_ ));
+DATA(insert OID = 3151 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 25 "1700" _null_ _null_ _null_ _null_ pg_size_pretty_num _null_ _null_ _null_ ));
 DESCR("convert a long int to a human readable text using size units");
 DATA(insert OID = 2997 ( pg_table_size			PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 20 "2205" _null_ _null_ _null_ _null_ pg_table_size _null_ _null_ _null_ ));
 DESCR("disk space usage for the specified table, including TOAST, free space and visibility map");
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 68179d5..3d07f29 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -453,6 +453,7 @@ extern Datum pg_database_size_name(PG_FUNCTION_ARGS);
 extern Datum pg_relation_size(PG_FUNCTION_ARGS);
 extern Datum pg_total_relation_size(PG_FUNCTION_ARGS);
 extern Datum pg_size_pretty(PG_FUNCTION_ARGS);
+extern Datum pg_size_pretty_num(PG_FUNCTION_ARGS);
 extern Datum pg_table_size(PG_FUNCTION_ARGS);
 extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
 extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);
#55Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Alvaro Herrera (#48)
Re: automating CF submissions (was xlog location arithmetic)

Alvaro Herrera <alvherre@commandprompt.com> writes:

The problem is that this one doesn't have the
Content-Disposition: attachment
line in the MIME header. I don't know what we can do about it.

It's sent with an inline attachment AFAICT, some MA will make it easy
to process the attachment and some others will just make the content
appear within the mail. It seems the vast majority falls into the
unhelpful second category.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#56Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Alex Shulgin (#49)
Re: automating CF submissions (was xlog location arithmetic)

Alex Shulgin <ash@commandprompt.com> writes:

Another idea: introduce some simple tag system in mails sent to -hackers
to be treated specially, e.g:

[...]

How does that sound?

Very much like what debbugs does already.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#57Fujii Masao
masao.fujii@gmail.com
In reply to: Euler Taveira de Oliveira (#54)
Re: xlog location arithmetic

On Sun, Jan 22, 2012 at 1:13 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 23-12-2011 12:05, Tom Lane wrote:

I too think a datatype is overkill, if we're only planning on providing
one function.  Just emit the values as numeric and have done.

Here it is. Output changed to numeric.

Thanks!

When I compiled the source with xlogdiff.patch, I got the following warnings.

xlogfuncs.c:511:2: warning: format '%lX' expects argument of type
'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat]
xlogfuncs.c:511:2: warning: format '%lX' expects argument of type
'long unsigned int *', but argument 4 has type 'uint64 *' [-Wformat]
xlogfuncs.c:515:2: warning: format '%lX' expects argument of type
'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat]
xlogfuncs.c:515:2: warning: format '%lX' expects argument of type
'long unsigned int *', but argument 4 has type 'uint64 *' [-Wformat]
xlogfuncs.c:524:3: warning: format '%lX' expects argument of type
'long unsigned int', but argument 2 has type 'uint64' [-Wformat]
xlogfuncs.c:528:3: warning: format '%lX' expects argument of type
'long unsigned int', but argument 2 has type 'uint64' [-Wformat]

When I tested the patch, I got the following error:

postgres=# SELECT pg_current_xlog_location();
pg_current_xlog_location
--------------------------
0/2000074
(1 row)

postgres=# SELECT pg_xlog_location_diff('0/2000074', '0/2000074');
ERROR: xrecoff "2000074" is out of valid range, 0..A4A534C

In func.sgml
<para>
The functions shown in <xref
linkend="functions-admin-backup-table"> assist in making on-line backups.
These functions cannot be executed during recovery.
</para>

Since pg_xlog_location_diff() can be executed during recovery,
the above needs to be updated.

While the output was int8 I could use
pg_size_pretty but now I couldn't. I attached another patch that implements
pg_size_pretty(numeric).

I agree it's necessary.

* Note: every entry in pg_proc.h is expected to have a DESCR() comment,
* except for functions that implement pg_operator.h operators and don't
* have a good reason to be called directly rather than via the operator.

According to the above source code comment in pg_proc.h, ISTM
pg_size_pretty() for numeric also needs to have its own DESCR().

+			buf = DatumGetCString(DirectFunctionCall1(numeric_out,
NumericGetDatum(size)));
+			result = strcat(buf, " kB");

According to "man strcat", the dest string must have enough space for
the result.
"buf" has enough space?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In reply to: Fujii Masao (#57)
2 attachment(s)
Re: xlog location arithmetic

On 26-01-2012 06:19, Fujii Masao wrote:

Thanks for your review. Comments below.

When I compiled the source with xlogdiff.patch, I got the following warnings.

xlogfuncs.c:511:2: warning: format '%lX' expects argument of type
'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat]

What is your compiler? I'm using gcc 4.6.2. I refactored the patch so I'm now
using XLogRecPtr and %X.

postgres=# SELECT pg_xlog_location_diff('0/2000074', '0/2000074');
ERROR: xrecoff "2000074" is out of valid range, 0..A4A534C

Ugh? I can't reproduce that. It seems to be related to long int used by the
prior version.

Since pg_xlog_location_diff() can be executed during recovery,
the above needs to be updated.

Fixed.

While the output was int8 I could use
pg_size_pretty but now I couldn't. I attached another patch that implements
pg_size_pretty(numeric).

I realized that it collides with the pg_size_pretty(int8) if we don't specify
a type. Hence, I decided to drop the pg_size_pretty(int8) in favor of
pg_size_pretty(numeric). It is slower than the former but it is not a
performance critical function.

According to the above source code comment in pg_proc.h, ISTM
pg_size_pretty() for numeric also needs to have its own DESCR().

Fixed.

According to "man strcat", the dest string must have enough space for
the result.
"buf" has enough space?

Ops. Fixed.

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Attachments:

sizepretty2.difftext/x-patch; name=sizepretty2.diffDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 236a60a..511a918 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14942,7 +14942,7 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
       </row>
       <row>
        <entry>
-        <literal><function>pg_size_pretty(<type>bigint</type>)</function></literal>
+        <literal><function>pg_size_pretty(<type>numeric</type>)</function></literal>
         </entry>
        <entry><type>text</type></entry>
        <entry>Converts a size in bytes into a human-readable format with size units</entry>
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 26a8c01..d4a142b 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -24,6 +24,7 @@
 #include "storage/fd.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/numeric.h"
 #include "utils/rel.h"
 #include "utils/relmapper.h"
 #include "utils/syscache.h"
@@ -506,48 +507,101 @@ pg_total_relation_size(PG_FUNCTION_ARGS)
 	PG_RETURN_INT64(size);
 }
 
-/*
- * formatting with size units
- */
 Datum
 pg_size_pretty(PG_FUNCTION_ARGS)
 {
-	int64		size = PG_GETARG_INT64(0);
-	char		buf[64];
-	int64		limit = 10 * 1024;
-	int64		limit2 = limit * 2 - 1;
+	Numeric		size = PG_GETARG_NUMERIC(0);
+	Numeric		limit, limit2;
+
+	char		*buf, *result;
 
-	if (size < limit)
-		snprintf(buf, sizeof(buf), INT64_FORMAT " bytes", size);
+	limit = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) (10 * 1024))));
+	limit2 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) (10 * 1024 * 2 - 1))));
+
+	if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit))))
+	{
+		buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+		result = palloc(strlen(buf) + 7);
+		strcpy(result, buf);
+		strcat(result, " bytes");
+	}
 	else
 	{
-		size >>= 9;				/* keep one extra bit for rounding */
-		if (size < limit2)
-			snprintf(buf, sizeof(buf), INT64_FORMAT " kB",
-					 (size + 1) / 2);
+		Numeric		arg2;
+
+		/* keep one extra bit for rounding */
+		/* size >>= 9 */
+		arg2 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) pow(2, 9))));
+		size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg2)));
+
+		if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+		{
+			/* size = (size + 1) / 2 */
+			size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+													DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+			size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+													DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+			buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+			result = palloc(strlen(buf) + 4);
+			strcpy(result, buf);
+			strcat(result, " kB");
+		}
 		else
 		{
-			size >>= 10;
-			if (size < limit2)
-				snprintf(buf, sizeof(buf), INT64_FORMAT " MB",
-						 (size + 1) / 2);
+			Numeric		arg3;
+
+			/* size >>= 10 */
+			arg3 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) pow(2, 10))));
+			size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+
+			if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+			{
+				/* size = (size + 1) / 2 */
+				size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+														DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+				size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+														DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+				buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+				result = palloc(strlen(buf) + 4);
+				strcpy(result, buf);
+				strcat(result, " MB");
+			}
 			else
 			{
-				size >>= 10;
-				if (size < limit2)
-					snprintf(buf, sizeof(buf), INT64_FORMAT " GB",
-							 (size + 1) / 2);
+				/* size >>= 10 */
+				size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+
+				if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+				{
+					/* size = (size + 1) / 2 */
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+					buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+					result = palloc(strlen(buf) + 4);
+					strcpy(result, buf);
+					strcat(result, " GB");
+				}
 				else
 				{
-					size >>= 10;
-					snprintf(buf, sizeof(buf), INT64_FORMAT " TB",
-							 (size + 1) / 2);
+					/* size = (size + 1) / 2 */
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), 
+															DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+
+					buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+					result = palloc(strlen(buf) + 4);
+					strcpy(result, buf);
+					strcat(result, " TB");
 				}
 			}
 		}
 	}
 
-	PG_RETURN_TEXT_P(cstring_to_text(buf));
+	PG_RETURN_TEXT_P(cstring_to_text(result));
 }
 
 /*
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 8fc4ddb..d150584 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3376,8 +3376,8 @@ DATA(insert OID = 2332 ( pg_relation_size		PGNSP PGUID 12 1 0 0 0 f f f t f v 2
 DESCR("disk space usage for the specified fork of a table or index");
 DATA(insert OID = 2286 ( pg_total_relation_size PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 20 "2205" _null_ _null_ _null_ _null_ pg_total_relation_size _null_ _null_ _null_ ));
 DESCR("total disk space usage for the specified table and associated indexes");
-DATA(insert OID = 2288 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 25 "20" _null_ _null_ _null_ _null_ pg_size_pretty _null_ _null_ _null_ ));
-DESCR("convert a long int to a human readable text using size units");
+DATA(insert OID = 3158 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 25 "1700" _null_ _null_ _null_ _null_ pg_size_pretty _null_ _null_ _null_ ));
+DESCR("convert a numeric to a human readable text using size units");
 DATA(insert OID = 2997 ( pg_table_size			PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 20 "2205" _null_ _null_ _null_ _null_ pg_table_size _null_ _null_ _null_ ));
 DESCR("disk space usage for the specified table, including TOAST, free space and visibility map");
 DATA(insert OID = 2998 ( pg_indexes_size		PGNSP PGUID 12 1 0 0 0 f f f t f v 1 0 20 "2205" _null_ _null_ _null_ _null_ pg_indexes_size _null_ _null_ _null_ ));
xlogdiff2.difftext/x-patch; name=xlogdiff2.diffDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 236a60a..826f002 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14446,11 +14446,15 @@ SELECT set_config('log_statement_stats', 'off', false);
    <indexterm>
     <primary>pg_xlogfile_name_offset</primary>
    </indexterm>
+   <indexterm>
+    <primary>pg_xlog_location_diff</primary>
+   </indexterm>
 
    <para>
     The functions shown in <xref
     linkend="functions-admin-backup-table"> assist in making on-line backups.
-    These functions cannot be executed during recovery.
+	These functions cannot be executed during recovery (except
+	<function>pg_xlog_location_diff</function>).
    </para>
 
    <table id="functions-admin-backup-table">
@@ -14518,6 +14522,13 @@ SELECT set_config('log_statement_stats', 'off', false);
        <entry><type>text</>, <type>integer</></entry>
        <entry>Convert transaction log location string to file name and decimal byte offset within file</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>pg_xlog_location_diff(<parameter>location</> <type>text</>, <parameter>location</> <type>text</>)</function></literal>
+        </entry>
+       <entry><type>numeric</></entry>
+       <entry>Calculate the difference between two transaction log locations</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -14611,6 +14622,13 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
    </para>
 
    <para>
+	<function>pg_xlog_location_diff</> calculates the difference in bytes
+	between two transaction log locations. It can be used with
+	<structname>pg_stat_replication</structname> or some functions shown in
+	<xref linkend="functions-admin-backup-table"> to get the replication lag.
+   </para>
+
+   <para>
     For details about proper usage of these functions, see
     <xref linkend="continuous-archiving">.
    </para>
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2e10d4d..2a6bbcf 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "replication/walreceiver.h"
 #include "storage/smgr.h"
 #include "utils/builtins.h"
+#include "utils/numeric.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -465,3 +466,83 @@ pg_is_in_recovery(PG_FUNCTION_ARGS)
 {
 	PG_RETURN_BOOL(RecoveryInProgress());
 }
+
+static void
+validate_xlog_location(char *str)
+{
+#define	MAXLSNCOMPONENT		8
+
+	int	len1, len2;
+
+	len1 = strspn(str, "0123456789abcdefABCDEF");
+	if (len1 < 1 || len1 > MAXLSNCOMPONENT || str[len1] != '/')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+			 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+	len2 = strspn(str + len1 + 1, "0123456789abcdefABCDEF");
+	if (len2 < 1 || len2 > MAXLSNCOMPONENT || str[len1 + 1 + len2] != '\0')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+			 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+}
+
+/*
+ * Compute the difference in bytes between two WAL locations.
+ */
+Datum
+pg_xlog_location_diff(PG_FUNCTION_ARGS)
+{
+	text		*location1 = PG_GETARG_TEXT_P(0);
+	text		*location2 = PG_GETARG_TEXT_P(1);
+	char		*str1, *str2;
+	XLogRecPtr	loc1, loc2;
+	Numeric		result;
+
+	/*
+	 * Read and parse input
+	 */
+	str1 = text_to_cstring(location1);
+	str2 = text_to_cstring(location2);
+
+	validate_xlog_location(str1);
+	validate_xlog_location(str2);
+
+	if (sscanf(str1, "%X/%X", &loc1.xlogid, &loc1.xrecoff) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str1)));
+	if (sscanf(str2, "%X/%X", &loc2.xlogid, &loc2.xrecoff) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str2)));
+
+	/*
+	 * Sanity check
+	 */
+	if (loc1.xrecoff > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc1.xrecoff, XLogFileSize)));
+	if (loc2.xrecoff > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc2.xrecoff, XLogFileSize)));
+
+	/*
+	 * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2
+	 */
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+							DirectFunctionCall1(int8_numeric, UInt32GetDatum(loc1.xlogid)),
+							DirectFunctionCall1(int8_numeric, UInt32GetDatum(loc2.xlogid))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+							DirectFunctionCall1(int8_numeric, UInt32GetDatum(XLogFileSize)),
+							NumericGetDatum(result)));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_add,
+							NumericGetDatum(result),
+							DirectFunctionCall1(int8_numeric, UInt32GetDatum(loc1.xrecoff))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+							NumericGetDatum(result),
+							DirectFunctionCall1(int8_numeric, UInt32GetDatum(loc2.xrecoff))));
+
+	PG_RETURN_NUMERIC(result);
+}
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index b81c156..c079a9a 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -281,5 +281,6 @@ extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
 extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
 
 #endif   /* XLOG_INTERNAL_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 8fc4ddb..4064a8a 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2915,6 +2915,8 @@ DATA(insert OID = 2850 ( pg_xlogfile_name_offset	PGNSP PGUID 12 1 0 0 0 f f f t
 DESCR("xlog filename and byte offset, given an xlog location");
 DATA(insert OID = 2851 ( pg_xlogfile_name			PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ pg_xlogfile_name _null_ _null_ _null_ ));
 DESCR("xlog filename, given an xlog location");
+DATA(insert OID = 3157 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 1700 "25 25" _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
+DESCR("difference in bytes, given two xlog locations");
 
 DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
#59Fujii Masao
masao.fujii@gmail.com
In reply to: Euler Taveira de Oliveira (#58)
Re: xlog location arithmetic

On Wed, Feb 8, 2012 at 2:29 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 26-01-2012 06:19, Fujii Masao wrote:

Thanks for your review. Comments below.

When I compiled the source with xlogdiff.patch, I got the following warnings.

xlogfuncs.c:511:2: warning: format '%lX' expects argument of type
'long unsigned int *', but argument 3 has type 'uint64 *' [-Wformat]

What is your compiler? I'm using gcc 4.6.2. I refactored the patch so I'm now
using XLogRecPtr and %X.

gcc version 4.6.1 (Ubuntu/Linaro 4.6.1-9ubuntu3)

$ uname -a
Linux hermes 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 15:59:53 UTC
2012 i686 i686 i386 GNU/Linux

In the updated version of the patch, I got no warnings at the compile time.
But initdb failed because the OID which you assigned to pg_xlog_location_diff()
has already been used for other function. So you need to update pg_proc.h.

postgres=# SELECT pg_xlog_location_diff('0/2000074', '0/2000074');
ERROR:  xrecoff "2000074" is out of valid range, 0..A4A534C

Ugh? I can't reproduce that. It seems to be related to long int used by the
prior version.

Maybe.

But another problem happened. When I changed pg_proc.h so that the unused
OID was assigned to pg_xlog_location_diff(), and executed the above again,
I encountered the segmentation fault:

LOG: server process (PID 14384) was terminated by signal 11: Segmentation fault
DETAIL: Failed process was running: SELECT
pg_xlog_location_diff('0/2000074', '0/2000074');
LOG: terminating any other active server processes

ISTM that the cause is that int8_numeric() is executed for uint32 value. We
should use int4_numeric(), instead?

While the output was int8 I could use
pg_size_pretty but now I couldn't. I attached another patch that implements
pg_size_pretty(numeric).

I realized that it collides with the pg_size_pretty(int8) if we don't specify
a type. Hence, I decided to drop the pg_size_pretty(int8) in favor of
pg_size_pretty(numeric). It is slower than the former but it is not a
performance critical function.

I'm OK with this.

-DATA(insert OID = 2288 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f
f t f v 1 0 25 "20" _null_ _null_ _null_ _null_ pg_size_pretty _null_
_null_ _null_ ));
-DESCR("convert a long int to a human readable text using size units");
+DATA(insert OID = 3158 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f
f t f v 1 0 25 "1700" _null_ _null_ _null_ _null_ pg_size_pretty
_null_ _null_ _null_ ));
+DESCR("convert a numeric to a human readable text using size units");

Why OID needs to be reassigned?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In reply to: Fujii Masao (#59)
1 attachment(s)
Re: xlog location arithmetic

On 08-02-2012 09:35, Fujii Masao wrote:

Fujii, new patch attached. Thanks for your tests.

But another problem happened. When I changed pg_proc.h so that the unused
OID was assigned to pg_xlog_location_diff(), and executed the above again,
I encountered the segmentation fault:

I reproduced the problems in my old 32-bit laptop. I fixed it casting to
int64. I also updated the duplicated OID.

Why OID needs to be reassigned?

There isn't a compelling reason. It is just a way to say: "hey, it is another
function with the same old name".

I'll not attach another version for pg_size_pretty because it is a matter of
updating a duplicated OID.

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

Attachments:

xlogdiff3.difftext/x-patch; name=xlogdiff3.diffDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 236a60a..826f002 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14446,11 +14446,15 @@ SELECT set_config('log_statement_stats', 'off', false);
    <indexterm>
     <primary>pg_xlogfile_name_offset</primary>
    </indexterm>
+   <indexterm>
+    <primary>pg_xlog_location_diff</primary>
+   </indexterm>
 
    <para>
     The functions shown in <xref
     linkend="functions-admin-backup-table"> assist in making on-line backups.
-    These functions cannot be executed during recovery.
+	These functions cannot be executed during recovery (except
+	<function>pg_xlog_location_diff</function>).
    </para>
 
    <table id="functions-admin-backup-table">
@@ -14518,6 +14522,13 @@ SELECT set_config('log_statement_stats', 'off', false);
        <entry><type>text</>, <type>integer</></entry>
        <entry>Convert transaction log location string to file name and decimal byte offset within file</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>pg_xlog_location_diff(<parameter>location</> <type>text</>, <parameter>location</> <type>text</>)</function></literal>
+        </entry>
+       <entry><type>numeric</></entry>
+       <entry>Calculate the difference between two transaction log locations</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -14611,6 +14622,13 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
    </para>
 
    <para>
+	<function>pg_xlog_location_diff</> calculates the difference in bytes
+	between two transaction log locations. It can be used with
+	<structname>pg_stat_replication</structname> or some functions shown in
+	<xref linkend="functions-admin-backup-table"> to get the replication lag.
+   </para>
+
+   <para>
     For details about proper usage of these functions, see
     <xref linkend="continuous-archiving">.
    </para>
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2e10d4d..be7d388 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "replication/walreceiver.h"
 #include "storage/smgr.h"
 #include "utils/builtins.h"
+#include "utils/numeric.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -465,3 +466,83 @@ pg_is_in_recovery(PG_FUNCTION_ARGS)
 {
 	PG_RETURN_BOOL(RecoveryInProgress());
 }
+
+static void
+validate_xlog_location(char *str)
+{
+#define	MAXLSNCOMPONENT		8
+
+	int	len1, len2;
+
+	len1 = strspn(str, "0123456789abcdefABCDEF");
+	if (len1 < 1 || len1 > MAXLSNCOMPONENT || str[len1] != '/')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+			 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+	len2 = strspn(str + len1 + 1, "0123456789abcdefABCDEF");
+	if (len2 < 1 || len2 > MAXLSNCOMPONENT || str[len1 + 1 + len2] != '\0')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+			 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+}
+
+/*
+ * Compute the difference in bytes between two WAL locations.
+ */
+Datum
+pg_xlog_location_diff(PG_FUNCTION_ARGS)
+{
+	text		*location1 = PG_GETARG_TEXT_P(0);
+	text		*location2 = PG_GETARG_TEXT_P(1);
+	char		*str1, *str2;
+	XLogRecPtr	loc1, loc2;
+	Numeric		result;
+
+	/*
+	 * Read and parse input
+	 */
+	str1 = text_to_cstring(location1);
+	str2 = text_to_cstring(location2);
+
+	validate_xlog_location(str1);
+	validate_xlog_location(str2);
+
+	if (sscanf(str1, "%X/%X", &loc1.xlogid, &loc1.xrecoff) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str1)));
+	if (sscanf(str2, "%X/%X", &loc2.xlogid, &loc2.xrecoff) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("could not parse transaction log location \"%s\"", str2)));
+
+	/*
+	 * Sanity check
+	 */
+	if (loc1.xrecoff > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc1.xrecoff, XLogFileSize)));
+	if (loc2.xrecoff > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+			errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc2.xrecoff, XLogFileSize)));
+
+	/*
+	 * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2
+	 */
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+							DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc1.xlogid)),
+							DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc2.xlogid))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+							DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) XLogFileSize)),
+							NumericGetDatum(result)));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_add,
+							NumericGetDatum(result),
+							DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc1.xrecoff))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+							NumericGetDatum(result),
+							DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc2.xrecoff))));
+
+	PG_RETURN_NUMERIC(result);
+}
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index b81c156..c079a9a 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -281,5 +281,6 @@ extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
 extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
 
 #endif   /* XLOG_INTERNAL_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index d926a88..f6b67b0 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2925,6 +2925,8 @@ DATA(insert OID = 2850 ( pg_xlogfile_name_offset	PGNSP PGUID 12 1 0 0 0 f f f t
 DESCR("xlog filename and byte offset, given an xlog location");
 DATA(insert OID = 2851 ( pg_xlogfile_name			PGNSP PGUID 12 1 0 0 0 f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ pg_xlogfile_name _null_ _null_ _null_ ));
 DESCR("xlog filename, given an xlog location");
+DATA(insert OID = 3165 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f t f i 2 0 1700 "25 25" _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
+DESCR("difference in bytes, given two xlog locations");
 
 DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
#61Fujii Masao
masao.fujii@gmail.com
In reply to: Euler Taveira de Oliveira (#60)
Re: xlog location arithmetic

On Fri, Feb 10, 2012 at 7:00 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 08-02-2012 09:35, Fujii Masao wrote:

Fujii, new patch attached. Thanks for your tests.

Thanks for the new patch!

But another problem happened. When I changed pg_proc.h so that the unused
OID was assigned to pg_xlog_location_diff(), and executed the above again,
I encountered the segmentation fault:

I reproduced the problems in my old 32-bit laptop. I fixed it casting to
int64. I also updated the duplicated OID.

Yep, in the updated patch, I could confirm that the function works fine without
any error in my machine. The patch looks fine to me except the following minor
comments:

In the document, it's better to explain clearly that the function subtracts the
second argument from the first.

-    These functions cannot be executed during recovery.
+	These functions cannot be executed during recovery (except
+	<function>pg_xlog_location_diff</function>).
+	<function>pg_xlog_location_diff</> calculates the difference in bytes
+	between two transaction log locations. It can be used with
+	<structname>pg_stat_replication</structname> or some functions shown in
+	<xref linkend="functions-admin-backup-table"> to get the replication lag.

Very minor comment: you should use spaces rather than a tab to indent each line.

Why OID needs to be reassigned?

There isn't a compelling reason. It is just a way to say: "hey, it is another
function with the same old name".

I'll not attach another version for pg_size_pretty because it is a matter of
updating a duplicated OID.

Okay, I reviewed the previous patch again. That looks fine to me.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#62Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#61)
1 attachment(s)
Re: xlog location arithmetic

On Fri, Feb 10, 2012 at 09:32, Fujii Masao <masao.fujii@gmail.com> wrote:

On Fri, Feb 10, 2012 at 7:00 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 08-02-2012 09:35, Fujii Masao wrote:

Fujii, new patch attached. Thanks for your tests.

Thanks for the new patch!

But another problem happened. When I changed pg_proc.h so that the unused
OID was assigned to pg_xlog_location_diff(), and executed the above again,
I encountered the segmentation fault:

I reproduced the problems in my old 32-bit laptop. I fixed it casting to
int64. I also updated the duplicated OID.

Yep, in the updated patch, I could confirm that the function works fine without
any error in my machine. The patch looks fine to me except the following minor
comments:

I started working on this one to commit it, and came up with a few things more.

Do we even *need* the validate_xlog_location() function? If we just
remove those calls, won't we still catch all the incorrectly formatted
ones in the errors of the sscanf() calls? Or am I too deep into
weekend-mode and missing something obvious?

I've also removed tabs in the documentation, fixed the merge confllict
in pg_proc.h that happened during the wait, and fixed some indentation
(updated patch with these changes attached).

But I'm going to hold off committing it until someone confirms I'm not
caught too deeply in weekend-mode and am missing something obvious in
the comment above about validate_xlog_location.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

xlogdiff4.difftext/x-patch; charset=US-ASCII; name=xlogdiff4.diffDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e8e637b..4ae76e2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14454,11 +14454,15 @@ SELECT set_config('log_statement_stats', 'off', false);
    <indexterm>
     <primary>pg_xlogfile_name_offset</primary>
    </indexterm>
+   <indexterm>
+    <primary>pg_xlog_location_diff</primary>
+   </indexterm>
 
    <para>
     The functions shown in <xref
     linkend="functions-admin-backup-table"> assist in making on-line backups.
-    These functions cannot be executed during recovery.
+    These functions cannot be executed during recovery (except
+    <function>pg_xlog_location_diff</function>).
    </para>
 
    <table id="functions-admin-backup-table">
@@ -14526,6 +14530,13 @@ SELECT set_config('log_statement_stats', 'off', false);
        <entry><type>text</>, <type>integer</></entry>
        <entry>Convert transaction log location string to file name and decimal byte offset within file</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>pg_xlog_location_diff(<parameter>location</> <type>text</>, <parameter>location</> <type>text</>)</function></literal>
+       </entry>
+       <entry><type>numeric</></entry>
+       <entry>Calculate the difference between two transaction log locations</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
@@ -14619,6 +14630,13 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
    </para>
 
    <para>
+    <function>pg_xlog_location_diff</> calculates the difference in bytes
+    between two transaction log locations. It can be used with
+    <structname>pg_stat_replication</structname> or some functions shown in
+    <xref linkend="functions-admin-backup-table"> to get the replication lag.
+   </para>
+
+   <para>
     For details about proper usage of these functions, see
     <xref linkend="continuous-archiving">.
    </para>
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 2e10d4d..b8f8152 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "replication/walreceiver.h"
 #include "storage/smgr.h"
 #include "utils/builtins.h"
+#include "utils/numeric.h"
 #include "utils/guc.h"
 #include "utils/timestamp.h"
 
@@ -465,3 +466,87 @@ pg_is_in_recovery(PG_FUNCTION_ARGS)
 {
 	PG_RETURN_BOOL(RecoveryInProgress());
 }
+
+static void
+validate_xlog_location(char *str)
+{
+#define MAXLSNCOMPONENT		8
+
+	int			len1,
+				len2;
+
+	len1 = strspn(str, "0123456789abcdefABCDEF");
+	if (len1 < 1 || len1 > MAXLSNCOMPONENT || str[len1] != '/')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+
+	len2 = strspn(str + len1 + 1, "0123456789abcdefABCDEF");
+	if (len2 < 1 || len2 > MAXLSNCOMPONENT || str[len1 + 1 + len2] != '\0')
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+				 errmsg("invalid input syntax for transaction log location: \"%s\"", str)));
+}
+
+/*
+ * Compute the difference in bytes between two WAL locations.
+ */
+Datum
+pg_xlog_location_diff(PG_FUNCTION_ARGS)
+{
+	text	   *location1 = PG_GETARG_TEXT_P(0);
+	text	   *location2 = PG_GETARG_TEXT_P(1);
+	char	   *str1,
+			   *str2;
+	XLogRecPtr	loc1,
+				loc2;
+	Numeric		result;
+
+	/*
+	 * Read and parse input
+	 */
+	str1 = text_to_cstring(location1);
+	str2 = text_to_cstring(location2);
+
+	validate_xlog_location(str1);
+	validate_xlog_location(str2);
+
+	if (sscanf(str1, "%X/%X", &loc1.xlogid, &loc1.xrecoff) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		   errmsg("could not parse transaction log location \"%s\"", str1)));
+	if (sscanf(str2, "%X/%X", &loc2.xlogid, &loc2.xrecoff) != 2)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		   errmsg("could not parse transaction log location \"%s\"", str2)));
+
+	/*
+	 * Sanity check
+	 */
+	if (loc1.xrecoff > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc1.xrecoff, XLogFileSize)));
+	if (loc2.xrecoff > XLogFileSize)
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("xrecoff \"%X\" is out of valid range, 0..%X", loc2.xrecoff, XLogFileSize)));
+
+	/*
+	 * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2
+	 */
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+	   DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc1.xlogid)),
+	 DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc2.xlogid))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+	  DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) XLogFileSize)),
+												 NumericGetDatum(result)));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_add,
+												 NumericGetDatum(result),
+	DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc1.xrecoff))));
+	result = DatumGetNumeric(DirectFunctionCall2(numeric_sub,
+												 NumericGetDatum(result),
+	DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) loc2.xrecoff))));
+
+	PG_RETURN_NUMERIC(result);
+}
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index b81c156..c079a9a 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -281,5 +281,6 @@ extern Datum pg_is_in_recovery(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_pause(PG_FUNCTION_ARGS);
 extern Datum pg_xlog_replay_resume(PG_FUNCTION_ARGS);
 extern Datum pg_is_xlog_replay_paused(PG_FUNCTION_ARGS);
+extern Datum pg_xlog_location_diff(PG_FUNCTION_ARGS);
 
 #endif   /* XLOG_INTERNAL_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 8700d0d..b848a27 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2928,6 +2928,9 @@ DESCR("xlog filename and byte offset, given an xlog location");
 DATA(insert OID = 2851 ( pg_xlogfile_name			PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 25 "25" _null_ _null_ _null_ _null_ pg_xlogfile_name _null_ _null_ _null_ ));
 DESCR("xlog filename, given an xlog location");
 
+DATA(insert OID = 3165 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 1700 "25 25" _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
+DESCR("difference in bytes, given two xlog locations");
+
 DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
 
In reply to: Magnus Hagander (#62)
Re: xlog location arithmetic

On 25-02-2012 09:23, Magnus Hagander wrote:

Do we even *need* the validate_xlog_location() function? If we just
remove those calls, won't we still catch all the incorrectly formatted
ones in the errors of the sscanf() calls? Or am I too deep into
weekend-mode and missing something obvious?

sscanf() is too fragile for input sanity check. Try
pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
that function if you protect xlog location input from silly users.

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#64Fujii Masao
masao.fujii@gmail.com
In reply to: Euler Taveira de Oliveira (#63)
Re: xlog location arithmetic

On Sun, Feb 26, 2012 at 8:53 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 25-02-2012 09:23, Magnus Hagander wrote:

Do we even *need* the validate_xlog_location() function? If we just
remove those calls, won't we still catch all the incorrectly formatted
ones in the errors of the sscanf() calls? Or am I too deep into
weekend-mode and missing something obvious?

sscanf() is too fragile for input sanity check. Try
pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
that function if you protect xlog location input from silly users.

After this patch will have been committed, it would be better to change
pg_xlogfile_name() and pg_xlogfile_name_offset() so that they use
the validate_xlog_location() function to validate the input.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#65Magnus Hagander
magnus@hagander.net
In reply to: Euler Taveira de Oliveira (#63)
Re: xlog location arithmetic

On Sun, Feb 26, 2012 at 00:53, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 25-02-2012 09:23, Magnus Hagander wrote:

Do we even *need* the validate_xlog_location() function? If we just
remove those calls, won't we still catch all the incorrectly formatted
ones in the errors of the sscanf() calls? Or am I too deep into
weekend-mode and missing something obvious?

sscanf() is too fragile for input sanity check. Try
pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
that function if you protect xlog location input from silly users.

Ah, good point. No, that's the reason I was missing :-)

Patch applied, thanks!

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#66Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#64)
Re: xlog location arithmetic

On Tue, Feb 28, 2012 at 07:21, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sun, Feb 26, 2012 at 8:53 AM, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 25-02-2012 09:23, Magnus Hagander wrote:

Do we even *need* the validate_xlog_location() function? If we just
remove those calls, won't we still catch all the incorrectly formatted
ones in the errors of the sscanf() calls? Or am I too deep into
weekend-mode and missing something obvious?

sscanf() is too fragile for input sanity check. Try
pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
that function if you protect xlog location input from silly users.

After this patch will have been committed, it would be better to change
pg_xlogfile_name() and pg_xlogfile_name_offset() so that they use
the validate_xlog_location() function to validate the input.

And I've done this part as well.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#67Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#65)
Re: xlog location arithmetic

On Sun, Mar 4, 2012 at 8:26 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Sun, Feb 26, 2012 at 00:53, Euler Taveira de Oliveira
<euler@timbira.com> wrote:

On 25-02-2012 09:23, Magnus Hagander wrote:

Do we even *need* the validate_xlog_location() function? If we just
remove those calls, won't we still catch all the incorrectly formatted
ones in the errors of the sscanf() calls? Or am I too deep into
weekend-mode and missing something obvious?

sscanf() is too fragile for input sanity check. Try
pg_xlog_location_diff('12/3', '-10/0'), for example. I won't object removing
that function if you protect xlog location input from silly users.

Ah, good point. No, that's the reason I was missing :-)

Patch applied, thanks!

Thanks for committing the patch!

Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint)
with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff())
succeeds. It's also worth committing this patch?
http://archives.postgresql.org/message-id/4F315F6C.8030700@timbira.com

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#68Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#67)
Re: xlog location arithmetic

Fujii Masao <masao.fujii@gmail.com> writes:

Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint)
with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff())
succeeds. It's also worth committing this patch?

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

regards, tom lane

#69Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#68)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 9:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint)
with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff())
succeeds. It's also worth committing this patch?

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

The point is that it would be useful to use it on the difference
between two xlog locations, but that is a numeric value, not int8,
because of signedness issues.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#68)
Re: xlog location arithmetic

I wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint)
with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff())
succeeds. It's also worth committing this patch?

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

Actually ... now that I look at it, isn't it completely bogus to be
using numeric for the result of pg_xlog_location_diff? There's no way
for the difference of two xlog locations to be anywhere near as wide as
64 bits. That'd only be possible if XLogFileSize exceeded 1GB, which we
don't let it get anywhere near.

regards, tom lane

#71Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#69)
Re: xlog location arithmetic

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 9, 2012 at 9:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why would it be useful to use pg_size_pretty on xlog locations?

The point is that it would be useful to use it on the difference
between two xlog locations,

Um, that is exactly the claim I was questioning. Why is that useful?

but that is a numeric value, not int8, because of signedness issues.

See my followup --- this statement appears factually incorrect,
whatever you may feel about the usefulness issue.

regards, tom lane

#72Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#70)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 9:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint)
with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff())
succeeds. It's also worth committing this patch?

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

Actually ... now that I look at it, isn't it completely bogus to be
using numeric for the result of pg_xlog_location_diff?  There's no way
for the difference of two xlog locations to be anywhere near as wide as
64 bits.  That'd only be possible if XLogFileSize exceeded 1GB, which we
don't let it get anywhere near.

rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0');
pg_xlog_location_diff
-----------------------
18374686475393433600
(1 row)

rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0')::int8;
ERROR: bigint out of range
STATEMENT: select pg_xlog_location_diff('ffffffff/0', '0/0')::int8;

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#73Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#72)
Re: xlog location arithmetic

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 9, 2012 at 9:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Actually ... now that I look at it, isn't it completely bogus to be
using numeric for the result of pg_xlog_location_diff?

rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0')::int8;
ERROR: bigint out of range

Oh ... I see my mistake. I was looking at this:

/*
* result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2
*/

and confusing XLogFileSize with XLogSegSize. Not the best choice of
names.

regards, tom lane

#74Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#73)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 9, 2012 at 9:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Actually ... now that I look at it, isn't it completely bogus to be
using numeric for the result of pg_xlog_location_diff?

rhaas=# select pg_xlog_location_diff('ffffffff/0', '0/0')::int8;
ERROR:  bigint out of range

Oh ... I see my mistake.  I was looking at this:

       /*
        * result = XLogFileSize * (xlogid1 - xlogid2) + xrecoff1 - xrecoff2
        */

and confusing XLogFileSize with XLogSegSize.  Not the best choice of
names.

Yeah, the use of XLogFile to mean something other than, well a file in
the xlog, is greatly annoying.. I guess we could change it, but it
goes pretty deep in the system so it's not a small change...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#75Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#68)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

Euler proposed one more patch upthread, which replaces pg_size_pretty(bigint)
with pg_size_pretty(numeric) so that pg_size_pretty(pg_xlog_location_diff())
succeeds. It's also worth committing this patch?

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

Given the expense, perhaps we need to different (overloaded) functions instead?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#76Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#75)
Re: xlog location arithmetic

Magnus Hagander <magnus@hagander.net> writes:

On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

Given the expense, perhaps we need to different (overloaded) functions instead?

That would be a workable solution, but I continue to not believe that
this is useful enough to be worth the trouble.

regards, tom lane

#77Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#76)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

Given the expense, perhaps we need to different (overloaded) functions instead?

That would be a workable solution, but I continue to not believe that
this is useful enough to be worth the trouble.

There's certainly some use to being able to prettify it. Wouldn't a
pg_size_pretty(numeric) also be useful if you want to pg_size_() a
sum() of something? Used on files it doesn't make too much sense,
given how big those files have to be, but it can be used on other
things as well...

I can see a usecase for having a pg_size_pretty(numeric) as an option.
Not necessarily a very big one, but a >0 one.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#78Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#77)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 12:38 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Fri, Mar 9, 2012 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

Given the expense, perhaps we need to different (overloaded) functions instead?

That would be a workable solution, but I continue to not believe that
this is useful enough to be worth the trouble.

There's certainly some use to being able to prettify it. Wouldn't a
pg_size_pretty(numeric) also be useful if you want to pg_size_() a
sum() of something? Used on files it doesn't make too much sense,
given how big those files have to be, but it can be used on other
things as well...

I can see a usecase for having a pg_size_pretty(numeric) as an option.
Not necessarily a very big one, but a >0 one.

+1.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#79Peter Eisentraut
peter_e@gmx.net
In reply to: Magnus Hagander (#74)
Re: xlog location arithmetic

On fre, 2012-03-09 at 18:13 +0100, Magnus Hagander wrote:

and confusing XLogFileSize with XLogSegSize. Not the best choice of
names.

Yeah, the use of XLogFile to mean something other than, well a file in
the xlog, is greatly annoying.. I guess we could change it, but it
goes pretty deep in the system so it's not a small change...

The whole thing was built around the lack of 64 bit integers. If we bit
the bullet and changed the whole thing to be just a single 64-bit
counter, we could probably delete thousands of lines of code.

#80Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#79)
Re: xlog location arithmetic

Peter Eisentraut <peter_e@gmx.net> writes:

Yeah, the use of XLogFile to mean something other than, well a file in
the xlog, is greatly annoying.. I guess we could change it, but it
goes pretty deep in the system so it's not a small change...

The whole thing was built around the lack of 64 bit integers. If we bit
the bullet and changed the whole thing to be just a single 64-bit
counter, we could probably delete thousands of lines of code.

Hm. I think "thousands" is an overestimate, but yeah the logic could be
greatly simplified. However, I'm not sure we could avoid breaking the
existing naming convention for WAL files. How much do we care about
that?

regards, tom lane

#81Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#80)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Yeah, the use of XLogFile to mean something other than, well a file in
the xlog, is greatly annoying.. I guess we could change it, but it
goes pretty deep in the system so it's not a small change...

The whole thing was built around the lack of 64 bit integers.  If we bit
the bullet and changed the whole thing to be just a single 64-bit
counter, we could probably delete thousands of lines of code.

Hm.  I think "thousands" is an overestimate, but yeah the logic could be
greatly simplified.  However, I'm not sure we could avoid breaking the
existing naming convention for WAL files.  How much do we care about
that?

Probably not very much, since WAL files aren't portable across major
versions anyway. But I don't see why you couldn't keep the naming
convention - there's nothing to prevent you from converting a 64-bit
integer back into two 32-bit integers if and where needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#82Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#80)
Re: xlog location arithmetic

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

The whole thing was built around the lack of 64 bit integers. If
we bit the bullet and changed the whole thing to be just a single
64-bit counter, we could probably delete thousands of lines of
code.

Hm. I think "thousands" is an overestimate, but yeah the logic
could be greatly simplified. However, I'm not sure we could avoid
breaking the existing naming convention for WAL files. How much
do we care about that?

We have a few scripts in our backup area that are based around the
current WAL file naming convention, so there would be some impact;
but I have to believe it would be pretty minor. Most of the pain
would be related to the need to support both naming conventions for
some transition period. If it simplifies the WAL-related logic, it
seems well worth it to me. We just have to know it's coming and be
clear on what the new naming rules are.

-Kevin

#83Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#81)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Yeah, the use of XLogFile to mean something other than, well a file in
the xlog, is greatly annoying.. I guess we could change it, but it
goes pretty deep in the system so it's not a small change...

The whole thing was built around the lack of 64 bit integers.  If we bit
the bullet and changed the whole thing to be just a single 64-bit
counter, we could probably delete thousands of lines of code.

Hm.  I think "thousands" is an overestimate, but yeah the logic could be
greatly simplified.  However, I'm not sure we could avoid breaking the
existing naming convention for WAL files.  How much do we care about
that?

Probably not very much, since WAL files aren't portable across major
versions anyway.  But I don't see why you couldn't keep the naming
convention - there's nothing to prevent you from converting a 64-bit
integer back into two 32-bit integers if and where needed.

On further reflection, this seems likely to break quite a few
third-party tools. Maybe it'd be worth it anyway, but it definitely
seems like it would be worth going to at least some minor trouble to
avoid it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#84Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#83)
Re: xlog location arithmetic

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. �I think "thousands" is an overestimate, but yeah the logic could be
greatly simplified. �However, I'm not sure we could avoid breaking the
existing naming convention for WAL files. �How much do we care about
that?

Probably not very much, since WAL files aren't portable across major
versions anyway. �But I don't see why you couldn't keep the naming
convention - there's nothing to prevent you from converting a 64-bit
integer back into two 32-bit integers if and where needed.

On further reflection, this seems likely to break quite a few
third-party tools. Maybe it'd be worth it anyway, but it definitely
seems like it would be worth going to at least some minor trouble to
avoid it.

The main actual simplification would be in getting rid of the "hole"
at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h:

/*
* We break each logical log file (xlogid value) into segment files of the
* size indicated by XLOG_SEG_SIZE. One possible segment at the end of each
* log file is wasted, to ensure that we don't have problems representing
* last-byte-position-plus-1.
*/
#define XLogSegSize ((uint32) XLOG_SEG_SIZE)
#define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)
#define XLogFileSize (XLogSegsPerFile * XLogSegSize)

If we can't get rid of that and have a continuous 64-bit WAL address
space then it's unlikely we can actually simplify any logic.

Now, doing that doesn't break the naming convention exactly; what it
changes is that there will be WAL files numbered xxxFFFF (for some
number of trailing-1-bits I'm too lazy to work out at the moment) where
before there were not. So the question really is how much external code
there is that is aware of that specific noncontiguous numbering behavior
and would be broken if things stopped being that way.

regards, tom lane

#85Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#84)
Re: xlog location arithmetic

On Fri, Mar 9, 2012 at 3:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm.  I think "thousands" is an overestimate, but yeah the logic could be
greatly simplified.  However, I'm not sure we could avoid breaking the
existing naming convention for WAL files.  How much do we care about
that?

Probably not very much, since WAL files aren't portable across major
versions anyway.  But I don't see why you couldn't keep the naming
convention - there's nothing to prevent you from converting a 64-bit
integer back into two 32-bit integers if and where needed.

On further reflection, this seems likely to break quite a few
third-party tools.  Maybe it'd be worth it anyway, but it definitely
seems like it would be worth going to at least some minor trouble to
avoid it.

The main actual simplification would be in getting rid of the "hole"
at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h:

/*
 * We break each logical log file (xlogid value) into segment files of the
 * size indicated by XLOG_SEG_SIZE.  One possible segment at the end of each
 * log file is wasted, to ensure that we don't have problems representing
 * last-byte-position-plus-1.
 */
#define XLogSegSize             ((uint32) XLOG_SEG_SIZE)
#define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)
#define XLogFileSize    (XLogSegsPerFile * XLogSegSize)

If we can't get rid of that and have a continuous 64-bit WAL address
space then it's unlikely we can actually simplify any logic.

Now, doing that doesn't break the naming convention exactly; what it
changes is that there will be WAL files numbered xxxFFFF (for some
number of trailing-1-bits I'm too lazy to work out at the moment) where
before there were not.  So the question really is how much external code
there is that is aware of that specific noncontiguous numbering behavior
and would be broken if things stopped being that way.

I would expect that most things would NOT know about that particular
foible, and just be matching pathnames on an RE, which should be fine.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#86Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#84)
Re: xlog location arithmetic

On Fri, Mar 09, 2012 at 03:04:23PM -0500, Tom Lane wrote:

The main actual simplification would be in getting rid of the "hole"
at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h:

/*
* We break each logical log file (xlogid value) into segment files of the
* size indicated by XLOG_SEG_SIZE. One possible segment at the end of each
* log file is wasted, to ensure that we don't have problems representing
* last-byte-position-plus-1.
*/
#define XLogSegSize ((uint32) XLOG_SEG_SIZE)
#define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)
#define XLogFileSize (XLogSegsPerFile * XLogSegSize)

If we can't get rid of that and have a continuous 64-bit WAL address
space then it's unlikely we can actually simplify any logic.

Now, doing that doesn't break the naming convention exactly; what it
changes is that there will be WAL files numbered xxxFFFF (for some
number of trailing-1-bits I'm too lazy to work out at the moment) where
before there were not. So the question really is how much external code
there is that is aware of that specific noncontiguous numbering behavior
and would be broken if things stopped being that way.

Our current WAL naming is hopelessly arcane, and we would certainly be
benfitting users to simplify it. Is this a TODO?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#87Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#78)
1 attachment(s)
Re: xlog location arithmetic

On Sat, Mar 10, 2012 at 3:23 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 9, 2012 at 12:38 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Fri, Mar 9, 2012 at 18:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Fri, Mar 9, 2012 at 15:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Why would it be useful to use pg_size_pretty on xlog locations?
-1 because of the large expense of bigint->numeric->whatever conversion
that would be added to existing uses.

Given the expense, perhaps we need to different (overloaded) functions instead?

Agreed. Attached patch introduces the overloaded funtion
pg_size_pretty(numeric).

That would be a workable solution, but I continue to not believe that
this is useful enough to be worth the trouble.

There's certainly some use to being able to prettify it. Wouldn't a
pg_size_pretty(numeric) also be useful if you want to pg_size_() a
sum() of something? Used on files it doesn't make too much sense,
given how big those files have to be, but it can be used on other
things as well...

I can see a usecase for having a pg_size_pretty(numeric) as an option.
Not necessarily a very big one, but a >0 one.

+1.

+1, too.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

sizepretty_v3.patchtext/x-diff; charset=US-ASCII; name=sizepretty_v3.patchDownload
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 14989,14995 **** postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
        </row>
        <row>
         <entry>
!         <literal><function>pg_size_pretty(<type>bigint</type>)</function></literal>
          </entry>
         <entry><type>text</type></entry>
         <entry>Converts a size in bytes into a human-readable format with size units</entry>
--- 14989,14995 ----
        </row>
        <row>
         <entry>
!         <literal><function>pg_size_pretty(<type>bigint</type> or <type>numeric</type>)</function></literal>
          </entry>
         <entry><type>text</type></entry>
         <entry>Converts a size in bytes into a human-readable format with size units</entry>
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relmapper.h"
  #include "utils/syscache.h"
***************
*** 550,555 **** pg_size_pretty(PG_FUNCTION_ARGS)
--- 551,652 ----
  	PG_RETURN_TEXT_P(cstring_to_text(buf));
  }
  
+ Datum
+ pg_size_pretty_numeric(PG_FUNCTION_ARGS)
+ {
+ 	Numeric		size = PG_GETARG_NUMERIC(0);
+ 	Numeric		limit, limit2;
+ 	char		*buf, *result;
+ 
+ 	limit = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) (10 * 1024))));
+ 	limit2 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) (10 * 1024 * 2 - 1))));
+ 
+ 	if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit))))
+ 	{
+ 		buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+ 		result = palloc(strlen(buf) + 7);
+ 		strcpy(result, buf);
+ 		strcat(result, " bytes");
+ 	}
+ 	else
+ 	{
+ 		Numeric		arg2;
+ 
+ 		/* keep one extra bit for rounding */
+ 		/* size >>= 9 */
+ 		arg2 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) pow(2, 9))));
+ 		size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg2)));
+ 
+ 		if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+ 		{
+ 			/* size = (size + 1) / 2 */
+ 			size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size),
+ 													   DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+ 			size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size),
+ 													   DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+ 			buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+ 			result = palloc(strlen(buf) + 4);
+ 			strcpy(result, buf);
+ 			strcat(result, " kB");
+ 		}
+ 		else
+ 		{
+ 			Numeric		arg3;
+ 
+ 			/* size >>= 10 */
+ 			arg3 = DatumGetNumeric(DirectFunctionCall1(int8_numeric, Int64GetDatum((int64) pow(2, 10))));
+ 			size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+ 
+ 			if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+ 			{
+ 				/* size = (size + 1) / 2 */
+ 				size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size),
+ 														   DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+ 				size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size),
+ 														   DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+ 				buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+ 				result = palloc(strlen(buf) + 4);
+ 				strcpy(result, buf);
+ 				strcat(result, " MB");
+ 			}
+ 			else
+ 			{
+ 				/* size >>= 10 */
+ 				size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+ 
+ 				if (DatumGetBool(DirectFunctionCall2(numeric_lt, NumericGetDatum(size), NumericGetDatum(limit2))))
+ 				{
+ 					/* size = (size + 1) / 2 */
+ 					size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size),
+ 															   DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+ 					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size),
+ 															   DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+ 					buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+ 					result = palloc(strlen(buf) + 4);
+ 					strcpy(result, buf);
+ 					strcat(result, " GB");
+ 				}
+ 				else
+ 				{
+ 					/* size = (size + 1) / 2 */
+ 					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size), NumericGetDatum(arg3)));
+ 					size = DatumGetNumeric(DirectFunctionCall2(numeric_add, NumericGetDatum(size),
+ 															   DirectFunctionCall1(int8_numeric, Int64GetDatum(1))));
+ 					size = DatumGetNumeric(DirectFunctionCall2(numeric_div_trunc, NumericGetDatum(size),
+ 															   DirectFunctionCall1(int8_numeric, Int64GetDatum(2))));
+ 
+ 					buf = DatumGetCString(DirectFunctionCall1(numeric_out, NumericGetDatum(size)));
+ 					result = palloc(strlen(buf) + 4);
+ 					strcpy(result, buf);
+ 					strcat(result, " TB");
+ 				}
+ 			}
+ 		}
+ 	}
+ 
+ 	PG_RETURN_TEXT_P(cstring_to_text(result));
+ }
+ 
  /*
   * Get the filenode of a relation
   *
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 2938,2943 **** DESCR("xlog filename, given an xlog location");
--- 2938,2945 ----
  
  DATA(insert OID = 3165 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 1700 "25 25" _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
  DESCR("difference in bytes, given two xlog locations");
+ DATA(insert OID = 3166 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 25 "1700" _null_ _null_ _null_ _null_ pg_size_pretty_numeric _null_ _null_ _null_ ));
+ DESCR("convert a numeric to a human readable text using size units");
  
  DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
  DESCR("export a snapshot");
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
***************
*** 453,458 **** extern Datum pg_database_size_name(PG_FUNCTION_ARGS);
--- 453,459 ----
  extern Datum pg_relation_size(PG_FUNCTION_ARGS);
  extern Datum pg_total_relation_size(PG_FUNCTION_ARGS);
  extern Datum pg_size_pretty(PG_FUNCTION_ARGS);
+ extern Datum pg_size_pretty_numeric(PG_FUNCTION_ARGS);
  extern Datum pg_table_size(PG_FUNCTION_ARGS);
  extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
  extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);
#88Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#84)
Re: xlog location arithmetic

On Sat, Mar 10, 2012 at 5:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 9, 2012 at 2:34 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Mar 9, 2012 at 2:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm.  I think "thousands" is an overestimate, but yeah the logic could be
greatly simplified.  However, I'm not sure we could avoid breaking the
existing naming convention for WAL files.  How much do we care about
that?

Probably not very much, since WAL files aren't portable across major
versions anyway.  But I don't see why you couldn't keep the naming
convention - there's nothing to prevent you from converting a 64-bit
integer back into two 32-bit integers if and where needed.

On further reflection, this seems likely to break quite a few
third-party tools.  Maybe it'd be worth it anyway, but it definitely
seems like it would be worth going to at least some minor trouble to
avoid it.

The main actual simplification would be in getting rid of the "hole"
at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h:

/*
 * We break each logical log file (xlogid value) into segment files of the
 * size indicated by XLOG_SEG_SIZE.  One possible segment at the end of each
 * log file is wasted, to ensure that we don't have problems representing
 * last-byte-position-plus-1.
 */
#define XLogSegSize             ((uint32) XLOG_SEG_SIZE)
#define XLogSegsPerFile (((uint32) 0xffffffff) / XLogSegSize)
#define XLogFileSize    (XLogSegsPerFile * XLogSegSize)

If we can't get rid of that and have a continuous 64-bit WAL address
space then it's unlikely we can actually simplify any logic.

Now, doing that doesn't break the naming convention exactly; what it
changes is that there will be WAL files numbered xxxFFFF (for some
number of trailing-1-bits I'm too lazy to work out at the moment) where
before there were not.  So the question really is how much external code
there is that is aware of that specific noncontiguous numbering behavior
and would be broken if things stopped being that way.

A page header contains WAL location, so getting rid of "hole" seems to
break pg_upgrade. No? Unless pg_upgrade converts noncontinuous
location to continuous one, we still need to handle noncontinuous one
after upgrade.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#89Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fujii Masao (#88)
Re: xlog location arithmetic

Fujii Masao <masao.fujii@gmail.com> writes:

On Sat, Mar 10, 2012 at 5:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The main actual simplification would be in getting rid of the "hole"
at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h:
If we can't get rid of that and have a continuous 64-bit WAL address
space then it's unlikely we can actually simplify any logic.
...
Now, doing that doesn't break the naming convention exactly; what it
changes is that there will be WAL files numbered xxxFFFF (for some
number of trailing-1-bits I'm too lazy to work out at the moment) where
before there were not. So the question really is how much external code
there is that is aware of that specific noncontiguous numbering behavior
and would be broken if things stopped being that way.

A page header contains WAL location, so getting rid of "hole" seems to
break pg_upgrade. No?

No, why would it do that? The meaning and ordering of WAL addresses is
the same as before. The only difference is that after the upgrade, the
system will stop skipping over 16MB of potentially usable WAL addresses
at the end of each subsequently-used 4GB of space. The holes before
the switchover point are still holes, but that doesn't matter.

regards, tom lane

#90Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#89)
Re: xlog location arithmetic

On Tue, Mar 13, 2012 at 12:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fujii Masao <masao.fujii@gmail.com> writes:

On Sat, Mar 10, 2012 at 5:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The main actual simplification would be in getting rid of the "hole"
at the end of each 4GB worth of WAL, cf this bit in xlog_internal.h:
If we can't get rid of that and have a continuous 64-bit WAL address
space then it's unlikely we can actually simplify any logic.
...
Now, doing that doesn't break the naming convention exactly; what it
changes is that there will be WAL files numbered xxxFFFF (for some
number of trailing-1-bits I'm too lazy to work out at the moment) where
before there were not.  So the question really is how much external code
there is that is aware of that specific noncontiguous numbering behavior
and would be broken if things stopped being that way.

A page header contains WAL location, so getting rid of "hole" seems to
break pg_upgrade. No?

No, why would it do that?  The meaning and ordering of WAL addresses is
the same as before.  The only difference is that after the upgrade, the
system will stop skipping over 16MB of potentially usable WAL addresses
at the end of each subsequently-used 4GB of space.  The holes before
the switchover point are still holes, but that doesn't matter.

Oh, I see. You're right.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#91Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#87)
1 attachment(s)
Re: xlog location arithmetic

On Mon, Mar 12, 2012 at 10:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

I can see a usecase for having a pg_size_pretty(numeric) as an option.
Not necessarily a very big one, but a >0 one.

+1.

+1, too.

I did some beautification of this patch. I think the attached version
is cleaner and easier to read. Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

sizepretty_v4.patchapplication/octet-stream; name=sizepretty_v4.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 94ef2f0..3b25b85 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14994,7 +14994,20 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
         <literal><function>pg_size_pretty(<type>bigint</type>)</function></literal>
         </entry>
        <entry><type>text</type></entry>
-       <entry>Converts a size in bytes into a human-readable format with size units</entry>
+       <entry>
+         Converts a size in bytes expressed as a 64-bit integer into a
+         human-readable format with size units
+       </entry>
+      </row>
+      <row>
+       <entry>
+        <literal><function>pg_size_pretty(<type>numeric</type>)</function></literal>
+        </entry>
+       <entry><type>text</type></entry>
+       <entry>
+         Converts a size in bytes expressed as a numeric value into a
+         human-readable format with size units
+       </entry>
       </row>
       <row>
        <entry>
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index 26a8c01..fd19de7 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -24,6 +24,7 @@
 #include "storage/fd.h"
 #include "utils/acl.h"
 #include "utils/builtins.h"
+#include "utils/numeric.h"
 #include "utils/rel.h"
 #include "utils/relmapper.h"
 #include "utils/syscache.h"
@@ -550,6 +551,137 @@ pg_size_pretty(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(buf));
 }
 
+static char *
+numeric_to_cstring(Numeric n)
+{
+	Datum		d = NumericGetDatum(n);
+	return DatumGetCString(DirectFunctionCall1(numeric_out, d));
+}
+
+static Numeric
+int64_to_numeric(int64 v)
+{
+	Datum		d = Int64GetDatum(v);
+	return DatumGetNumeric(DirectFunctionCall1(int8_numeric, d));
+}
+
+static bool
+numeric_is_less(Numeric a, Numeric b)
+{
+	Datum		da = NumericGetDatum(a);
+	Datum		db = NumericGetDatum(b);
+
+	return DatumGetBool(DirectFunctionCall2(numeric_lt, da, db));
+}
+
+static Numeric
+numeric_plus_one_over_two(Numeric n)
+{
+	Datum		d = NumericGetDatum(n);
+	Datum		one;
+	Datum		two;
+	Datum		result;
+
+	one = DirectFunctionCall1(int8_numeric, Int64GetDatum(1));
+	two = DirectFunctionCall1(int8_numeric, Int64GetDatum(2));
+	result = DirectFunctionCall2(numeric_add, d, one);
+	result = DirectFunctionCall2(numeric_div_trunc, result, two);
+	return DatumGetNumeric(result);
+}
+
+static Numeric
+numeric_shift_right(Numeric n, unsigned count)
+{
+	Datum		d = NumericGetDatum(n);
+	Datum		divisor_int64;
+	Datum		divisor_numeric;
+	Datum		result;
+
+	divisor_int64 = Int64GetDatum((int64) (1 << count));
+	divisor_numeric = DirectFunctionCall1(int8_numeric, divisor_int64);
+	result = DirectFunctionCall2(numeric_div_trunc, d, divisor_numeric);
+	return DatumGetNumeric(result);
+}
+
+Datum
+pg_size_pretty_numeric(PG_FUNCTION_ARGS)
+{
+	Numeric		size = PG_GETARG_NUMERIC(0);
+	Numeric		limit,
+				limit2;
+	char	   *buf,
+			   *result;
+
+	limit = int64_to_numeric(10 * 1024);
+	limit2 = int64_to_numeric(10 * 1024 * 2 - 1);
+
+	if (numeric_is_less(size, limit))
+	{
+		buf = numeric_to_cstring(size);
+		result = palloc(strlen(buf) + 7);
+		strcpy(result, buf);
+		strcat(result, " bytes");
+	}
+	else
+	{
+		/* keep one extra bit for rounding */
+		/* size >>= 9 */
+		size = numeric_shift_right(size, 9);
+
+		if (numeric_is_less(size, limit2))
+		{
+			/* size = (size + 1) / 2 */
+			size = numeric_plus_one_over_two(size);
+			buf = numeric_to_cstring(size);
+			result = palloc(strlen(buf) + 4);
+			strcpy(result, buf);
+			strcat(result, " kB");
+		}
+		else
+		{
+			/* size >>= 10 */
+			size = numeric_shift_right(size, 10);
+			if (numeric_is_less(size, limit2))
+			{
+				/* size = (size + 1) / 2 */
+				size = numeric_plus_one_over_two(size);
+				buf = numeric_to_cstring(size);
+				result = palloc(strlen(buf) + 4);
+				strcpy(result, buf);
+				strcat(result, " MB");
+			}
+			else
+			{
+				/* size >>= 10 */
+				size = numeric_shift_right(size, 10);
+
+				if (numeric_is_less(size, limit2))
+				{
+					/* size = (size + 1) / 2 */
+					size = numeric_plus_one_over_two(size);
+					buf = numeric_to_cstring(size);
+					result = palloc(strlen(buf) + 4);
+					strcpy(result, buf);
+					strcat(result, " GB");
+				}
+				else
+				{
+					/* size >>= 10 */
+					size = numeric_shift_right(size, 10);
+					/* size = (size + 1) / 2 */
+					size = numeric_plus_one_over_two(size);
+					buf = numeric_to_cstring(size);
+					result = palloc(strlen(buf) + 4);
+					strcpy(result, buf);
+					strcat(result, " TB");
+				}
+			}
+		}
+	}
+
+	PG_RETURN_TEXT_P(cstring_to_text(result));
+}
+
 /*
  * Get the filenode of a relation
  *
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 6414b33..4854e20 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2947,6 +2947,8 @@ DESCR("xlog filename, given an xlog location");
 
 DATA(insert OID = 3165 ( pg_xlog_location_diff		PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 1700 "25 25" _null_ _null_ _null_ _null_ pg_xlog_location_diff _null_ _null_ _null_ ));
 DESCR("difference in bytes, given two xlog locations");
+DATA(insert OID = 3166 ( pg_size_pretty			PGNSP PGUID 12 1 0 0 0 f f f f t f v 1 0 25 "1700" _null_ _null_ _null_ _null_ pg_size_pretty_numeric _null_ _null_ _null_ ));
+DESCR("convert a numeric to a human readable text using size units");
 
 DATA(insert OID = 3809 ( pg_export_snapshot		PGNSP PGUID 12 1 0 0 0 f f f f t f v 0 0 25 "" _null_ _null_ _null_ _null_ pg_export_snapshot _null_ _null_ _null_ ));
 DESCR("export a snapshot");
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 9fda7ad..312b5e9 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -453,6 +453,7 @@ extern Datum pg_database_size_name(PG_FUNCTION_ARGS);
 extern Datum pg_relation_size(PG_FUNCTION_ARGS);
 extern Datum pg_total_relation_size(PG_FUNCTION_ARGS);
 extern Datum pg_size_pretty(PG_FUNCTION_ARGS);
+extern Datum pg_size_pretty_numeric(PG_FUNCTION_ARGS);
 extern Datum pg_table_size(PG_FUNCTION_ARGS);
 extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
 extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);
#92Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#91)
Re: xlog location arithmetic

On Sat, Apr 14, 2012 at 5:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 12, 2012 at 10:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

I can see a usecase for having a pg_size_pretty(numeric) as an option.
Not necessarily a very big one, but a >0 one.

+1.

+1, too.

I did some beautification of this patch.  I think the attached version
is cleaner and easier to read.  Thoughts?

Looks good to me. Thanks for polishing the patch!

Regards,

--
Fujii Masao

#93Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#92)
Re: xlog location arithmetic

On Sat, Apr 14, 2012 at 7:25 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Sat, Apr 14, 2012 at 5:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Mar 12, 2012 at 10:34 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

I can see a usecase for having a pg_size_pretty(numeric) as an option.
Not necessarily a very big one, but a >0 one.

+1.

+1, too.

I did some beautification of this patch.  I think the attached version
is cleaner and easier to read.  Thoughts?

Looks good to me. Thanks for polishing the patch!

You're welcome. Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company