number of loaded/unloaded COPY rows

Started by Volkan YAZICIabout 20 years ago6 messages
#1Volkan YAZICI
yazicivo@ttnet.net.tr
1 attachment(s)

I prepared a patch for "Have COPY return the number of rows
loaded/unloaded?" TODO. (Sorry for disturbing list with such a simple
topic, but per warning from Bruce Momjian, I send my proposal to -hackers
first.)

I used the "appending related information to commandTag" method which is
used for INSERT/UPDATE/DELETE/FETCH commands too. Furthermore, I edited
libpq to make PQcmdTuples() interpret affected rows from cmdStatus value
for COPY command. (Changes don't cause any compatibility problems for API
and seems like work with triggers too.)

One of the problems related with the used concept is trying to encapsulate
processed number of rows within an uint32 variable. This causes an internal
limit for counting COPY when we think it can process billions of rows. I
couldn't find a solution for this. (Maybe, two uint32 can be used to store
row count.) But other processed row counters (like INSERT/UPDATE) uses
uint32 too.

What's your suggestions and comments?

Regards.

Attachments:

copy_cmdstatus.patchtext/plain; charset=us-asciiDownload
Index: src/backend/commands/copy.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.255
diff -u -r1.255 copy.c
--- src/backend/commands/copy.c	22 Nov 2005 18:17:08 -0000	1.255
+++ src/backend/commands/copy.c	12 Dec 2005 17:18:44 -0000
@@ -102,6 +102,7 @@
 	int			client_encoding;	/* remote side's character encoding */
 	bool		need_transcoding;		/* client encoding diff from server? */
 	bool		client_only_encoding;	/* encoding not valid on server? */
+	uint32		processed;				/* # of tuples processed */
 
 	/* parameters from the COPY command */
 	Relation	rel;			/* relation to copy to or from */
@@ -646,7 +647,7 @@
  * Do not allow the copy if user doesn't have proper permission to access
  * the table.
  */
-void
+uint32
 DoCopy(const CopyStmt *stmt)
 {
 	CopyState	cstate;
@@ -660,6 +661,7 @@
 	AclMode		required_access = (is_from ? ACL_INSERT : ACL_SELECT);
 	AclResult	aclresult;
 	ListCell   *option;
+	uint32		processed;
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyStateData *) palloc0(sizeof(CopyStateData));
@@ -935,7 +937,7 @@
 	initStringInfo(&cstate->line_buf);
 	cstate->line_buf_converted = false;
 	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-	cstate->raw_buf_index = cstate->raw_buf_len = 0;
+	cstate->raw_buf_index = cstate->raw_buf_len = cstate->processed = 0;
 
 	/* Set up encoding conversion info */
 	cstate->client_encoding = pg_get_client_encoding();
@@ -1080,7 +1082,10 @@
 	pfree(cstate->attribute_buf.data);
 	pfree(cstate->line_buf.data);
 	pfree(cstate->raw_buf);
+
+	processed = cstate->processed;
 	pfree(cstate);
+	return processed;
 }
 
 
@@ -1310,6 +1315,8 @@
 								 VARSIZE(outputbytes) - VARHDRSZ);
 				}
 			}
+
+			cstate->processed++;
 		}
 
 		CopySendEndOfRow(cstate);
@@ -1916,6 +1923,8 @@
 
 			/* AFTER ROW INSERT Triggers */
 			ExecARInsertTriggers(estate, resultRelInfo, tuple);
+
+			cstate->processed++;
 		}
 	}
 
Index: src/backend/tcop/utility.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/utility.c,v
retrieving revision 1.250
diff -u -r1.250 utility.c
--- src/backend/tcop/utility.c	29 Nov 2005 01:25:49 -0000	1.250
+++ src/backend/tcop/utility.c	12 Dec 2005 17:18:45 -0000
@@ -640,7 +640,12 @@
 			break;
 
 		case T_CopyStmt:
-			DoCopy((CopyStmt *) parsetree);
+			{
+				uint32	processed = DoCopy((CopyStmt *) parsetree);
+				
+				snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
+						 "COPY %u", processed);
+			}
 			break;
 
 		case T_PrepareStmt:
Index: src/include/commands/copy.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/commands/copy.h,v
retrieving revision 1.25
diff -u -r1.25 copy.h
--- src/include/commands/copy.h	31 Dec 2004 22:03:28 -0000	1.25
+++ src/include/commands/copy.h	12 Dec 2005 17:19:07 -0000
@@ -17,6 +17,6 @@
 #include "nodes/parsenodes.h"
 
 
-extern void DoCopy(const CopyStmt *stmt);
+extern uint32 DoCopy(const CopyStmt *stmt);
 
 #endif   /* COPY_H */
Index: src/interfaces/libpq/fe-exec.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.177
diff -u -r1.177 fe-exec.c
--- src/interfaces/libpq/fe-exec.c	22 Nov 2005 18:17:32 -0000	1.177
+++ src/interfaces/libpq/fe-exec.c	12 Dec 2005 17:18:48 -0000
@@ -2177,7 +2177,7 @@
 char *
 PQcmdTuples(PGresult *res)
 {
-	char	   *p;
+	char	   *p, *c;
 
 	if (!res)
 		return "";
@@ -2195,7 +2195,8 @@
 		p = res->cmdStatus + 6;
 	else if (strncmp(res->cmdStatus, "FETCH ", 6) == 0)
 		p = res->cmdStatus + 5;
-	else if (strncmp(res->cmdStatus, "MOVE ", 5) == 0)
+	else if (strncmp(res->cmdStatus, "MOVE ", 5) == 0 ||
+			 strncmp(res->cmdStatus, "COPY ", 5) == 0)
 		p = res->cmdStatus + 4;
 	else
 		return "";
@@ -2203,14 +2204,19 @@
 	p++;
 
 	if (*p == 0)
-	{
-		pqInternalNotice(&res->noticeHooks,
-						 "could not interpret result from server: %s",
-						 res->cmdStatus);
-		return "";
-	}
+		goto error;
 
-	return p;
+	/* check if we have an int */
+	for (c = p; isdigit((int) *c); ++c)
+		;
+	if (*c == 0)
+		return p;
+
+error:
+	pqInternalNotice(&res->noticeHooks,
+					 "could not interpret result from server: %s",
+					 res->cmdStatus);
+	return "";
 }
 
 /*
#2Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Volkan YAZICI (#1)
Re: number of loaded/unloaded COPY rows

Volkan YAZICI wrote:

I prepared a patch for "Have COPY return the number of rows
loaded/unloaded?" TODO. (Sorry for disturbing list with such a simple
topic, but per warning from Bruce Momjian, I send my proposal to -hackers
first.)

I used the "appending related information to commandTag" method which is
used for INSERT/UPDATE/DELETE/FETCH commands too. Furthermore, I edited
libpq to make PQcmdTuples() interpret affected rows from cmdStatus value
for COPY command. (Changes don't cause any compatibility problems for API
and seems like work with triggers too.)

One of the problems related with the used concept is trying to encapsulate
processed number of rows within an uint32 variable. This causes an internal
limit for counting COPY when we think it can process billions of rows. I
couldn't find a solution for this. (Maybe, two uint32 can be used to store
row count.) But other processed row counters (like INSERT/UPDATE) uses
uint32 too.

What's your suggestions and comments?

Good question. The libpq interface returns the count as a character
string using PQcmdTuples(), meaning libpq doesn't have any size
limitation. The problem is only the counter you are using, and I think
an int64 is the proper solution. If int64 isn't really 64-bits, the
code should still work though.

In fact we have this TODO, which is related:

* Change LIMIT/OFFSET and FETCH/MOVE to use int8

This requires the same type of change.

I have added this TODO:

* Allow the count returned by SELECT, etc to be to represent as an int64
to allow a higher range of values

This requires a change to es_processed, I think.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3Volkan YAZICI
yazicivo@ttnet.net.tr
In reply to: Bruce Momjian (#2)
2 attachment(s)
Re: number of loaded/unloaded COPY rows

On Dec 16 08:47, Bruce Momjian wrote:

I think an int64 is the proper solution. If int64 isn't really
64-bits, the code should still work though.

(I'd prefer uint64 instead of an int64.) In src/include/c.h, in
this or that way it'll assign a type for uint64, so there won't
be any problem for both 64-bit and non-64-bit architectures.

I've attached the updated patch. This one uses uint64 and
UINT64_FORMAT while printing uint64 value inside string.

I used char[20+1] as buffer size to store uint64 value's string
representation. (Because, AFAIK, maximum decimal digit length of
an [u]int64 equals to 2^64 - 1 = 20.) In this context, when I
looked at the example usages (for instance as in
backend/commands/sequence.c) it's stored in a char[100] buffer.
Maybe we should define a constant in pg_config.h like
INT64_PRINT_LEN. This will define a standard behaviour with
INT64_FORMAT for using integers inside strings.

For instance:
char buf[INT64_PRINT_LEN+1];
snprintf(buf, sizeof(buf), INT64_FORMAT, var);

In fact we have this TODO, which is related:

* Change LIMIT/OFFSET and FETCH/MOVE to use int8

This requires the same type of change.

I have added this TODO:

* Allow the count returned by SELECT, etc to be to represent
as an int64 to allow a higher range of values

This requires a change to es_processed, I think.

I think so. es_processed is defined as uint32. It should be
uint64 too.

I tried to prepare a patch for es_processed issue. But when I look
further in the code, found that there're lots of mixed usages of
"uint32" and "long" for row count related trackings. (Moreover,
as you can see from the patch, there's a problem with ULLONG_MAX
usage in there.)

I'm aware of the patch's out-of-usability, but I just tried to
underline some (IMHO) problems.

Last minute edit: Proposal: Maybe we can define a (./configure
controlled) type like pg_int (with bounds like PG_INT_MAX) to use
in counter related processes.

- * -

AFAIK, there're two ways to implement a counter:

1. Using integer types supplied by the compiler, like uint64 as we
discussed above.
Pros: Whole mathematical operations are handled by the compiler.
Cons: Implementation is bounded by the system architecture.

2. Using arrays to hold numeric values, like we did in the
implementation of SQL numeric types.
Pros: Value lengths bounded by available memory.
Cons: Mathematical operations have to be handled by software.
Therefore, this will cause a small overhead in performance
aspect compared to previous implementation.

I'm not sure if we can use the second implementation (in the
performance point of view) for the COPY command's counter. But IMHO
it can be agreeable for SELECT/INSERT/UPDATE/DELETE operations'
counters. OTOH, by using this way, we'll form a proper method for
counting without any (logical) bounds.

What's your opinion? If you aggree, I'll try to use the second
implementation for counters - except COPY.

Regards.

--
"We are the middle children of history, raised by television to believe
that someday we'll be millionaires and movie stars and rock stars, but
we won't. And we're just learning this fact," Tyler said. "So don't
fuck with us."

Attachments:

uint64.patchtext/plain; charset=us-asciiDownload
Index: src/backend/commands/portalcmds.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/portalcmds.c,v
retrieving revision 1.44
diff -u -r1.44 portalcmds.c
--- src/backend/commands/portalcmds.c	3 Nov 2005 17:11:35 -0000	1.44
+++ src/backend/commands/portalcmds.c	17 Dec 2005 11:44:29 -0000
@@ -395,7 +395,7 @@
 
 		if (!portal->atEnd)
 		{
-			long		store_pos;
+			uint64	store_pos;
 
 			if (portal->posOverflow)	/* oops, cannot trust portalPos */
 				ereport(ERROR,
Index: src/backend/tcop/pquery.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/pquery.c,v
retrieving revision 1.98
diff -u -r1.98 pquery.c
--- src/backend/tcop/pquery.c	22 Nov 2005 18:17:21 -0000	1.98
+++ src/backend/tcop/pquery.c	17 Dec 2005 11:44:31 -0000
@@ -179,6 +179,7 @@
 	if (completionTag)
 	{
 		Oid			lastOid;
+		char		buf[21];
 
 		switch (operation)
 		{
@@ -190,16 +191,22 @@
 					lastOid = queryDesc->estate->es_lastoid;
 				else
 					lastOid = InvalidOid;
+				snprintf(buf, sizeof(buf), UINT64_FORMAT,
+						 queryDesc->estate->es_processed);
 				snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
-				   "INSERT %u %u", lastOid, queryDesc->estate->es_processed);
+						 "INSERT %u %s", lastOid, buf);
 				break;
 			case CMD_UPDATE:
+				snprintf(buf, sizeof(buf), UINT64_FORMAT,
+						 queryDesc->estate->es_processed);
 				snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
-						 "UPDATE %u", queryDesc->estate->es_processed);
+						 "UPDATE %s", buf);
 				break;
 			case CMD_DELETE:
+				snprintf(buf, sizeof(buf), UINT64_FORMAT,
+						 queryDesc->estate->es_processed);
 				snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
-						 "DELETE %u", queryDesc->estate->es_processed);
+						 "DELETE %s", buf);
 				break;
 			default:
 				strcpy(completionTag, "???");
@@ -536,7 +543,7 @@
  * suspended due to exhaustion of the count parameter.
  */
 bool
-PortalRun(Portal portal, long count,
+PortalRun(Portal portal, uint64 count,
 		  DestReceiver *dest, DestReceiver *altdest,
 		  char *completionTag)
 {
@@ -736,15 +743,15 @@
  *
  * Returns number of rows processed (suitable for use in result tag)
  */
-static long
+static uint64
 PortalRunSelect(Portal portal,
 				bool forward,
-				long count,
+				uint64 count,
 				DestReceiver *dest)
 {
 	QueryDesc  *queryDesc;
 	ScanDirection direction;
-	uint32		nprocessed;
+	uint64		nprocessed;
 
 	/*
 	 * NB: queryDesc will be NULL if we are fetching from a held cursor or a
@@ -797,12 +804,11 @@
 
 		if (direction != NoMovementScanDirection)
 		{
-			long		oldPos;
+			uint64	oldPos;
 
 			if (nprocessed > 0)
 				portal->atStart = false;		/* OK to go backward now */
-			if (count == 0 ||
-				(unsigned long) nprocessed < (unsigned long) count)
+			if (count == 0 || nprocessed < count)
 				portal->atEnd = true;	/* we retrieved 'em all */
 			oldPos = portal->portalPos;
 			portal->portalPos += nprocessed;
@@ -844,8 +850,7 @@
 				portal->atEnd = false;	/* OK to go forward now */
 				portal->portalPos++;	/* adjust for endpoint case */
 			}
-			if (count == 0 ||
-				(unsigned long) nprocessed < (unsigned long) count)
+			if (count == 0 || nprocessed < count)
 			{
 				portal->atStart = true; /* we retrieved 'em all */
 				portal->portalPos = 0;
@@ -853,7 +858,7 @@
 			}
 			else
 			{
-				long		oldPos;
+				uint64	oldPos;
 
 				oldPos = portal->portalPos;
 				portal->portalPos -= nprocessed;
@@ -879,11 +884,11 @@
  * are run in the caller's memory context (since we have no estate).  Watch
  * out for memory leaks.
  */
-static uint32
-RunFromStore(Portal portal, ScanDirection direction, long count,
+static uint64
+RunFromStore(Portal portal, ScanDirection direction, uint64 count,
 			 DestReceiver *dest)
 {
-	long		current_tuple_count = 0;
+	uint64			current_tuple_count = 0;
 	TupleTableSlot *slot;
 
 	slot = MakeSingleTupleTableSlot(portal->tupDesc);
@@ -935,7 +940,7 @@
 
 	ExecDropSingleTupleTableSlot(slot);
 
-	return (uint32) current_tuple_count;
+	return current_tuple_count;
 }
 
 /*
@@ -1239,10 +1244,10 @@
  *
  * Returns number of rows processed (suitable for use in result tag)
  */
-static long
+static uint64
 DoPortalRunFetch(Portal portal,
 				 FetchDirection fdirection,
-				 long count,
+				 uint64 count,
 				 DestReceiver *dest)
 {
 	bool		forward;
@@ -1278,7 +1283,7 @@
 				 * we are.	In any case, we arrange to fetch the target row
 				 * going forwards.
 				 */
-				if (portal->posOverflow || portal->portalPos == LONG_MAX ||
+				if (portal->posOverflow || portal->portalPos == ULLONG_MAX ||
 					count - 1 <= portal->portalPos / 2)
 				{
 					DoPortalRewind(portal);
@@ -1288,7 +1293,7 @@
 				}
 				else
 				{
-					long		pos = portal->portalPos;
+					uint64	pos = portal->portalPos;
 
 					if (portal->atEnd)
 						pos++;	/* need one extra fetch if off end */
@@ -1400,7 +1405,7 @@
 	 */
 	if (!forward && count == FETCH_ALL && dest->mydest == DestNone)
 	{
-		long		result = portal->portalPos;
+		uint64	result = portal->portalPos;
 
 		if (result > 0 && !portal->atEnd)
 			result--;
Index: src/include/nodes/execnodes.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/nodes/execnodes.h,v
retrieving revision 1.146
diff -u -r1.146 execnodes.h
--- src/include/nodes/execnodes.h	2 Dec 2005 20:03:42 -0000	1.146
+++ src/include/nodes/execnodes.h	17 Dec 2005 11:44:40 -0000
@@ -318,7 +318,7 @@
 
 	TupleTable	es_tupleTable;	/* Array of TupleTableSlots */
 
-	uint32		es_processed;	/* # of tuples processed */
+	uint64		es_processed;	/* # of tuples processed */
 	Oid			es_lastoid;		/* last oid processed (by INSERT) */
 	List	   *es_rowMarks;	/* not good place, but there is no other */
 	bool		es_forUpdate;	/* true = FOR UPDATE, false = FOR SHARE */
Index: src/include/utils/portal.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/portal.h,v
retrieving revision 1.57
diff -u -r1.57 portal.h
--- src/include/utils/portal.h	15 Oct 2005 02:49:46 -0000	1.57
+++ src/include/utils/portal.h	17 Dec 2005 11:44:50 -0000
@@ -165,7 +165,7 @@
 	bool		atStart;
 	bool		atEnd;
 	bool		posOverflow;
-	long		portalPos;
+	uint64		portalPos;
 } PortalData;
 
 /*
copy_cmdstatus.patchtext/plain; charset=us-asciiDownload
Index: src/backend/commands/copy.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/commands/copy.c,v
retrieving revision 1.255
diff -u -r1.255 copy.c
--- src/backend/commands/copy.c	22 Nov 2005 18:17:08 -0000	1.255
+++ src/backend/commands/copy.c	17 Dec 2005 11:41:49 -0000
@@ -102,6 +102,7 @@
 	int			client_encoding;	/* remote side's character encoding */
 	bool		need_transcoding;		/* client encoding diff from server? */
 	bool		client_only_encoding;	/* encoding not valid on server? */
+	uint64		processed;				/* # of tuples processed */
 
 	/* parameters from the COPY command */
 	Relation	rel;			/* relation to copy to or from */
@@ -646,7 +647,7 @@
  * Do not allow the copy if user doesn't have proper permission to access
  * the table.
  */
-void
+uint64
 DoCopy(const CopyStmt *stmt)
 {
 	CopyState	cstate;
@@ -660,6 +661,7 @@
 	AclMode		required_access = (is_from ? ACL_INSERT : ACL_SELECT);
 	AclResult	aclresult;
 	ListCell   *option;
+	uint64		processed;
 
 	/* Allocate workspace and zero all fields */
 	cstate = (CopyStateData *) palloc0(sizeof(CopyStateData));
@@ -935,7 +937,7 @@
 	initStringInfo(&cstate->line_buf);
 	cstate->line_buf_converted = false;
 	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-	cstate->raw_buf_index = cstate->raw_buf_len = 0;
+	cstate->raw_buf_index = cstate->raw_buf_len = cstate->processed = 0;
 
 	/* Set up encoding conversion info */
 	cstate->client_encoding = pg_get_client_encoding();
@@ -1080,7 +1082,10 @@
 	pfree(cstate->attribute_buf.data);
 	pfree(cstate->line_buf.data);
 	pfree(cstate->raw_buf);
+
+	processed = cstate->processed;
 	pfree(cstate);
+	return processed;
 }
 
 
@@ -1310,6 +1315,8 @@
 								 VARSIZE(outputbytes) - VARHDRSZ);
 				}
 			}
+
+			cstate->processed++;
 		}
 
 		CopySendEndOfRow(cstate);
@@ -1916,6 +1923,8 @@
 
 			/* AFTER ROW INSERT Triggers */
 			ExecARInsertTriggers(estate, resultRelInfo, tuple);
+
+			cstate->processed++;
 		}
 	}
 
Index: src/backend/tcop/utility.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/tcop/utility.c,v
retrieving revision 1.250
diff -u -r1.250 utility.c
--- src/backend/tcop/utility.c	29 Nov 2005 01:25:49 -0000	1.250
+++ src/backend/tcop/utility.c	17 Dec 2005 11:41:51 -0000
@@ -640,7 +640,14 @@
 			break;
 
 		case T_CopyStmt:
-			DoCopy((CopyStmt *) parsetree);
+			{
+				uint64	processed = DoCopy((CopyStmt *) parsetree);
+				char	buf[21];
+
+				snprintf(buf, sizeof(buf), UINT64_FORMAT, processed);
+				snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
+						 "COPY %s", buf);
+			}
 			break;
 
 		case T_PrepareStmt:
Index: src/include/commands/copy.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/commands/copy.h,v
retrieving revision 1.25
diff -u -r1.25 copy.h
--- src/include/commands/copy.h	31 Dec 2004 22:03:28 -0000	1.25
+++ src/include/commands/copy.h	17 Dec 2005 11:41:51 -0000
@@ -16,7 +16,6 @@
 
 #include "nodes/parsenodes.h"
 
-
-extern void DoCopy(const CopyStmt *stmt);
+extern uint64 DoCopy(const CopyStmt *stmt);
 
 #endif   /* COPY_H */
Index: src/interfaces/libpq/fe-exec.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/interfaces/libpq/fe-exec.c,v
retrieving revision 1.177
diff -u -r1.177 fe-exec.c
--- src/interfaces/libpq/fe-exec.c	22 Nov 2005 18:17:32 -0000	1.177
+++ src/interfaces/libpq/fe-exec.c	17 Dec 2005 11:41:54 -0000
@@ -2177,7 +2177,7 @@
 char *
 PQcmdTuples(PGresult *res)
 {
-	char	   *p;
+	char	   *p, *c;
 
 	if (!res)
 		return "";
@@ -2195,7 +2195,8 @@
 		p = res->cmdStatus + 6;
 	else if (strncmp(res->cmdStatus, "FETCH ", 6) == 0)
 		p = res->cmdStatus + 5;
-	else if (strncmp(res->cmdStatus, "MOVE ", 5) == 0)
+	else if (strncmp(res->cmdStatus, "MOVE ", 5) == 0 ||
+			 strncmp(res->cmdStatus, "COPY ", 5) == 0)
 		p = res->cmdStatus + 4;
 	else
 		return "";
@@ -2203,14 +2204,19 @@
 	p++;
 
 	if (*p == 0)
-	{
-		pqInternalNotice(&res->noticeHooks,
-						 "could not interpret result from server: %s",
-						 res->cmdStatus);
-		return "";
-	}
-
-	return p;
+		goto interpret_error;
+	
+	/* check if we have an int */
+	for (c = p; isdigit((int) *c); ++c)
+		;
+	if (*c == 0)
+		return p;
+	
+interpret_error:
+	pqInternalNotice(&res->noticeHooks,
+					 "could not interpret result from server: %s",
+					 res->cmdStatus);
+	return "";
 }
 
 /*
#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Volkan YAZICI (#3)
Re: number of loaded/unloaded COPY rows

Volkan YAZICI wrote:

On Dec 16 08:47, Bruce Momjian wrote:

I think an int64 is the proper solution. If int64 isn't really
64-bits, the code should still work though.

(I'd prefer uint64 instead of an int64.) In src/include/c.h, in
this or that way it'll assign a type for uint64, so there won't
be any problem for both 64-bit and non-64-bit architectures.

Right.

I've attached the updated patch. This one uses uint64 and

You should use the patches email list instead for patches. What I
usually do is to post the discussion to hackers, and just send a second
email to patches with the patch, or put the patch at a URL and reference
the URL in the email message.

UINT64_FORMAT while printing uint64 value inside string.

Sounds good.

I used char[20+1] as buffer size to store uint64 value's string
representation. (Because, AFAIK, maximum decimal digit length of
an [u]int64 equals to 2^64 - 1 = 20.) In this context, when I
looked at the example usages (for instance as in
backend/commands/sequence.c) it's stored in a char[100] buffer.
Maybe we should define a constant in pg_config.h like
INT64_PRINT_LEN. This will define a standard behaviour with
INT64_FORMAT for using integers inside strings.

We could do that, but it might be overkill. I don't think we use a
#define for int32 to string either, but if you want to clean it up just
define the int32 and int64 defines in c.h and patch the code to use them
consistently. That might be nice.

For instance:
char buf[INT64_PRINT_LEN+1];
snprintf(buf, sizeof(buf), INT64_FORMAT, var);

Yep, that is nice. I am thinking you should submit your COPY patch
using a hard-coded constant, then submit a followup patch that fixes
this for int32 and int64 in all places.

In fact we have this TODO, which is related:

* Change LIMIT/OFFSET and FETCH/MOVE to use int8

This requires the same type of change.

I have added this TODO:

* Allow the count returned by SELECT, etc to be to represent
as an int64 to allow a higher range of values

This requires a change to es_processed, I think.

I think so. es_processed is defined as uint32. It should be
uint64 too.

Yep.

I tried to prepare a patch for es_processed issue. But when I look
further in the code, found that there're lots of mixed usages of
"uint32" and "long" for row count related trackings. (Moreover,
as you can see from the patch, there's a problem with ULLONG_MAX
usage in there.)

Agreed, it needs cleanup. We have tried to fix it as we worked on other
things, but we need a full review of what is happening. However, let's
do one thing in each patch so we can easily evaluate things.

I'm aware of the patch's out-of-usability, but I just tried to
underline some (IMHO) problems.

Last minute edit: Proposal: Maybe we can define a (./configure
controlled) type like pg_int (with bounds like PG_INT_MAX) to use
in counter related processes.

Seems we are best just defining a constant in c.h that is larger than
anything we might need on any platform, rather than run configure tests
for this.

AFAIK, there're two ways to implement a counter:

1. Using integer types supplied by the compiler, like uint64 as we
discussed above.
Pros: Whole mathematical operations are handled by the compiler.
Cons: Implementation is bounded by the system architecture.

I think we are at a point that people running on systems with no int64
support should not expect to get valid return values for >2 billion row
COPY operations. There is no way on their platform for them even to
work with those values, so why be concerned about platforms that do not
support it. Certainly converting everything to int64 from int32 isn't
going to make things any _worse_ for them than what they have now, and
it will fix the majority of our platforms. We are not getting lots of
complaints about our current behavior so I think we are OK just
improving the platforms that support int64.

2. Using arrays to hold numeric values, like we did in the
implementation of SQL numeric types.
Pros: Value lengths bounded by available memory.
Cons: Mathematical operations have to be handled by software.
Therefore, this will cause a small overhead in performance
aspect compared to previous implementation.

I'm not sure if we can use the second implementation (in the
performance point of view) for the COPY command's counter. But IMHO
it can be agreeable for SELECT/INSERT/UPDATE/DELETE operations'
counters. OTOH, by using this way, we'll form a proper method for
counting without any (logical) bounds.

What's your opinion? If you aggree, I'll try to use the second
implementation for counters - except COPY.

No, please use int64 at this point. I think the number of platforms
without int64 support is small, and dwindling, _and_ very few of those
non-int64-supporting platforms are doing operations of the size that
would hit this limit.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Volkan YAZICI (#3)
Re: number of loaded/unloaded COPY rows

Also, can I get a context diff for the patch? See the developers FAQ
for information on how our patch process works. Thanks.

---------------------------------------------------------------------------

Volkan YAZICI wrote:

On Dec 16 08:47, Bruce Momjian wrote:

I think an int64 is the proper solution. If int64 isn't really
64-bits, the code should still work though.

(I'd prefer uint64 instead of an int64.) In src/include/c.h, in
this or that way it'll assign a type for uint64, so there won't
be any problem for both 64-bit and non-64-bit architectures.

I've attached the updated patch. This one uses uint64 and
UINT64_FORMAT while printing uint64 value inside string.

I used char[20+1] as buffer size to store uint64 value's string
representation. (Because, AFAIK, maximum decimal digit length of
an [u]int64 equals to 2^64 - 1 = 20.) In this context, when I
looked at the example usages (for instance as in
backend/commands/sequence.c) it's stored in a char[100] buffer.
Maybe we should define a constant in pg_config.h like
INT64_PRINT_LEN. This will define a standard behaviour with
INT64_FORMAT for using integers inside strings.

For instance:
char buf[INT64_PRINT_LEN+1];
snprintf(buf, sizeof(buf), INT64_FORMAT, var);

In fact we have this TODO, which is related:

* Change LIMIT/OFFSET and FETCH/MOVE to use int8

This requires the same type of change.

I have added this TODO:

* Allow the count returned by SELECT, etc to be to represent
as an int64 to allow a higher range of values

This requires a change to es_processed, I think.

I think so. es_processed is defined as uint32. It should be
uint64 too.

I tried to prepare a patch for es_processed issue. But when I look
further in the code, found that there're lots of mixed usages of
"uint32" and "long" for row count related trackings. (Moreover,
as you can see from the patch, there's a problem with ULLONG_MAX
usage in there.)

I'm aware of the patch's out-of-usability, but I just tried to
underline some (IMHO) problems.

Last minute edit: Proposal: Maybe we can define a (./configure
controlled) type like pg_int (with bounds like PG_INT_MAX) to use
in counter related processes.

- * -

AFAIK, there're two ways to implement a counter:

1. Using integer types supplied by the compiler, like uint64 as we
discussed above.
Pros: Whole mathematical operations are handled by the compiler.
Cons: Implementation is bounded by the system architecture.

2. Using arrays to hold numeric values, like we did in the
implementation of SQL numeric types.
Pros: Value lengths bounded by available memory.
Cons: Mathematical operations have to be handled by software.
Therefore, this will cause a small overhead in performance
aspect compared to previous implementation.

I'm not sure if we can use the second implementation (in the
performance point of view) for the COPY command's counter. But IMHO
it can be agreeable for SELECT/INSERT/UPDATE/DELETE operations'
counters. OTOH, by using this way, we'll form a proper method for
counting without any (logical) bounds.

What's your opinion? If you aggree, I'll try to use the second
implementation for counters - except COPY.

Regards.

--
"We are the middle children of history, raised by television to believe
that someday we'll be millionaires and movie stars and rock stars, but
we won't. And we're just learning this fact," Tyler said. "So don't
fuck with us."

[ Attachment, skipping... ]

[ Attachment, skipping... ]

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#4)
Re: number of loaded/unloaded COPY rows

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I think we are at a point that people running on systems with no int64
support should not expect to get valid return values for >2 billion row
COPY operations.

I agree, there's no need to work harder on this than changing the
datatype to uint64.

There are some places (vacuum and index build IIRC) that use "double"
counters instead of either int32 or int64. That was a good compromise
a few years ago, but I'm not sure we still need it. I think we can
reasonably say that our goals for backward-compatibility to systems
with no int64 datatype do not include working nicely with tables
exceeding 4G rows.

I didn't like that the patch introduced new buffer variables that were
not there before --- that is just adding complexity to no point. This
patch should not need to do more than change some variables' datatypes
and adjust printf formats and string buffer lengths to match.

regards, tom lane