[PATCH] Performance Improvement For Copy From Binary Files

Started by Bharath Rupireddyover 5 years ago27 messages
#1Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
2 attachment(s)

Hi Hackers,

For Copy From Binary files, there exists below information for each
tuple/row.
1. field count(number of columns)
2. for every field, field size(column data length)
3. field data of field size(actual column data)

Currently, all the above data required at each step is read directly from
file using fread() and this happens for all the tuples/rows.

One observation is that in the total execution time of a copy from binary
file, the fread() call is taking upto 20% of time and the fread() function
call count is also too high.

For instance, with a dataset of size 5.3GB, 10million tuples with 10
columns,
total exec time in sec total time taken for fread() fread() function call
count
101.193 *21.33* 210000005
101.345 *21.436* 210000005

The total time taken for fread() and the corresponding function call count
may increase if we have more number of columns for instance 1000.

One solution to this problem is to read data from binary file in
RAW_BUF_SIZE(64KB) chunks to avoid repeatedly calling fread()(thus possibly
avoiding few disk IOs). This is similar to the approach followed for
csv/text files.

Attaching a patch, implementing the above solution for binary format files.

Below is the improvement gained.
total exec time in sec total time taken for fread() fread() function call
count
75.757 *2.73* 160884
75.351 *2.742* 160884

*Execution is 1.36X times faster, fread() time is reduced by 87%, fread()
call count is reduced by 99%.*

Request the community to take this patch for review if this approach and
improvement seem beneficial.

Any suggestions to improve further are most welcome.

Attached also is the config file used for testing the above use case.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

config.txttext/plain; charset=UTF-8; name=config.txtDownload
v1-0001-Performance-Improvement-For-Copy-From-Binary-File.patchapplication/octet-stream; name=v1-0001-Performance-Improvement-For-Copy-From-Binary-File.patchDownload
From 31b844ab11b6933421cd4ac2a56d6999b69a84ae Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Mon, 29 Jun 2020 09:08:02 +0530
Subject: [PATCH v1] Performance Improvement For Copy From Binary Files

Read data from binary file in RAW_BUF_SIZE bytes at a time for
COPY FROM command, instead of reading everytime from file for
field count, size and field data. This avoids fread calls
significantly and improves performance of the COPY FROM binary
files command.
---
 src/backend/commands/copy.c | 160 ++++++++++++++++++++++++++++--------
 1 file changed, 127 insertions(+), 33 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 6b1fd6d4cc..dd50643097 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -390,8 +390,6 @@ static int	CopyGetData(CopyState cstate, void *databuf,
 static void CopySendInt32(CopyState cstate, int32 val);
 static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
-
 
 /*
  * Send copy start/stop messages for frontend copies.  These have changed
@@ -760,24 +758,6 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
-
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
  *
@@ -3367,7 +3347,12 @@ BeginCopyFrom(ParseState *pstate,
 	initStringInfo(&cstate->attribute_buf);
 	initStringInfo(&cstate->line_buf);
 	cstate->line_buf_converted = false;
-	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+
+	if (cstate->binary)
+		cstate->raw_buf = (char *) palloc0(RAW_BUF_SIZE);
+	else
+		cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+
 	cstate->raw_buf_index = cstate->raw_buf_len = 0;
 
 	/* Assign range table, we'll need it in CopyFrom. */
@@ -3736,15 +3721,58 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 		/* binary */
 		int16		fld_count;
 		ListCell   *cur;
+		int readbytes = 0;
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (cstate->cur_lineno == 1)
 		{
-			/* EOF detected (end of file, or protocol-level EOF) */
-			return false;
+			/* This is for the first time, so read in buff size amount
+			 * of data from file.
+			 */
+			cstate->raw_buf_index = 0;
+
+			readbytes = CopyGetData(cstate, &cstate->raw_buf[0], 1, RAW_BUF_SIZE);
+
+			/* If true, then the file is empty, so return from here. */
+			if (cstate->reached_eof)
+				return false;
 		}
 
+		if (cstate->raw_buf_index + sizeof(fld_count) >= (RAW_BUF_SIZE - 1))
+		{
+			/* If the fld_count is not entirely within the current raw_buf's
+			 * data, so let's read next set of data from file.
+			 */
+			uint8 movebytes = 0;
+
+			/* Move bytes can either be 0, 1, or 2. */
+			movebytes = RAW_BUF_SIZE - cstate->raw_buf_index;
+
+			if (movebytes > 0)
+				memmove(&cstate->raw_buf[0], &cstate->raw_buf[cstate->raw_buf_index], movebytes);
+
+			readbytes = CopyGetData(cstate, &cstate->raw_buf[movebytes], 1, (RAW_BUF_SIZE - movebytes));
+
+			if (cstate->reached_eof)
+				ereport(ERROR,
+					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+					errmsg("unexpected EOF in COPY data")));
+
+			/* If readbytes are lesser than the requested bytes, then initialize the
+			 * remaining bytes in the raw_buf to 0. This will be useful for checking
+			 * error "received copy data after EOF marker".
+			 */
+			if (readbytes < (RAW_BUF_SIZE - movebytes))
+				MemSet(&cstate->raw_buf[readbytes], 0, (RAW_BUF_SIZE - movebytes));
+
+			cstate->raw_buf_index = 0;
+		}
+
+		memcpy(&fld_count, &cstate->raw_buf[cstate->raw_buf_index], sizeof(fld_count));
+
+		fld_count = (int16) pg_ntoh16(fld_count);
+
 		if (fld_count == -1)
 		{
 			/*
@@ -3759,10 +3787,8 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			 * error if there's data after the EOF marker, for consistency
 			 * with the new-protocol case.
 			 */
-			char		dummy;
-
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				cstate->raw_buf[cstate->raw_buf_index + sizeof(fld_count)] != 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -3775,6 +3801,8 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 					 errmsg("row field count is %d, expected %d",
 							(int) fld_count, attr_count)));
 
+		cstate->raw_buf_index = cstate->raw_buf_index + sizeof(fld_count);
+
 		foreach(cur, cstate->attnumlist)
 		{
 			int			attnum = lfirst_int(cur);
@@ -4716,16 +4744,55 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 {
 	int32		fld_size;
 	Datum		result;
+	int   		readbytes;
 
-	if (!CopyGetInt32(cstate, &fld_size))
-		ereport(ERROR,
+	if ((cstate->raw_buf_index + sizeof(fld_size)) >= (RAW_BUF_SIZE - 1))
+	{
+		/* If the fld_size is not entirely within the current raw_buf's
+		* data, so let's read next set of data from file.
+		*/
+		uint8 		movebytes = 0;
+
+		/* Move bytes can either be 0, 1, 2, 3 or 4. */
+		movebytes = RAW_BUF_SIZE - cstate->raw_buf_index;
+
+		if (movebytes > 0)
+			memmove(&cstate->raw_buf[0], &cstate->raw_buf[cstate->raw_buf_index], movebytes);
+
+		readbytes = CopyGetData(cstate, &cstate->raw_buf[movebytes], 1, (RAW_BUF_SIZE - movebytes));
+
+		if (cstate->reached_eof)
+			ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
-				 errmsg("unexpected EOF in COPY data")));
+				errmsg("unexpected EOF in COPY data")));
+
+		/* If readbytes are lesser than the requested bytes, then initialize the
+		* remaining bytes in the raw_buf to 0. This will be useful for checking
+		* error "received copy data after EOF marker".
+		*/
+		if (readbytes < (RAW_BUF_SIZE - movebytes))
+			MemSet(&cstate->raw_buf[readbytes], 0, (RAW_BUF_SIZE - movebytes));
+
+		cstate->raw_buf_index = 0;
+	}
+
+	memcpy(&fld_size, &cstate->raw_buf[cstate->raw_buf_index], sizeof(fld_size));
+
+	cstate->raw_buf_index = cstate->raw_buf_index + sizeof(fld_size);
+
+	fld_size = (int32) pg_ntoh32(fld_size);
+
 	if (fld_size == -1)
 	{
 		*isnull = true;
 		return ReceiveFunctionCall(flinfo, NULL, typioparam, typmod);
 	}
+
+	if (fld_size == 0)
+		ereport(ERROR,
+				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
+				 errmsg("unexpected EOF in COPY data")));
+
 	if (fld_size < 0)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -4735,12 +4802,39 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
-		ereport(ERROR,
+
+	if ((RAW_BUF_SIZE - cstate->raw_buf_index) >= fld_size)
+	{
+		/* The field/column lies entirely within the current
+		 * raw_buf data.
+		 */
+		memcpy(&cstate->attribute_buf.data[0], &cstate->raw_buf[cstate->raw_buf_index], fld_size);
+		cstate->raw_buf_index = cstate->raw_buf_index + fld_size;
+	}
+	else
+	{
+		/* The field/column does not lie entirely within the current
+		 * raw_buf data, so let's first take the available field data
+		 * from current raw_buf data and get the remaning field data
+		 * directly from file.
+		 */
+		int32 remainingbytesincurrdatablock = RAW_BUF_SIZE - cstate->raw_buf_index;
+
+		memcpy(&cstate->attribute_buf.data[0], &cstate->raw_buf[cstate->raw_buf_index], remainingbytesincurrdatablock);
+
+		if (CopyGetData(cstate, &cstate->attribute_buf.data[remainingbytesincurrdatablock],
+					(fld_size - remainingbytesincurrdatablock), (fld_size - remainingbytesincurrdatablock)) != (fld_size - remainingbytesincurrdatablock))
+			ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
 
+		/* After getting the complete field/column data, now refill the
+		 * raw_buf with the data read from the file.
+		 */
+		readbytes = CopyGetData(cstate, &cstate->raw_buf[0], 1, RAW_BUF_SIZE);
+		cstate->raw_buf_index = 0;
+	}
+
 	cstate->attribute_buf.len = fld_size;
 	cstate->attribute_buf.data[fld_size] = '\0';
 
-- 
2.25.1

#2Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#1)
Re: [PATCH] Performance Improvement For Copy From Binary Files

Hi,

Added this to commitfest incase this is useful -
https://commitfest.postgresql.org/28/

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

On Mon, Jun 29, 2020 at 10:50 AM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:

Show quoted text

Hi Hackers,

For Copy From Binary files, there exists below information for each
tuple/row.
1. field count(number of columns)
2. for every field, field size(column data length)
3. field data of field size(actual column data)

Currently, all the above data required at each step is read directly from
file using fread() and this happens for all the tuples/rows.

One observation is that in the total execution time of a copy from binary
file, the fread() call is taking upto 20% of time and the fread() function
call count is also too high.

For instance, with a dataset of size 5.3GB, 10million tuples with 10
columns,
total exec time in sec total time taken for fread() fread() function call
count
101.193 *21.33* 210000005
101.345 *21.436* 210000005

The total time taken for fread() and the corresponding function call count
may increase if we have more number of columns for instance 1000.

One solution to this problem is to read data from binary file in
RAW_BUF_SIZE(64KB) chunks to avoid repeatedly calling fread()(thus possibly
avoiding few disk IOs). This is similar to the approach followed for
csv/text files.

Attaching a patch, implementing the above solution for binary format files.

Below is the improvement gained.
total exec time in sec total time taken for fread() fread() function call
count
75.757 *2.73* 160884
75.351 *2.742* 160884

*Execution is 1.36X times faster, fread() time is reduced by 87%, fread()
call count is reduced by 99%.*

Request the community to take this patch for review if this approach and
improvement seem beneficial.

Any suggestions to improve further are most welcome.

Attached also is the config file used for testing the above use case.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#3Amit Langote
amitlangote09@gmail.com
In reply to: Bharath Rupireddy (#1)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files

Hi Bharath,

On Mon, Jun 29, 2020 at 2:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

For Copy From Binary files, there exists below information for each tuple/row.
1. field count(number of columns)
2. for every field, field size(column data length)
3. field data of field size(actual column data)

Currently, all the above data required at each step is read directly from file using fread() and this happens for all the tuples/rows.

One observation is that in the total execution time of a copy from binary file, the fread() call is taking upto 20% of time and the fread() function call count is also too high.

For instance, with a dataset of size 5.3GB, 10million tuples with 10 columns,
total exec time in sec total time taken for fread() fread() function call count
101.193 21.33 210000005
101.345 21.436 210000005

The total time taken for fread() and the corresponding function call count may increase if we have more number of columns for instance 1000.

One solution to this problem is to read data from binary file in RAW_BUF_SIZE(64KB) chunks to avoid repeatedly calling fread()(thus possibly avoiding few disk IOs). This is similar to the approach followed for csv/text files.

I agree that having the buffer in front of the file makes sense,
although we do now have an extra memcpy, that is, from raw_buf to
attribute_buf.data. Currently, fread() reads directly into
attribute_buf.data. But maybe that's okay as I don't see the new copy
being all that bad.

Attaching a patch, implementing the above solution for binary format files.

Below is the improvement gained.
total exec time in sec total time taken for fread() fread() function call count
75.757 2.73 160884
75.351 2.742 160884

Execution is 1.36X times faster, fread() time is reduced by 87%, fread() call count is reduced by 99%.

Request the community to take this patch for review if this approach and improvement seem beneficial.

Any suggestions to improve further are most welcome.

Noticed the following misbehaviors when trying to test the patch:

create table foo5 (a text, b text, c text, d text, e text);
insert into foo5 select repeat('a', (random()*100)::int), 'bbb', 'cc',
'd', 'eee' from generate_series(1, 10000000);
copy foo5 to '/tmp/foo5.bin' binary;
truncate foo5;
copy foo5 from '/tmp/foo5.bin' binary;
ERROR: unexpected EOF in COPY data
CONTEXT: COPY foo5, line 33, column a

create table bar (a numeric);
insert into bar select sqrt(a) from generate_series(1, 10000) a;
copy bar to '/tmp/bar.bin' binary;
copy bar from '/tmp/bar.bin' binary;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Trying to figure what was going wrong in each of these cases, I found
the new code a bit hard to navigate and debug :(. Here are a couple of
points that I think could have made things a bit easier:

* Avoid spreading the new buffering logic in multiple existing
functions, with similar pieces of code repeated in multiple places. I
would add a single new function that takes care of the various
buffering details and call it where CopyGetData() is being used
currently.

* You could've reused CopyLoadRawBuffer()'s functionality instead of
reimplementing it. I also see multiple instances of special case
handling, which often suggests that bugs are lurking.

Considering these points, I came up with the attached patch with a
much smaller footprint. As far as I can see, it implements the same
basic idea as your patch. With it, I was able to see an improvement
in loading time consistent with your numbers. I measured the time of
loading 10 million rows into tables with 5, 10, 20 text columns as
follows:

create table foo5 (a text, b text, c text, d text, e text);
insert into foo5 select repeat('a', (random()*100)::int), 'bbb', 'cc',
'd', 'eee' from generate_series(1, 10000000);
copy foo5 to '/tmp/foo5.bin' binary;
truncate foo5;
copy foo5 from '/tmp/foo5.bin' binary;

create table foo10 (a text, b text, c text, d text, e text, f text, g
text, h text, i text, j text);
insert into foo10 select repeat('a', (random()*100)::int), 'bbb',
'cc', 'd', 'eee', 'f', 'gg', 'hh', 'i', 'jjj' from generate_series(1,
10000000);
copy foo10 to '/tmp/foo10.bin' binary;
truncate foo10;
copy foo10 from '/tmp/foo10.bin' binary;

create table foo20 (a text, b text, c text, d text, e text, f numeric,
g text, h text, i text, j text, k text, l text, m text, n text, o
text, p text, q text, r text, s text, t text);
insert into foo20 select repeat('a', (random()*100)::int), 'bbb',
'cc', 'd', 'eee', '123.456', 'gg', 'hh', 'ii', 'jjjj', 'kkk', 'llll',
'mm', 'n', 'ooooo', 'pppp', 'q', 'rrrrr', 'ss', 'tttttttttttt' from
generate_series(1, 10000000);
copy foo20 to '/tmp/foo20.bin' binary;
truncate foo20;
copy foo20 from '/tmp/foo20.bin' binary;

The median times for the COPY FROM commands above, with and without
the patch, are as follows:

HEAD patched
foo5 8.5 6.5
foo10 14 10
foo20 25 18

A few more points to remember in the future:

* Commenting style:

+       /* If readbytes are lesser than the requested bytes, then initialize the
+       * remaining bytes in the raw_buf to 0. This will be useful for checking
+       * error "received copy data after EOF marker".
+       */

Multi-line comments are started like this:

/*
* <Start here>
*/

* As also mentioned above, it's a good idea in general to avoid having
special cases like these in the code:

+       if (cstate->cur_lineno == 1)
        {
-           /* EOF detected (end of file, or protocol-level EOF) */
-           return false;
+           /* This is for the first time, so read in buff size amount
+            * of data from file.
+            */

...

+
+           /* Move bytes can either be 0, 1, or 2. */
+           movebytes = RAW_BUF_SIZE - cstate->raw_buf_index;

...

+       uint8       movebytes = 0;
+
+       /* Move bytes can either be 0, 1, 2, 3 or 4. */
+       movebytes = RAW_BUF_SIZE - cstate->raw_buf_index;

* Please try to make variable names short if you can or follow the
guidelines around long names:

+ int32 remainingbytesincurrdatablock = RAW_BUF_SIZE -
cstate->raw_buf_index;

Maybe, remaining_bytes would've sufficed here, because "in the current
data block" might be clear to most readers by looking at the
surrounding code.

* The above point also helps avoid long code lines that don't fit
within 78 characters, like these:

+       memcpy(&cstate->attribute_buf.data[0],
&cstate->raw_buf[cstate->raw_buf_index],
remainingbytesincurrdatablock);
+
+       if (CopyGetData(cstate,
&cstate->attribute_buf.data[remainingbytesincurrdatablock],
+                   (fld_size - remainingbytesincurrdatablock),
(fld_size - remainingbytesincurrdatablock)) != (fld_size -
remainingbytesincurrdatablock))

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

CopyFrom-binary-buffering.patchapplication/octet-stream; name=CopyFrom-binary-buffering.patchDownload
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3e199bd..c5be17e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -367,6 +367,7 @@ static bool CopyReadLine(CopyState cstate);
 static bool CopyReadLineText(CopyState cstate);
 static int	CopyReadAttributesText(CopyState cstate);
 static int	CopyReadAttributesCSV(CopyState cstate);
+static int	CopyReadFromRawBuf(CopyState cstate, void *saveTo, int nbytes);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
@@ -739,7 +740,7 @@ CopyGetInt32(CopyState cstate, int32 *val)
 {
 	uint32		buf;
 
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
+	if (CopyReadFromRawBuf(cstate, &buf, sizeof(buf)) != sizeof(buf))
 	{
 		*val = 0;				/* suppress compiler warning */
 		return false;
@@ -768,7 +769,7 @@ CopyGetInt16(CopyState cstate, int16 *val)
 {
 	uint16		buf;
 
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
+	if (CopyReadFromRawBuf(cstate, &buf, sizeof(buf)) != sizeof(buf))
 	{
 		*val = 0;				/* suppress compiler warning */
 		return false;
@@ -813,6 +814,62 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadFromRawBuf
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes then to 'saveTo'
+ *
+ * Useful when reading binary data from the file.
+ */
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+#define BUF_BYTES	(cstate->raw_buf_len - cstate->raw_buf_index)
+
+static int
+CopyReadFromRawBuf(CopyState cstate, void *saveTo, int nbytes)
+{
+	char   *dest = saveTo;
+	int		copied_bytes = 0;
+
+	if (BUF_BYTES >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (BUF_BYTES == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, BUF_BYTES);
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+	return copied_bytes;
+}
+#undef DRAIN_COPY_RAW_BUF
+#undef BUF_BYTES
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -4731,8 +4788,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadFromRawBuf(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
#4Amit Langote
amitlangote09@gmail.com
In reply to: Amit Langote (#3)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Tue, Jul 7, 2020 at 4:28 PM Amit Langote <amitlangote09@gmail.com> wrote:

The median times for the COPY FROM commands above, with and without
the patch, are as follows:

HEAD patched
foo5 8.5 6.5
foo10 14 10
foo20 25 18

Sorry, I forgot to mention that these times are in seconds.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#5Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Langote (#3)
Re: [PATCH] Performance Improvement For Copy From Binary Files

Considering these points, I came up with the attached patch with a
much smaller footprint. As far as I can see, it implements the same
basic idea as your patch. With it, I was able to see an improvement
in loading time consistent with your numbers. I measured the time of
loading 10 million rows into tables with 5, 10, 20 text columns as
follows:

Thanks Amit for buying into the idea. I agree that your patch looks
clean and simple compared to mine and I'm okay with your patch.

I reviewed and tested your patch, below are few comments:

I think we can remove(and delete the function from the code) the
CopyGetInt16() and have the code directly to save the function call
cost. It gets called for each attribute/column for each row/tuple to
just call CopyReadFromRawBuf() and set the byte order. From a
readability perspective it's okay to have this function, but cost wise
I feel no need for that function at all. In one of our earlier
work(parallel copy), we observed that having a new function or few
extra statements in this copy from path which gets hit for each row,
incurs noticeable execution cost.

The same way, we can also avoid using CopyGetInt32() function call in
CopyReadBinaryAttribute() for the same reason stated above.

In CopyReadFromRawBuf(), can the "saveTo" parameter be named "dest"
and use that with (char *) typecast directly, instead of having a
local variable? Though it may/may not be a standard practice, let's
have the parameter name all lower case to keep it consistent with
other function's parameters in the copy.c file.

Seems like there's a bug in below part of the code. Use case is
simple, have some junk value at the end of the binary file, then with
your patch the query succeeds, but it should report the below error.
Here, on fld_count == -1 instead of reading from file, we must be
reading it from the buffer, as we would have already read all the data
from the file into the buffer.
if (cstate->copy_dest != COPY_OLD_FE &&
CopyGetData(cstate, &dummy, 1, 1) > 0)
ereport(ERROR,
(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
errmsg("received copy data after EOF marker")));

I also tried with some intentionally corrupted binary datasets, (apart
from above issue) patch works fine.

For the case where required nbytes may not fit into the buffer in
CopyReadFromRawBuf, I'm sure this can happen only for field data,
(field count , and field size are of fixed length and can fit in the
buffer), instead of reading them in parts of buff size into the buffer
(using CopyLoadRawBuf) and then DRAIN_COPY_RAW_BUF() to the
destination, I think we can detect this condition using requested
bytes and the buffer size and directly read from the file to the
destination buffer and then reload the raw_buffer for further
processing. I think this way, it will be good.

I have few synthesized test cases where fields can be of larger size.
I executed them on your patch, but didn't debug to see whether
actually we hit the code where required nbytes can't fit in the entire
buffer. I will try this on the next version of the patch.

HEAD patched
foo5 8.5 6.5
foo10 14 10
foo20 25 18

Numbers might improve a bit, if we remove the extra function calls as
stated above.

Overall, thanks for your suggestions in the previous mail, my patch
was prepared in a bit hurried manner, anyways, will take care next
time.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#6Amit Langote
amitlangote09@gmail.com
In reply to: Bharath Rupireddy (#5)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files

Hi Bharath,

On Thu, Jul 9, 2020 at 7:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks Amit for buying into the idea. I agree that your patch looks
clean and simple compared to mine and I'm okay with your patch.

I reviewed and tested your patch, below are few comments:

Thanks for checking it out.

I think we can remove(and delete the function from the code) the
CopyGetInt16() and have the code directly to save the function call
cost. It gets called for each attribute/column for each row/tuple to
just call CopyReadFromRawBuf() and set the byte order. From a
readability perspective it's okay to have this function, but cost wise
I feel no need for that function at all. In one of our earlier
work(parallel copy), we observed that having a new function or few
extra statements in this copy from path which gets hit for each row,
incurs noticeable execution cost.

The same way, we can also avoid using CopyGetInt32() function call in
CopyReadBinaryAttribute() for the same reason stated above.

I agree that removing the function call overhead in this case is worth
the slight loss of readability.

In CopyReadFromRawBuf(), can the "saveTo" parameter be named "dest"
and use that with (char *) typecast directly, instead of having a
local variable? Though it may/may not be a standard practice, let's
have the parameter name all lower case to keep it consistent with
other function's parameters in the copy.c file.

Agreed.

Seems like there's a bug in below part of the code. Use case is
simple, have some junk value at the end of the binary file, then with
your patch the query succeeds, but it should report the below error.
Here, on fld_count == -1 instead of reading from file, we must be
reading it from the buffer, as we would have already read all the data
from the file into the buffer.
if (cstate->copy_dest != COPY_OLD_FE &&
CopyGetData(cstate, &dummy, 1, 1) > 0)
ereport(ERROR,
(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
errmsg("received copy data after EOF marker")));

I also tried with some intentionally corrupted binary datasets, (apart
from above issue) patch works fine.

Yeah, I see the bug. I should've checked all the call sites of
CopyGetData() and made sure there is only one left, that is,
CopyLoadRawBuffer().

For the case where required nbytes may not fit into the buffer in
CopyReadFromRawBuf, I'm sure this can happen only for field data,
(field count , and field size are of fixed length and can fit in the
buffer), instead of reading them in parts of buff size into the buffer
(using CopyLoadRawBuf) and then DRAIN_COPY_RAW_BUF() to the
destination, I think we can detect this condition using requested
bytes and the buffer size and directly read from the file to the
destination buffer and then reload the raw_buffer for further
processing. I think this way, it will be good.

Hmm, I'm afraid that this will make the code more complex for
apparently small benefit. Is this really that much of a problem
performance wise?

I have few synthesized test cases where fields can be of larger size.
I executed them on your patch, but didn't debug to see whether
actually we hit the code where required nbytes can't fit in the entire
buffer. I will try this on the next version of the patch.

HEAD patched
foo5 8.5 6.5
foo10 14 10
foo20 25 18

Numbers might improve a bit, if we remove the extra function calls as
stated above.

Here the numbers with the updated patch:

HEAD patched (v2)
foo5 8.5 6.1
foo10 14 9.4
foo20 25 16.7

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

CopyFrom-binary-buffering_v2.patchapplication/octet-stream; name=CopyFrom-binary-buffering_v2.patchDownload
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 3e199bd..8a81bdd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -367,6 +367,7 @@ static bool CopyReadLine(CopyState cstate);
 static bool CopyReadLineText(CopyState cstate);
 static int	CopyReadAttributesText(CopyState cstate);
 static int	CopyReadAttributesCSV(CopyState cstate);
+static int	CopyReadFromRawBuf(CopyState cstate, char *dest, int nbytes);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
@@ -388,9 +389,7 @@ static void CopySendEndOfRow(CopyState cstate);
 static int	CopyGetData(CopyState cstate, void *databuf,
 						int minread, int maxread);
 static void CopySendInt32(CopyState cstate, int32 val);
-static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
 
 
 /*
@@ -730,25 +729,6 @@ CopySendInt32(CopyState cstate, int32 val)
 }
 
 /*
- * CopyGetInt32 reads an int32 that appears in network byte order
- *
- * Returns true if OK, false if EOF
- */
-static bool
-CopyGetInt32(CopyState cstate, int32 *val)
-{
-	uint32		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int32) pg_ntoh32(buf);
-	return true;
-}
-
-/*
  * CopySendInt16 sends an int16 in network byte order
  */
 static void
@@ -760,23 +740,6 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
 
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
@@ -813,6 +776,61 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadFromRawBuf
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes then to 'saveTo'
+ *
+ * Useful when reading binary data from the file.
+ */
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+#define BUF_BYTES	(cstate->raw_buf_len - cstate->raw_buf_index)
+
+static int
+CopyReadFromRawBuf(CopyState cstate, char *dest, int nbytes)
+{
+	int		copied_bytes = 0;
+
+	if (BUF_BYTES >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (BUF_BYTES == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, BUF_BYTES);
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+	return copied_bytes;
+}
+#undef DRAIN_COPY_RAW_BUF
+#undef BUF_BYTES
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -3514,16 +3532,18 @@ BeginCopyFrom(ParseState *pstate,
 		int32		tmp;
 
 		/* Signature */
-		if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
+		if (CopyReadFromRawBuf(cstate, readSig, 11) != 11 ||
 			memcmp(readSig, BinarySignature, 11) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("COPY file signature not recognized")));
 		/* Flags field */
-		if (!CopyGetInt32(cstate, &tmp))
+		if (CopyReadFromRawBuf(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp))
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing flags)")));
+		tmp = (int32) pg_ntoh32(tmp);
 		if ((tmp & (1 << 16)) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -3534,15 +3554,15 @@ BeginCopyFrom(ParseState *pstate,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("unrecognized critical flags in COPY file header")));
 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadFromRawBuf(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing length)")));
 		/* Skip extension header, if present */
 		while (tmp-- > 0)
 		{
-			if (CopyGetData(cstate, readSig, 1, 1) != 1)
+			if (CopyReadFromRawBuf(cstate, readSig, 1) != 1)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("invalid COPY file header (wrong length)")));
@@ -3735,12 +3755,14 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (CopyReadFromRawBuf(cstate, (char *) &fld_count,
+							   sizeof(fld_count)) != sizeof(fld_count))
 		{
 			/* EOF detected (end of file, or protocol-level EOF) */
 			return false;
 		}
 
+		fld_count = (int16) pg_ntoh16(fld_count);
 		if (fld_count == -1)
 		{
 			/*
@@ -3758,7 +3780,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			char		dummy;
 
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				CopyReadFromRawBuf(cstate, &dummy, 1) > 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -4713,10 +4735,12 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	int32		fld_size;
 	Datum		result;
 
-	if (!CopyGetInt32(cstate, &fld_size))
+	if (CopyReadFromRawBuf(cstate, (char *) &fld_size, sizeof(fld_size)) !=
+		sizeof(fld_size))
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
+	fld_size = (int32) pg_ntoh32(fld_size);
 	if (fld_size == -1)
 	{
 		*isnull = true;
@@ -4731,8 +4755,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadFromRawBuf(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
#7vignesh C
vignesh21@gmail.com
In reply to: Amit Langote (#6)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Fri, Jul 10, 2020 at 8:51 AM Amit Langote <amitlangote09@gmail.com> wrote:

Hi Bharath,
Here the numbers with the updated patch:

HEAD patched (v2)
foo5 8.5 6.1
foo10 14 9.4
foo20 25 16.7

Patch applies cleanly, make check & make check-world passes.
I had reviewed the changes. I felt one minor change required:
+ * CopyReadFromRawBuf
+ *             Reads 'nbytes' bytes from cstate->copy_file via
cstate->raw_buf and
+ *             writes then to 'saveTo'
+ *
+ * Useful when reading binary data from the file.
Should "writes then to 'saveTo'" be "writes them to 'dest'"?

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

#8Thomas Munro
thomas.munro@gmail.com
In reply to: vignesh C (#7)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Mon, Jul 13, 2020 at 1:13 AM vignesh C <vignesh21@gmail.com> wrote:

On Fri, Jul 10, 2020 at 8:51 AM Amit Langote <amitlangote09@gmail.com> wrote:

Here the numbers with the updated patch:

HEAD patched (v2)
foo5 8.5 6.1
foo10 14 9.4
foo20 25 16.7

Patch applies cleanly, make check & make check-world passes.

This error showed up when cfbot tried it:

COPY BINARY stud_emp FROM
'/home/travis/build/postgresql-cfbot/postgresql/src/test/regress/results/stud_emp.data';
+ERROR: could not read from COPY file: Bad address

#9Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Thomas Munro (#8)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files

Thanks Thomas for checking this feature.

On Mon, Jul 13, 2020 at 4:06 AM Thomas Munro <thomas.munro@gmail.com> wrote:

This error showed up when cfbot tried it:

COPY BINARY stud_emp FROM
'/home/travis/build/postgresql-cfbot/postgresql/src/test/regress/results/stud_emp.data';
+ERROR: could not read from COPY file: Bad address

This is due to the recent commit
cd22d3cdb9bd9963c694c01a8c0232bbae3ddcfb, in which we restricted the
raw_buf and line_buf allocations for binary files. Since we are using
raw_buf for this performance improvement feature, now, it's enough to
restrict only line_buf for binary files. I made the changes
accordingly in the v3 patch attached here.

Regression tests(make check & make check-world) ran cleanly with the v3 patch.

Please also find my responses for:

Vignesh's comment:

On Sun, Jul 12, 2020 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
I had reviewed the changes. I felt one minor change required:
+ * CopyReadFromRawBuf
+ *             Reads 'nbytes' bytes from cstate->copy_file via
cstate->raw_buf and
+ *             writes then to 'saveTo'
+ *
+ * Useful when reading binary data from the file.
Should "writes then to 'saveTo'" be "writes them to 'dest'"?

Thanks Vignesh for reviewing the patch. Modified 'saveTo' to 'dest' in v3 patch.

Amit's comment:

For the case where required nbytes may not fit into the buffer in
CopyReadFromRawBuf, I'm sure this can happen only for field data,
(field count , and field size are of fixed length and can fit in the
buffer), instead of reading them in parts of buff size into the buffer
(using CopyLoadRawBuf) and then DRAIN_COPY_RAW_BUF() to the
destination, I think we can detect this condition using requested
bytes and the buffer size and directly read from the file to the
destination buffer and then reload the raw_buffer for further
processing. I think this way, it will be good.

Hmm, I'm afraid that this will make the code more complex for
apparently small benefit. Is this really that much of a problem
performance wise?

Yes it makes CopyReadFromRawBuf(), code a bit complex from a
readability perspective. I'm convinced not to have the
abovementioned(by me) change, due to 3 reasons,1) the
readability/understandability 2) how many use cases can we have where
requested field size greater than RAW_BUF_SIZE(64KB)? I think very few
cases. I may be wrong here. 3) Performance wise it may not be much as
we do one extra memcpy only in situations where field sizes are
greater than 64KB(as we have already seen and observed by you as well
in one of the response [1]) that memcpy cost for this case may be
negligible.

Considering all of above, I'm okay to have CopyReadFromRawBuf()
function, the way it is currently.

[1]:

One solution to this problem is to read data from binary file in RAW_BUF_SIZE(64KB) chunks to avoid repeatedly calling fread()(thus possibly avoiding few disk IOs). This is similar to the approach followed for csv/text files.

I agree that having the buffer in front of the file makes sense,
although we do now have an extra memcpy, that is, from raw_buf to
attribute_buf.data. Currently, fread() reads directly into
attribute_buf.data. But maybe that's okay as I don't see the new copy
being all that bad.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

CopyFrom-binary-buffering_v3.patchapplication/octet-stream; name=CopyFrom-binary-buffering_v3.patchDownload
From aaae714e3a9c30dffd8bf51fa3b3039c0ba7ab70 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Mon, 13 Jul 2020 06:33:56 +0530
Subject: [PATCH v3] Performance Improvement For Copy From Binary Files

Read data from binary file in RAW_BUF_SIZE bytes at a time for
COPY FROM command, instead of reading everytime from file for
field count, size and field data. This avoids fread calls
significantly and improves performance of the COPY FROM binary
files command.
---
 src/backend/commands/copy.c | 132 +++++++++++++++++++++---------------
 1 file changed, 78 insertions(+), 54 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 99d1457180..2e34d1f84a 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -195,7 +195,7 @@ typedef struct CopyStateData
 	 * the buffer on each cycle.
 	 *
 	 * (In binary COPY FROM, attribute_buf holds the binary data for the
-	 * current field, while the other variables are not used.)
+	 * current field and raw_buf holds the raw data read from file.)
 	 */
 	StringInfoData attribute_buf;
 
@@ -370,6 +370,7 @@ static bool CopyReadLine(CopyState cstate);
 static bool CopyReadLineText(CopyState cstate);
 static int	CopyReadAttributesText(CopyState cstate);
 static int	CopyReadAttributesCSV(CopyState cstate);
+static int	CopyReadFromRawBuf(CopyState cstate, char *dest, int nbytes);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
@@ -391,9 +392,7 @@ static void CopySendEndOfRow(CopyState cstate);
 static int	CopyGetData(CopyState cstate, void *databuf,
 						int minread, int maxread);
 static void CopySendInt32(CopyState cstate, int32 val);
-static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
 
 
 /*
@@ -732,25 +731,6 @@ CopySendInt32(CopyState cstate, int32 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt32 reads an int32 that appears in network byte order
- *
- * Returns true if OK, false if EOF
- */
-static bool
-CopyGetInt32(CopyState cstate, int32 *val)
-{
-	uint32		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int32) pg_ntoh32(buf);
-	return true;
-}
-
 /*
  * CopySendInt16 sends an int16 in network byte order
  */
@@ -763,23 +743,6 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
 
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
@@ -816,6 +779,61 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadFromRawBuf
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes then to 'dest'
+ *
+ * Useful when reading binary data from the file.
+ */
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+#define BUF_BYTES	(cstate->raw_buf_len - cstate->raw_buf_index)
+
+static int
+CopyReadFromRawBuf(CopyState cstate, char *dest, int nbytes)
+{
+	int		copied_bytes = 0;
+
+	if (BUF_BYTES >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (BUF_BYTES == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, BUF_BYTES);
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+	return copied_bytes;
+}
+#undef DRAIN_COPY_RAW_BUF
+#undef BUF_BYTES
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -3363,17 +3381,17 @@ BeginCopyFrom(ParseState *pstate,
 	cstate->cur_attval = NULL;
 
 	/*
-	 * Set up variables to avoid per-attribute overhead.  attribute_buf is
-	 * used in both text and binary modes, but we use line_buf and raw_buf
-	 * only in text mode.
+	 * Set up variables to avoid per-attribute overhead. line_buf
+	 * is not used in binary mode.
 	 */
 	initStringInfo(&cstate->attribute_buf);
+	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+	cstate->raw_buf_index = cstate->raw_buf_len = 0;
+
 	if (!cstate->binary)
 	{
 		initStringInfo(&cstate->line_buf);
 		cstate->line_buf_converted = false;
-		cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-		cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	}
 
 	/* Assign range table, we'll need it in CopyFrom. */
@@ -3524,16 +3542,18 @@ BeginCopyFrom(ParseState *pstate,
 		int32		tmp;
 
 		/* Signature */
-		if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
+		if (CopyReadFromRawBuf(cstate, readSig, 11) != 11 ||
 			memcmp(readSig, BinarySignature, 11) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("COPY file signature not recognized")));
 		/* Flags field */
-		if (!CopyGetInt32(cstate, &tmp))
+		if (CopyReadFromRawBuf(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp))
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing flags)")));
+		tmp = (int32) pg_ntoh32(tmp);
 		if ((tmp & (1 << 16)) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -3544,15 +3564,15 @@ BeginCopyFrom(ParseState *pstate,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("unrecognized critical flags in COPY file header")));
 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadFromRawBuf(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing length)")));
 		/* Skip extension header, if present */
 		while (tmp-- > 0)
 		{
-			if (CopyGetData(cstate, readSig, 1, 1) != 1)
+			if (CopyReadFromRawBuf(cstate, readSig, 1) != 1)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("invalid COPY file header (wrong length)")));
@@ -3745,12 +3765,14 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (CopyReadFromRawBuf(cstate, (char *) &fld_count,
+							   sizeof(fld_count)) != sizeof(fld_count))
 		{
 			/* EOF detected (end of file, or protocol-level EOF) */
 			return false;
 		}
 
+		fld_count = (int16) pg_ntoh16(fld_count);
 		if (fld_count == -1)
 		{
 			/*
@@ -3768,7 +3790,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			char		dummy;
 
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				CopyReadFromRawBuf(cstate, &dummy, 1) > 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -4723,10 +4745,12 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	int32		fld_size;
 	Datum		result;
 
-	if (!CopyGetInt32(cstate, &fld_size))
+	if (CopyReadFromRawBuf(cstate, (char *) &fld_size, sizeof(fld_size)) !=
+		sizeof(fld_size))
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
+	fld_size = (int32) pg_ntoh32(fld_size);
 	if (fld_size == -1)
 	{
 		*isnull = true;
@@ -4741,8 +4765,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadFromRawBuf(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
-- 
2.25.1

#10Amit Langote
amitlangote09@gmail.com
In reply to: Bharath Rupireddy (#9)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Mon, Jul 13, 2020 at 10:19 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Mon, Jul 13, 2020 at 4:06 AM Thomas Munro <thomas.munro@gmail.com> wrote:

This error showed up when cfbot tried it:

COPY BINARY stud_emp FROM
'/home/travis/build/postgresql-cfbot/postgresql/src/test/regress/results/stud_emp.data';
+ERROR: could not read from COPY file: Bad address

This is due to the recent commit
cd22d3cdb9bd9963c694c01a8c0232bbae3ddcfb, in which we restricted the
raw_buf and line_buf allocations for binary files. Since we are using
raw_buf for this performance improvement feature, now, it's enough to
restrict only line_buf for binary files. I made the changes
accordingly in the v3 patch attached here.

Regression tests(make check & make check-world) ran cleanly with the v3 patch.

Thank you Bharath. I was a bit surprised that you had also submitted
a patch to NOT allocate raw_buf for COPY FROM ... BINARY. :-)

Please also find my responses for:

Vignesh's comment:

On Sun, Jul 12, 2020 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
I had reviewed the changes. I felt one minor change required:
+ * CopyReadFromRawBuf
+ *             Reads 'nbytes' bytes from cstate->copy_file via
cstate->raw_buf and
+ *             writes then to 'saveTo'
+ *
+ * Useful when reading binary data from the file.
Should "writes then to 'saveTo'" be "writes them to 'dest'"?

Thanks Vignesh for reviewing the patch. Modified 'saveTo' to 'dest' in v3 patch.

My bad.

Amit's comment:

For the case where required nbytes may not fit into the buffer in
CopyReadFromRawBuf, I'm sure this can happen only for field data,
(field count , and field size are of fixed length and can fit in the
buffer), instead of reading them in parts of buff size into the buffer
(using CopyLoadRawBuf) and then DRAIN_COPY_RAW_BUF() to the
destination, I think we can detect this condition using requested
bytes and the buffer size and directly read from the file to the
destination buffer and then reload the raw_buffer for further
processing. I think this way, it will be good.

Hmm, I'm afraid that this will make the code more complex for
apparently small benefit. Is this really that much of a problem
performance wise?

Yes it makes CopyReadFromRawBuf(), code a bit complex from a
readability perspective. I'm convinced not to have the
abovementioned(by me) change, due to 3 reasons,1) the
readability/understandability 2) how many use cases can we have where
requested field size greater than RAW_BUF_SIZE(64KB)? I think very few
cases. I may be wrong here. 3) Performance wise it may not be much as
we do one extra memcpy only in situations where field sizes are
greater than 64KB(as we have already seen and observed by you as well
in one of the response [1]) that memcpy cost for this case may be
negligible.

Actually, an extra memcpy is incurred on every call of
CopyReadFromRawBuf(), but I haven't seen it to be very problematic.

By the way, considering the rebase over cd22d3cdb9b, it seemed to me
that we needed to update the comments in CopyStateData struct
definition a bit more. While doing that, I realized
CopyReadFromRawBuf as a name for the new function might be misleading
as long as we are only using it for binary data. Maybe
CopyReadBinaryData is more appropriate? See attached v4 with these
and a few other cosmetic changes.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

CopyFrom-binary-buffering_v4.patchapplication/octet-stream; name=CopyFrom-binary-buffering_v4.patchDownload
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 99d1457..ae9dada 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -195,7 +195,8 @@ typedef struct CopyStateData
 	 * the buffer on each cycle.
 	 *
 	 * (In binary COPY FROM, attribute_buf holds the binary data for the
-	 * current field, while the other variables are not used.)
+	 * current field copied from raw_buf, which in turn holds raw data read
+	 * from the file.)
 	 */
 	StringInfoData attribute_buf;
 
@@ -219,8 +220,9 @@ typedef struct CopyStateData
 	 * Finally, raw_buf holds raw data read from the data source (file or
 	 * client connection).  CopyReadLine parses this data sufficiently to
 	 * locate line boundaries, then transfers the data to line_buf and
-	 * converts it.  Note: we guarantee that there is a \0 at
-	 * raw_buf[raw_buf_len].
+	 * converts it.  Also, CopyReadBinaryData() slices the data into
+	 * attributes based on the caller-specified field length in bytes.
+	 * Note: we guarantee that there is a \0 at raw_buf[raw_buf_len].
 	 */
 #define RAW_BUF_SIZE 65536		/* we palloc RAW_BUF_SIZE+1 bytes */
 	char	   *raw_buf;
@@ -373,6 +375,7 @@ static int	CopyReadAttributesCSV(CopyState cstate);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
+static int	CopyReadBinaryData(CopyState cstate, char *dest, int nbytes);
 static void CopyAttributeOutText(CopyState cstate, char *string);
 static void CopyAttributeOutCSV(CopyState cstate, char *string,
 								bool use_quote, bool single_attr);
@@ -391,9 +394,7 @@ static void CopySendEndOfRow(CopyState cstate);
 static int	CopyGetData(CopyState cstate, void *databuf,
 						int minread, int maxread);
 static void CopySendInt32(CopyState cstate, int32 val);
-static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
 
 
 /*
@@ -733,25 +734,6 @@ CopySendInt32(CopyState cstate, int32 val)
 }
 
 /*
- * CopyGetInt32 reads an int32 that appears in network byte order
- *
- * Returns true if OK, false if EOF
- */
-static bool
-CopyGetInt32(CopyState cstate, int32 *val)
-{
-	uint32		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int32) pg_ntoh32(buf);
-	return true;
-}
-
-/*
  * CopySendInt16 sends an int16 in network byte order
  */
 static void
@@ -763,23 +745,6 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
 
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
@@ -816,6 +781,59 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadBinaryData
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes them to 'dest'
+ */
+static int
+CopyReadBinaryData(CopyState cstate, char *dest, int nbytes)
+{
+	int		copied_bytes = 0;
+
+#define BUF_BYTES	(cstate->raw_buf_len - cstate->raw_buf_index)
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+	if (BUF_BYTES >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (BUF_BYTES == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, BUF_BYTES);
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+#undef DRAIN_COPY_RAW_BUF
+#undef BUF_BYTES
+
+	return copied_bytes;
+}
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -3363,17 +3381,17 @@ BeginCopyFrom(ParseState *pstate,
 	cstate->cur_attval = NULL;
 
 	/*
-	 * Set up variables to avoid per-attribute overhead.  attribute_buf is
-	 * used in both text and binary modes, but we use line_buf and raw_buf
+	 * Set up variables to avoid per-attribute overhead. attribute_buf and
+	 * raw_buf are used in both text and binary modes, but we use line_buf
 	 * only in text mode.
 	 */
 	initStringInfo(&cstate->attribute_buf);
+	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+	cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	if (!cstate->binary)
 	{
 		initStringInfo(&cstate->line_buf);
 		cstate->line_buf_converted = false;
-		cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-		cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	}
 
 	/* Assign range table, we'll need it in CopyFrom. */
@@ -3524,16 +3542,18 @@ BeginCopyFrom(ParseState *pstate,
 		int32		tmp;
 
 		/* Signature */
-		if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
+		if (CopyReadBinaryData(cstate, readSig, 11) != 11 ||
 			memcmp(readSig, BinarySignature, 11) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("COPY file signature not recognized")));
 		/* Flags field */
-		if (!CopyGetInt32(cstate, &tmp))
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp))
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing flags)")));
+		tmp = (int32) pg_ntoh32(tmp);
 		if ((tmp & (1 << 16)) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -3544,15 +3564,15 @@ BeginCopyFrom(ParseState *pstate,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("unrecognized critical flags in COPY file header")));
 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing length)")));
 		/* Skip extension header, if present */
 		while (tmp-- > 0)
 		{
-			if (CopyGetData(cstate, readSig, 1, 1) != 1)
+			if (CopyReadBinaryData(cstate, readSig, 1) != 1)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("invalid COPY file header (wrong length)")));
@@ -3745,12 +3765,14 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (CopyReadBinaryData(cstate, (char *) &fld_count,
+							   sizeof(fld_count)) != sizeof(fld_count))
 		{
 			/* EOF detected (end of file, or protocol-level EOF) */
 			return false;
 		}
 
+		fld_count = (int16) pg_ntoh16(fld_count);
 		if (fld_count == -1)
 		{
 			/*
@@ -3768,7 +3790,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			char		dummy;
 
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				CopyReadBinaryData(cstate, &dummy, 1) > 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -4723,10 +4745,12 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	int32		fld_size;
 	Datum		result;
 
-	if (!CopyGetInt32(cstate, &fld_size))
+	if (CopyReadBinaryData(cstate, (char *) &fld_size, sizeof(fld_size)) !=
+		sizeof(fld_size))
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
+	fld_size = (int32) pg_ntoh32(fld_size);
 	if (fld_size == -1)
 	{
 		*isnull = true;
@@ -4741,8 +4765,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadBinaryData(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
#11Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Langote (#10)
Re: [PATCH] Performance Improvement For Copy From Binary Files

This is due to the recent commit
cd22d3cdb9bd9963c694c01a8c0232bbae3ddcfb, in which we restricted the
raw_buf and line_buf allocations for binary files. Since we are using
raw_buf for this performance improvement feature, now, it's enough to
restrict only line_buf for binary files. I made the changes
accordingly in the v3 patch attached here.

Regression tests(make check & make check-world) ran cleanly with the v3 patch.

Thank you Bharath. I was a bit surprised that you had also submitted
a patch to NOT allocate raw_buf for COPY FROM ... BINARY. :-)

Yes that was by me, before I started to work on this feature. I think
we can backpatch that change(assuming we don't backpatch this
feature), I will make the request accordingly.

Anyways, now we don't allow line_buf allocation for binary files,
which is also a good thing.

For the case where required nbytes may not fit into the buffer in
CopyReadFromRawBuf, I'm sure this can happen only for field data,
(field count , and field size are of fixed length and can fit in the
buffer), instead of reading them in parts of buff size into the buffer
(using CopyLoadRawBuf) and then DRAIN_COPY_RAW_BUF() to the
destination, I think we can detect this condition using requested
bytes and the buffer size and directly read from the file to the
destination buffer and then reload the raw_buffer for further
processing. I think this way, it will be good.

Hmm, I'm afraid that this will make the code more complex for
apparently small benefit. Is this really that much of a problem
performance wise?

Yes it makes CopyReadFromRawBuf(), code a bit complex from a
readability perspective. I'm convinced not to have the
abovementioned(by me) change, due to 3 reasons,1) the
readability/understandability 2) how many use cases can we have where
requested field size greater than RAW_BUF_SIZE(64KB)? I think very few
cases. I may be wrong here. 3) Performance wise it may not be much as
we do one extra memcpy only in situations where field sizes are
greater than 64KB(as we have already seen and observed by you as well
in one of the response [1]) that memcpy cost for this case may be
negligible.

Actually, an extra memcpy is incurred on every call of
CopyReadFromRawBuf(), but I haven't seen it to be very problematic.

Yes.

CopyReadFromRawBuf as a name for the new function might be misleading
as long as we are only using it for binary data. Maybe
CopyReadBinaryData is more appropriate? See attached v4 with these
and a few other cosmetic changes.

CopyReadBinaryData() looks meaningful. +1.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#12Amit Langote
amitlangote09@gmail.com
In reply to: Bharath Rupireddy (#11)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Mon, Jul 13, 2020 at 12:17 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

CopyReadFromRawBuf as a name for the new function might be misleading
as long as we are only using it for binary data. Maybe
CopyReadBinaryData is more appropriate? See attached v4 with these
and a few other cosmetic changes.

CopyReadBinaryData() looks meaningful. +1.

Okay, thanks. Let's have a committer take a look at this then?

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#13Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Langote (#12)
Re: [PATCH] Performance Improvement For Copy From Binary Files

CopyReadFromRawBuf as a name for the new function might be misleading
as long as we are only using it for binary data. Maybe
CopyReadBinaryData is more appropriate? See attached v4 with these
and a few other cosmetic changes.

CopyReadBinaryData() looks meaningful. +1.

Okay, thanks. Let's have a committer take a look at this then?

I think yes, unless someone has any more points/review comments.
Accordingly the status in the commitfest can be changed.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#14vignesh C
vignesh21@gmail.com
In reply to: Amit Langote (#10)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Mon, Jul 13, 2020 at 8:02 AM Amit Langote <amitlangote09@gmail.com> wrote:

By the way, considering the rebase over cd22d3cdb9b, it seemed to me
that we needed to update the comments in CopyStateData struct
definition a bit more. While doing that, I realized
CopyReadFromRawBuf as a name for the new function might be misleading
as long as we are only using it for binary data. Maybe
CopyReadBinaryData is more appropriate? See attached v4 with these
and a few other cosmetic changes.

I had one small comment:
+{
+       int             copied_bytes = 0;
+
+#define BUF_BYTES      (cstate->raw_buf_len - cstate->raw_buf_index)
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+       do {\
+               memcpy((dest), (cstate)->raw_buf +
(cstate)->raw_buf_index, (nbytes));\
+               (cstate)->raw_buf_index += (nbytes);\
+       } while(0)

BUF_BYTES could be used in CopyLoadRawBuf function also.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

#15Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: vignesh C (#14)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files
I had one small comment:
+{
+       int             copied_bytes = 0;
+
+#define BUF_BYTES      (cstate->raw_buf_len - cstate->raw_buf_index)
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+       do {\
+               memcpy((dest), (cstate)->raw_buf +
(cstate)->raw_buf_index, (nbytes));\
+               (cstate)->raw_buf_index += (nbytes);\
+       } while(0)

BUF_BYTES could be used in CopyLoadRawBuf function also.

Thanks Vignesh for the find out. I changed and attached the v5 patch.
The regression tests(make check and make check-world) ran
successfully.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

CopyFrom-binary-buffering_v5.patchapplication/x-patch; name=CopyFrom-binary-buffering_v5.patchDownload
From 757091a228b8def825f671cf318adde8cc4639ed Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddy@enterprisedb.com>
Date: Mon, 13 Jul 2020 19:57:57 +0530
Subject: [PATCH v5] Performance Improvement For Copy From Binary Files

Read data from binary file in RAW_BUF_SIZE bytes at a time for
COPY FROM command, instead of reading everytime from file for
field count, size and field data. This avoids fread calls
significantly and improves performance of the COPY FROM binary
files command.
---
 src/backend/commands/copy.c | 138 +++++++++++++++++++++---------------
 1 file changed, 82 insertions(+), 56 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 99d1457180..7973cde323 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -195,7 +195,8 @@ typedef struct CopyStateData
 	 * the buffer on each cycle.
 	 *
 	 * (In binary COPY FROM, attribute_buf holds the binary data for the
-	 * current field, while the other variables are not used.)
+	 * current field copied from raw_buf, which in turn holds raw data read
+	 * from the file.)
 	 */
 	StringInfoData attribute_buf;
 
@@ -219,8 +220,9 @@ typedef struct CopyStateData
 	 * Finally, raw_buf holds raw data read from the data source (file or
 	 * client connection).  CopyReadLine parses this data sufficiently to
 	 * locate line boundaries, then transfers the data to line_buf and
-	 * converts it.  Note: we guarantee that there is a \0 at
-	 * raw_buf[raw_buf_len].
+	 * converts it.  Also, CopyReadBinaryData() slices the data into
+	 * attributes based on the caller-specified field length in bytes.
+	 * Note: we guarantee that there is a \0 at raw_buf[raw_buf_len].
 	 */
 #define RAW_BUF_SIZE 65536		/* we palloc RAW_BUF_SIZE+1 bytes */
 	char	   *raw_buf;
@@ -257,6 +259,10 @@ typedef struct
 /* Trim the list of buffers back down to this number after flushing */
 #define MAX_PARTITION_BUFFERS	32
 
+/* Gets the bytes available in raw_buf. */
+
+#define BUF_BYTES	(cstate->raw_buf_len - cstate->raw_buf_index)
+
 /* Stores multi-insert data related to a single relation in CopyFrom. */
 typedef struct CopyMultiInsertBuffer
 {
@@ -373,6 +379,7 @@ static int	CopyReadAttributesCSV(CopyState cstate);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
+static int	CopyReadBinaryData(CopyState cstate, char *dest, int nbytes);
 static void CopyAttributeOutText(CopyState cstate, char *string);
 static void CopyAttributeOutCSV(CopyState cstate, char *string,
 								bool use_quote, bool single_attr);
@@ -391,9 +398,7 @@ static void CopySendEndOfRow(CopyState cstate);
 static int	CopyGetData(CopyState cstate, void *databuf,
 						int minread, int maxread);
 static void CopySendInt32(CopyState cstate, int32 val);
-static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
 
 
 /*
@@ -732,25 +737,6 @@ CopySendInt32(CopyState cstate, int32 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt32 reads an int32 that appears in network byte order
- *
- * Returns true if OK, false if EOF
- */
-static bool
-CopyGetInt32(CopyState cstate, int32 *val)
-{
-	uint32		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int32) pg_ntoh32(buf);
-	return true;
-}
-
 /*
  * CopySendInt16 sends an int16 in network byte order
  */
@@ -763,23 +749,6 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
 
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
@@ -800,7 +769,7 @@ CopyLoadRawBuf(CopyState cstate)
 	if (cstate->raw_buf_index < cstate->raw_buf_len)
 	{
 		/* Copy down the unprocessed data */
-		nbytes = cstate->raw_buf_len - cstate->raw_buf_index;
+		nbytes = BUF_BYTES;
 		memmove(cstate->raw_buf, cstate->raw_buf + cstate->raw_buf_index,
 				nbytes);
 	}
@@ -816,6 +785,57 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadBinaryData
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes them to 'dest'
+ */
+static int
+CopyReadBinaryData(CopyState cstate, char *dest, int nbytes)
+{
+	int		copied_bytes = 0;
+
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+	if (BUF_BYTES >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (BUF_BYTES == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, BUF_BYTES);
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+#undef DRAIN_COPY_RAW_BUF
+
+	return copied_bytes;
+}
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -3363,17 +3383,17 @@ BeginCopyFrom(ParseState *pstate,
 	cstate->cur_attval = NULL;
 
 	/*
-	 * Set up variables to avoid per-attribute overhead.  attribute_buf is
-	 * used in both text and binary modes, but we use line_buf and raw_buf
+	 * Set up variables to avoid per-attribute overhead. attribute_buf and
+	 * raw_buf are used in both text and binary modes, but we use line_buf
 	 * only in text mode.
 	 */
 	initStringInfo(&cstate->attribute_buf);
+	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+	cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	if (!cstate->binary)
 	{
 		initStringInfo(&cstate->line_buf);
 		cstate->line_buf_converted = false;
-		cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-		cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	}
 
 	/* Assign range table, we'll need it in CopyFrom. */
@@ -3524,16 +3544,18 @@ BeginCopyFrom(ParseState *pstate,
 		int32		tmp;
 
 		/* Signature */
-		if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
+		if (CopyReadBinaryData(cstate, readSig, 11) != 11 ||
 			memcmp(readSig, BinarySignature, 11) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("COPY file signature not recognized")));
 		/* Flags field */
-		if (!CopyGetInt32(cstate, &tmp))
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp))
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing flags)")));
+		tmp = (int32) pg_ntoh32(tmp);
 		if ((tmp & (1 << 16)) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -3544,15 +3566,15 @@ BeginCopyFrom(ParseState *pstate,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("unrecognized critical flags in COPY file header")));
 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing length)")));
 		/* Skip extension header, if present */
 		while (tmp-- > 0)
 		{
-			if (CopyGetData(cstate, readSig, 1, 1) != 1)
+			if (CopyReadBinaryData(cstate, readSig, 1) != 1)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("invalid COPY file header (wrong length)")));
@@ -3745,12 +3767,14 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (CopyReadBinaryData(cstate, (char *) &fld_count,
+							   sizeof(fld_count)) != sizeof(fld_count))
 		{
 			/* EOF detected (end of file, or protocol-level EOF) */
 			return false;
 		}
 
+		fld_count = (int16) pg_ntoh16(fld_count);
 		if (fld_count == -1)
 		{
 			/*
@@ -3768,7 +3792,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			char		dummy;
 
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				CopyReadBinaryData(cstate, &dummy, 1) > 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -4723,10 +4747,12 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	int32		fld_size;
 	Datum		result;
 
-	if (!CopyGetInt32(cstate, &fld_size))
+	if (CopyReadBinaryData(cstate, (char *) &fld_size, sizeof(fld_size)) !=
+		sizeof(fld_size))
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
+	fld_size = (int32) pg_ntoh32(fld_size);
 	if (fld_size == -1)
 	{
 		*isnull = true;
@@ -4741,8 +4767,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadBinaryData(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
-- 
2.25.1

#16Amit Langote
amitlangote09@gmail.com
In reply to: Bharath Rupireddy (#15)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Mon, Jul 13, 2020 at 11:58 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

I had one small comment:
+{
+       int             copied_bytes = 0;
+
+#define BUF_BYTES      (cstate->raw_buf_len - cstate->raw_buf_index)
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+       do {\
+               memcpy((dest), (cstate)->raw_buf +
(cstate)->raw_buf_index, (nbytes));\
+               (cstate)->raw_buf_index += (nbytes);\
+       } while(0)

BUF_BYTES could be used in CopyLoadRawBuf function also.

Thanks Vignesh for the find out. I changed and attached the v5 patch.
The regression tests(make check and make check-world) ran
successfully.

Good idea, thanks.

In CopyLoadRawBuf(), we could also change the condition if
(cstate->raw_buf_index < cstate->raw_buf_len) to if (BUF_BYTES > 0),
which looks clearer.

Also, if we are going to use the macro more generally, let's make it
look less localized. For example, rename it to RAW_BUF_BYTES similar
to RAW_BUF_SIZE and place their definitions close by. It also seems
like a good idea to make 'cstate' a parameter for clarity.

Attached v6.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

CopyFrom-binary-buffering_v6.patchapplication/octet-stream; name=CopyFrom-binary-buffering_v6.patchDownload
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 99d1457..f87aedd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -195,7 +195,8 @@ typedef struct CopyStateData
 	 * the buffer on each cycle.
 	 *
 	 * (In binary COPY FROM, attribute_buf holds the binary data for the
-	 * current field, while the other variables are not used.)
+	 * current field copied from raw_buf, which in turn holds raw data read
+	 * from the file.)
 	 */
 	StringInfoData attribute_buf;
 
@@ -219,13 +220,18 @@ typedef struct CopyStateData
 	 * Finally, raw_buf holds raw data read from the data source (file or
 	 * client connection).  CopyReadLine parses this data sufficiently to
 	 * locate line boundaries, then transfers the data to line_buf and
-	 * converts it.  Note: we guarantee that there is a \0 at
-	 * raw_buf[raw_buf_len].
+	 * converts it.  Also, CopyReadBinaryData() slices the data into
+	 * attributes based on the caller-specified field length in bytes.
+	 * Note: we guarantee that there is a \0 at raw_buf[raw_buf_len].
 	 */
 #define RAW_BUF_SIZE 65536		/* we palloc RAW_BUF_SIZE+1 bytes */
 	char	   *raw_buf;
 	int			raw_buf_index;	/* next byte to process */
 	int			raw_buf_len;	/* total # of bytes stored */
+
+	/* Shorthand for bytes in raw_buf available for reading */
+#define RAW_BUF_BYTES(cstate)\
+	((cstate)->raw_buf_len - (cstate)->raw_buf_index)
 } CopyStateData;
 
 /* DestReceiver for COPY (query) TO */
@@ -373,6 +379,7 @@ static int	CopyReadAttributesCSV(CopyState cstate);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
+static int	CopyReadBinaryData(CopyState cstate, char *dest, int nbytes);
 static void CopyAttributeOutText(CopyState cstate, char *string);
 static void CopyAttributeOutCSV(CopyState cstate, char *string,
 								bool use_quote, bool single_attr);
@@ -391,9 +398,7 @@ static void CopySendEndOfRow(CopyState cstate);
 static int	CopyGetData(CopyState cstate, void *databuf,
 						int minread, int maxread);
 static void CopySendInt32(CopyState cstate, int32 val);
-static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
 
 
 /*
@@ -733,25 +738,6 @@ CopySendInt32(CopyState cstate, int32 val)
 }
 
 /*
- * CopyGetInt32 reads an int32 that appears in network byte order
- *
- * Returns true if OK, false if EOF
- */
-static bool
-CopyGetInt32(CopyState cstate, int32 *val)
-{
-	uint32		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int32) pg_ntoh32(buf);
-	return true;
-}
-
-/*
  * CopySendInt16 sends an int16 in network byte order
  */
 static void
@@ -763,33 +749,16 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
 
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
  *
  * Returns true if able to obtain at least one more byte, else false.
  *
- * If raw_buf_index < raw_buf_len, the unprocessed bytes are transferred
- * down to the start of the buffer and then we load more data after that.
- * This case is used only when a frontend multibyte character crosses a
- * bufferload boundary.
+ * If RAW_BUF_BYTES(cstate) > 0, the unprocessed bytes are transferred down to
+ * the start of the buffer and then we load more data after that.  This case
+ * is only used when a frontend multibyte character crosses a bufferload
+ * boundary.
  */
 static bool
 CopyLoadRawBuf(CopyState cstate)
@@ -797,10 +766,10 @@ CopyLoadRawBuf(CopyState cstate)
 	int			nbytes;
 	int			inbytes;
 
-	if (cstate->raw_buf_index < cstate->raw_buf_len)
+	if (RAW_BUF_BYTES(cstate) > 0)
 	{
 		/* Copy down the unprocessed data */
-		nbytes = cstate->raw_buf_len - cstate->raw_buf_index;
+		nbytes = RAW_BUF_BYTES(cstate);
 		memmove(cstate->raw_buf, cstate->raw_buf + cstate->raw_buf_index,
 				nbytes);
 	}
@@ -816,6 +785,57 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadBinaryData
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes them to 'dest'
+ */
+static int
+CopyReadBinaryData(CopyState cstate, char *dest, int nbytes)
+{
+	int		copied_bytes = 0;
+
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+	if (RAW_BUF_BYTES(cstate) >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (RAW_BUF_BYTES(cstate) == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, RAW_BUF_BYTES(cstate));
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+#undef DRAIN_COPY_RAW_BUF
+
+	return copied_bytes;
+}
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -3363,17 +3383,17 @@ BeginCopyFrom(ParseState *pstate,
 	cstate->cur_attval = NULL;
 
 	/*
-	 * Set up variables to avoid per-attribute overhead.  attribute_buf is
-	 * used in both text and binary modes, but we use line_buf and raw_buf
+	 * Set up variables to avoid per-attribute overhead. attribute_buf and
+	 * raw_buf are used in both text and binary modes, but we use line_buf
 	 * only in text mode.
 	 */
 	initStringInfo(&cstate->attribute_buf);
+	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+	cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	if (!cstate->binary)
 	{
 		initStringInfo(&cstate->line_buf);
 		cstate->line_buf_converted = false;
-		cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-		cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	}
 
 	/* Assign range table, we'll need it in CopyFrom. */
@@ -3524,16 +3544,18 @@ BeginCopyFrom(ParseState *pstate,
 		int32		tmp;
 
 		/* Signature */
-		if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
+		if (CopyReadBinaryData(cstate, readSig, 11) != 11 ||
 			memcmp(readSig, BinarySignature, 11) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("COPY file signature not recognized")));
 		/* Flags field */
-		if (!CopyGetInt32(cstate, &tmp))
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp))
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing flags)")));
+		tmp = (int32) pg_ntoh32(tmp);
 		if ((tmp & (1 << 16)) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -3544,15 +3566,15 @@ BeginCopyFrom(ParseState *pstate,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("unrecognized critical flags in COPY file header")));
 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing length)")));
 		/* Skip extension header, if present */
 		while (tmp-- > 0)
 		{
-			if (CopyGetData(cstate, readSig, 1, 1) != 1)
+			if (CopyReadBinaryData(cstate, readSig, 1) != 1)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("invalid COPY file header (wrong length)")));
@@ -3745,12 +3767,14 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (CopyReadBinaryData(cstate, (char *) &fld_count,
+							   sizeof(fld_count)) != sizeof(fld_count))
 		{
 			/* EOF detected (end of file, or protocol-level EOF) */
 			return false;
 		}
 
+		fld_count = (int16) pg_ntoh16(fld_count);
 		if (fld_count == -1)
 		{
 			/*
@@ -3768,7 +3792,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			char		dummy;
 
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				CopyReadBinaryData(cstate, &dummy, 1) > 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -4723,10 +4747,12 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	int32		fld_size;
 	Datum		result;
 
-	if (!CopyGetInt32(cstate, &fld_size))
+	if (CopyReadBinaryData(cstate, (char *) &fld_size, sizeof(fld_size)) !=
+		sizeof(fld_size))
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
+	fld_size = (int32) pg_ntoh32(fld_size);
 	if (fld_size == -1)
 	{
 		*isnull = true;
@@ -4741,8 +4767,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadBinaryData(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
#17vignesh C
vignesh21@gmail.com
In reply to: Amit Langote (#16)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Tue, Jul 14, 2020 at 7:26 AM Amit Langote <amitlangote09@gmail.com> wrote:

Good idea, thanks.

In CopyLoadRawBuf(), we could also change the condition if
(cstate->raw_buf_index < cstate->raw_buf_len) to if (BUF_BYTES > 0),
which looks clearer.

Also, if we are going to use the macro more generally, let's make it
look less localized. For example, rename it to RAW_BUF_BYTES similar
to RAW_BUF_SIZE and place their definitions close by. It also seems
like a good idea to make 'cstate' a parameter for clarity.

Attached v6.

Thanks for making the changes.

-       if (cstate->raw_buf_index < cstate->raw_buf_len)
+       if (RAW_BUF_BYTES(cstate) > 0)
        {
                /* Copy down the unprocessed data */
-               nbytes = cstate->raw_buf_len - cstate->raw_buf_index;
+               nbytes = RAW_BUF_BYTES(cstate);
                memmove(cstate->raw_buf, cstate->raw_buf +
cstate->raw_buf_index,
                                nbytes);
        }

One small improvement could be to change it like below to reduce few
more instructions:
static bool
CopyLoadRawBuf(CopyState cstate)
{
int nbytes = RAW_BUF_BYTES(cstate);
int inbytes;

/* Copy down the unprocessed data */
if (nbytes > 0)
memmove(cstate->raw_buf, cstate->raw_buf + cstate->raw_buf_index,
nbytes);

inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
1, RAW_BUF_SIZE - nbytes);
nbytes += inbytes;
cstate->raw_buf[nbytes] = '\0';
cstate->raw_buf_index = 0;
cstate->raw_buf_len = nbytes;
return (inbytes > 0);
}

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

#18Amit Langote
amitlangote09@gmail.com
In reply to: vignesh C (#17)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Tue, Jul 14, 2020 at 2:02 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Jul 14, 2020 at 7:26 AM Amit Langote <amitlangote09@gmail.com> wrote:

In CopyLoadRawBuf(), we could also change the condition if
(cstate->raw_buf_index < cstate->raw_buf_len) to if (BUF_BYTES > 0),
which looks clearer.

Also, if we are going to use the macro more generally, let's make it
look less localized. For example, rename it to RAW_BUF_BYTES similar
to RAW_BUF_SIZE and place their definitions close by. It also seems
like a good idea to make 'cstate' a parameter for clarity.

Attached v6.

Thanks for making the changes.

-       if (cstate->raw_buf_index < cstate->raw_buf_len)
+       if (RAW_BUF_BYTES(cstate) > 0)
{
/* Copy down the unprocessed data */
-               nbytes = cstate->raw_buf_len - cstate->raw_buf_index;
+               nbytes = RAW_BUF_BYTES(cstate);
memmove(cstate->raw_buf, cstate->raw_buf +
cstate->raw_buf_index,
nbytes);
}

One small improvement could be to change it like below to reduce few
more instructions:
static bool
CopyLoadRawBuf(CopyState cstate)
{
int nbytes = RAW_BUF_BYTES(cstate);
int inbytes;

/* Copy down the unprocessed data */
if (nbytes > 0)
memmove(cstate->raw_buf, cstate->raw_buf + cstate->raw_buf_index,
nbytes);

inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
1, RAW_BUF_SIZE - nbytes);
nbytes += inbytes;
cstate->raw_buf[nbytes] = '\0';
cstate->raw_buf_index = 0;
cstate->raw_buf_len = nbytes;
return (inbytes > 0);
}

Sounds fine to me. Although CopyLoadRawBuf() does not seem to a
candidate for rigorous code optimization as it does not get called
that often.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

#19vignesh C
vignesh21@gmail.com
In reply to: Amit Langote (#18)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Tue, Jul 14, 2020 at 11:19 AM Amit Langote <amitlangote09@gmail.com> wrote:

Sounds fine to me. Although CopyLoadRawBuf() does not seem to a
candidate for rigorous code optimization as it does not get called
that often.

I thought we could include that change as we are making changes around
that code. Rest of the changes looked fine to me. Also I noticed that
commit message was missing in the patch.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

#20Amit Langote
amitlangote09@gmail.com
In reply to: vignesh C (#19)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files

Hi Vignesh,

On Tue, Jul 14, 2020 at 10:23 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Jul 14, 2020 at 11:19 AM Amit Langote <amitlangote09@gmail.com> wrote:

Sounds fine to me. Although CopyLoadRawBuf() does not seem to a
candidate for rigorous code optimization as it does not get called
that often.

I thought we could include that change as we are making changes around
that code.

Sure, done.

Rest of the changes looked fine to me. Also I noticed that
commit message was missing in the patch.

Please see the attached v7.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patchapplication/octet-stream; name=v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patchDownload
From 9212cc6b2b8d96e48a74fbdd04af5b375bd5a0c3 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Tue, 14 Jul 2020 10:43:30 +0900
Subject: [PATCH v7] Improve performance of binary COPY FROM with buffering

Currently, each piece of binary data, that is, per-row field count,
per-field data length, and the field data itself, is read directly
from the file, which shows up as significant overhead.

This changes things to read the binary data in RAW_BUF_SIZE increments
from the file into raw_buf and parse out individual pieces of binary data
from raw_buf.  This is similar to how textual COPY FROM works although
this doesn't go far enough to use something akin to line_buf.
---
 src/backend/commands/copy.c | 155 +++++++++++++++++++++++++-------------------
 1 file changed, 88 insertions(+), 67 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 44da71c..0a9a86f 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -195,7 +195,8 @@ typedef struct CopyStateData
 	 * the buffer on each cycle.
 	 *
 	 * (In binary COPY FROM, attribute_buf holds the binary data for the
-	 * current field, while the other variables are not used.)
+	 * current field copied from raw_buf, which in turn holds raw data read
+	 * from the file.)
 	 */
 	StringInfoData attribute_buf;
 
@@ -219,13 +220,18 @@ typedef struct CopyStateData
 	 * Finally, raw_buf holds raw data read from the data source (file or
 	 * client connection).  CopyReadLine parses this data sufficiently to
 	 * locate line boundaries, then transfers the data to line_buf and
-	 * converts it.  Note: we guarantee that there is a \0 at
-	 * raw_buf[raw_buf_len].
+	 * converts it.  Also, CopyReadBinaryData() slices the data into
+	 * attributes based on the caller-specified field length in bytes.
+	 * Note: we guarantee that there is a \0 at raw_buf[raw_buf_len].
 	 */
 #define RAW_BUF_SIZE 65536		/* we palloc RAW_BUF_SIZE+1 bytes */
 	char	   *raw_buf;
 	int			raw_buf_index;	/* next byte to process */
 	int			raw_buf_len;	/* total # of bytes stored */
+
+	/* Shorthand for bytes in raw_buf available for reading */
+#define RAW_BUF_BYTES(cstate)\
+	((cstate)->raw_buf_len - (cstate)->raw_buf_index)
 } CopyStateData;
 
 /* DestReceiver for COPY (query) TO */
@@ -373,6 +379,7 @@ static int	CopyReadAttributesCSV(CopyState cstate);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
+static int	CopyReadBinaryData(CopyState cstate, char *dest, int nbytes);
 static void CopyAttributeOutText(CopyState cstate, char *string);
 static void CopyAttributeOutCSV(CopyState cstate, char *string,
 								bool use_quote, bool single_attr);
@@ -391,9 +398,7 @@ static void CopySendEndOfRow(CopyState cstate);
 static int	CopyGetData(CopyState cstate, void *databuf,
 						int minread, int maxread);
 static void CopySendInt32(CopyState cstate, int32 val);
-static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
 
 
 /*
@@ -733,25 +738,6 @@ CopySendInt32(CopyState cstate, int32 val)
 }
 
 /*
- * CopyGetInt32 reads an int32 that appears in network byte order
- *
- * Returns true if OK, false if EOF
- */
-static bool
-CopyGetInt32(CopyState cstate, int32 *val)
-{
-	uint32		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int32) pg_ntoh32(buf);
-	return true;
-}
-
-/*
  * CopySendInt16 sends an int16 in network byte order
  */
 static void
@@ -763,49 +749,27 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
 
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
  *
  * Returns true if able to obtain at least one more byte, else false.
  *
- * If raw_buf_index < raw_buf_len, the unprocessed bytes are transferred
- * down to the start of the buffer and then we load more data after that.
- * This case is used only when a frontend multibyte character crosses a
- * bufferload boundary.
+ * If RAW_BUF_BYTES(cstate) > 0, the unprocessed bytes are transferred down to
+ * the start of the buffer and then we load more data after that.  This case
+ * is only used when a frontend multibyte character crosses a bufferload
+ * boundary.
  */
 static bool
 CopyLoadRawBuf(CopyState cstate)
 {
-	int			nbytes;
+	int			nbytes = RAW_BUF_BYTES(cstate);
 	int			inbytes;
 
-	if (cstate->raw_buf_index < cstate->raw_buf_len)
-	{
-		/* Copy down the unprocessed data */
-		nbytes = cstate->raw_buf_len - cstate->raw_buf_index;
+	/* Copy down the unprocessed data if any. */
+	if (nbytes > 0)
 		memmove(cstate->raw_buf, cstate->raw_buf + cstate->raw_buf_index,
 				nbytes);
-	}
-	else
-		nbytes = 0;				/* no data need be saved */
 
 	inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
 						  1, RAW_BUF_SIZE - nbytes);
@@ -816,6 +780,57 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadBinaryData
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes them to 'dest'
+ */
+static int
+CopyReadBinaryData(CopyState cstate, char *dest, int nbytes)
+{
+	int		copied_bytes = 0;
+
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+	if (RAW_BUF_BYTES(cstate) >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (RAW_BUF_BYTES(cstate) == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, RAW_BUF_BYTES(cstate));
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+#undef DRAIN_COPY_RAW_BUF
+
+	return copied_bytes;
+}
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -3366,17 +3381,17 @@ BeginCopyFrom(ParseState *pstate,
 	cstate->cur_attval = NULL;
 
 	/*
-	 * Set up variables to avoid per-attribute overhead.  attribute_buf is
-	 * used in both text and binary modes, but we use line_buf and raw_buf
+	 * Set up variables to avoid per-attribute overhead. attribute_buf and
+	 * raw_buf are used in both text and binary modes, but we use line_buf
 	 * only in text mode.
 	 */
 	initStringInfo(&cstate->attribute_buf);
+	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+	cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	if (!cstate->binary)
 	{
 		initStringInfo(&cstate->line_buf);
 		cstate->line_buf_converted = false;
-		cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-		cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	}
 
 	/* Assign range table, we'll need it in CopyFrom. */
@@ -3527,16 +3542,18 @@ BeginCopyFrom(ParseState *pstate,
 		int32		tmp;
 
 		/* Signature */
-		if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
+		if (CopyReadBinaryData(cstate, readSig, 11) != 11 ||
 			memcmp(readSig, BinarySignature, 11) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("COPY file signature not recognized")));
 		/* Flags field */
-		if (!CopyGetInt32(cstate, &tmp))
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp))
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing flags)")));
+		tmp = (int32) pg_ntoh32(tmp);
 		if ((tmp & (1 << 16)) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -3547,15 +3564,15 @@ BeginCopyFrom(ParseState *pstate,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("unrecognized critical flags in COPY file header")));
 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing length)")));
 		/* Skip extension header, if present */
 		while (tmp-- > 0)
 		{
-			if (CopyGetData(cstate, readSig, 1, 1) != 1)
+			if (CopyReadBinaryData(cstate, readSig, 1) != 1)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("invalid COPY file header (wrong length)")));
@@ -3748,12 +3765,14 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (CopyReadBinaryData(cstate, (char *) &fld_count,
+							   sizeof(fld_count)) != sizeof(fld_count))
 		{
 			/* EOF detected (end of file, or protocol-level EOF) */
 			return false;
 		}
 
+		fld_count = (int16) pg_ntoh16(fld_count);
 		if (fld_count == -1)
 		{
 			/*
@@ -3771,7 +3790,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			char		dummy;
 
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				CopyReadBinaryData(cstate, &dummy, 1) > 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -4726,10 +4745,12 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	int32		fld_size;
 	Datum		result;
 
-	if (!CopyGetInt32(cstate, &fld_size))
+	if (CopyReadBinaryData(cstate, (char *) &fld_size, sizeof(fld_size)) !=
+		sizeof(fld_size))
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
+	fld_size = (int32) pg_ntoh32(fld_size);
 	if (fld_size == -1)
 	{
 		*isnull = true;
@@ -4744,8 +4765,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadBinaryData(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
-- 
1.8.3.1

#21vignesh C
vignesh21@gmail.com
In reply to: Amit Langote (#20)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Wed, Jul 15, 2020 at 8:03 AM Amit Langote <amitlangote09@gmail.com> wrote:

Hi Vignesh,

On Tue, Jul 14, 2020 at 10:23 PM vignesh C <vignesh21@gmail.com> wrote:

On Tue, Jul 14, 2020 at 11:19 AM Amit Langote <amitlangote09@gmail.com> wrote:

Sounds fine to me. Although CopyLoadRawBuf() does not seem to a
candidate for rigorous code optimization as it does not get called
that often.

I thought we could include that change as we are making changes around
that code.

Sure, done.

Rest of the changes looked fine to me. Also I noticed that
commit message was missing in the patch.

Please see the attached v7.

Thanks for fixing the comments.
Patch applies cleanly, make check & make check-world passes.
The changes looks fine to me.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

#22Amit Langote
amitlangote09@gmail.com
In reply to: vignesh C (#21)
1 attachment(s)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Wed, Jul 15, 2020 at 1:06 PM vignesh C <vignesh21@gmail.com> wrote:

On Wed, Jul 15, 2020 at 8:03 AM Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Jul 14, 2020 at 10:23 PM vignesh C <vignesh21@gmail.com> wrote:

Rest of the changes looked fine to me. Also I noticed that
commit message was missing in the patch.

Please see the attached v7.

Thanks for fixing the comments.
Patch applies cleanly, make check & make check-world passes.
The changes looks fine to me.

Thanks for checking. Sorry, I hadn't credited Bharath as an author in
the commit message, so here's v7 again.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patchapplication/octet-stream; name=v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patchDownload
From 171f69b25acba3f81ffb579de0dd38a36aa1b227 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Tue, 14 Jul 2020 10:43:30 +0900
Subject: [PATCH v7] Improve performance of binary COPY FROM with buffering

Currently, each piece of binary data, that is, per-row field count,
per-field data length, and the field data itself, is read directly
from the file, which shows up as significant overhead.

This changes things to read the binary data in RAW_BUF_SIZE increments
from the file into raw_buf and parse out individual pieces of binary data
from raw_buf.  This is similar to how textual COPY FROM works although
this doesn't go far enough to use something akin to line_buf.

Authors: Bharath Rupireddy, Amit Langote
---
 src/backend/commands/copy.c | 155 +++++++++++++++++++++++++-------------------
 1 file changed, 88 insertions(+), 67 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 44da71c..0a9a86f 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -195,7 +195,8 @@ typedef struct CopyStateData
 	 * the buffer on each cycle.
 	 *
 	 * (In binary COPY FROM, attribute_buf holds the binary data for the
-	 * current field, while the other variables are not used.)
+	 * current field copied from raw_buf, which in turn holds raw data read
+	 * from the file.)
 	 */
 	StringInfoData attribute_buf;
 
@@ -219,13 +220,18 @@ typedef struct CopyStateData
 	 * Finally, raw_buf holds raw data read from the data source (file or
 	 * client connection).  CopyReadLine parses this data sufficiently to
 	 * locate line boundaries, then transfers the data to line_buf and
-	 * converts it.  Note: we guarantee that there is a \0 at
-	 * raw_buf[raw_buf_len].
+	 * converts it.  Also, CopyReadBinaryData() slices the data into
+	 * attributes based on the caller-specified field length in bytes.
+	 * Note: we guarantee that there is a \0 at raw_buf[raw_buf_len].
 	 */
 #define RAW_BUF_SIZE 65536		/* we palloc RAW_BUF_SIZE+1 bytes */
 	char	   *raw_buf;
 	int			raw_buf_index;	/* next byte to process */
 	int			raw_buf_len;	/* total # of bytes stored */
+
+	/* Shorthand for bytes in raw_buf available for reading */
+#define RAW_BUF_BYTES(cstate)\
+	((cstate)->raw_buf_len - (cstate)->raw_buf_index)
 } CopyStateData;
 
 /* DestReceiver for COPY (query) TO */
@@ -373,6 +379,7 @@ static int	CopyReadAttributesCSV(CopyState cstate);
 static Datum CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 									 Oid typioparam, int32 typmod,
 									 bool *isnull);
+static int	CopyReadBinaryData(CopyState cstate, char *dest, int nbytes);
 static void CopyAttributeOutText(CopyState cstate, char *string);
 static void CopyAttributeOutCSV(CopyState cstate, char *string,
 								bool use_quote, bool single_attr);
@@ -391,9 +398,7 @@ static void CopySendEndOfRow(CopyState cstate);
 static int	CopyGetData(CopyState cstate, void *databuf,
 						int minread, int maxread);
 static void CopySendInt32(CopyState cstate, int32 val);
-static bool CopyGetInt32(CopyState cstate, int32 *val);
 static void CopySendInt16(CopyState cstate, int16 val);
-static bool CopyGetInt16(CopyState cstate, int16 *val);
 
 
 /*
@@ -733,25 +738,6 @@ CopySendInt32(CopyState cstate, int32 val)
 }
 
 /*
- * CopyGetInt32 reads an int32 that appears in network byte order
- *
- * Returns true if OK, false if EOF
- */
-static bool
-CopyGetInt32(CopyState cstate, int32 *val)
-{
-	uint32		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int32) pg_ntoh32(buf);
-	return true;
-}
-
-/*
  * CopySendInt16 sends an int16 in network byte order
  */
 static void
@@ -763,49 +749,27 @@ CopySendInt16(CopyState cstate, int16 val)
 	CopySendData(cstate, &buf, sizeof(buf));
 }
 
-/*
- * CopyGetInt16 reads an int16 that appears in network byte order
- */
-static bool
-CopyGetInt16(CopyState cstate, int16 *val)
-{
-	uint16		buf;
-
-	if (CopyGetData(cstate, &buf, sizeof(buf), sizeof(buf)) != sizeof(buf))
-	{
-		*val = 0;				/* suppress compiler warning */
-		return false;
-	}
-	*val = (int16) pg_ntoh16(buf);
-	return true;
-}
-
 
 /*
  * CopyLoadRawBuf loads some more data into raw_buf
  *
  * Returns true if able to obtain at least one more byte, else false.
  *
- * If raw_buf_index < raw_buf_len, the unprocessed bytes are transferred
- * down to the start of the buffer and then we load more data after that.
- * This case is used only when a frontend multibyte character crosses a
- * bufferload boundary.
+ * If RAW_BUF_BYTES(cstate) > 0, the unprocessed bytes are transferred down to
+ * the start of the buffer and then we load more data after that.  This case
+ * is only used when a frontend multibyte character crosses a bufferload
+ * boundary.
  */
 static bool
 CopyLoadRawBuf(CopyState cstate)
 {
-	int			nbytes;
+	int			nbytes = RAW_BUF_BYTES(cstate);
 	int			inbytes;
 
-	if (cstate->raw_buf_index < cstate->raw_buf_len)
-	{
-		/* Copy down the unprocessed data */
-		nbytes = cstate->raw_buf_len - cstate->raw_buf_index;
+	/* Copy down the unprocessed data if any. */
+	if (nbytes > 0)
 		memmove(cstate->raw_buf, cstate->raw_buf + cstate->raw_buf_index,
 				nbytes);
-	}
-	else
-		nbytes = 0;				/* no data need be saved */
 
 	inbytes = CopyGetData(cstate, cstate->raw_buf + nbytes,
 						  1, RAW_BUF_SIZE - nbytes);
@@ -816,6 +780,57 @@ CopyLoadRawBuf(CopyState cstate)
 	return (inbytes > 0);
 }
 
+/*
+ * CopyReadBinaryData
+ *		Reads 'nbytes' bytes from cstate->copy_file via cstate->raw_buf and
+ *		writes them to 'dest'
+ */
+static int
+CopyReadBinaryData(CopyState cstate, char *dest, int nbytes)
+{
+	int		copied_bytes = 0;
+
+#define DRAIN_COPY_RAW_BUF(cstate, dest, nbytes)\
+	do {\
+		memcpy((dest), (cstate)->raw_buf + (cstate)->raw_buf_index, (nbytes));\
+		(cstate)->raw_buf_index += (nbytes);\
+	} while(0)
+
+	if (RAW_BUF_BYTES(cstate) >= nbytes)
+	{
+		/* Enough bytes are present in the buffer. */
+		DRAIN_COPY_RAW_BUF(cstate, dest, nbytes);
+		copied_bytes = nbytes;
+	}
+	else
+	{
+		/*
+		 * Not enough bytes in the buffer, so must read from the file.  Need
+		 * the loop considering that 'nbytes' may be larger than the maximum
+		 * bytes that the buffer can hold.
+		 */
+		do
+		{
+			int		copy_bytes;
+
+			/*
+			 * This tries to read up to RAW_BUF_SIZE bytes into raw_buf,
+			 * returning if no more were read.
+			 */
+			if (RAW_BUF_BYTES(cstate) == 0 && !CopyLoadRawBuf(cstate))
+				return copied_bytes;
+
+			copy_bytes = Min(nbytes - copied_bytes, RAW_BUF_BYTES(cstate));
+			DRAIN_COPY_RAW_BUF(cstate, dest, copy_bytes);
+			dest += copy_bytes;
+			copied_bytes += copy_bytes;
+		} while (copied_bytes < nbytes);
+	}
+
+#undef DRAIN_COPY_RAW_BUF
+
+	return copied_bytes;
+}
 
 /*
  *	 DoCopy executes the SQL COPY statement
@@ -3366,17 +3381,17 @@ BeginCopyFrom(ParseState *pstate,
 	cstate->cur_attval = NULL;
 
 	/*
-	 * Set up variables to avoid per-attribute overhead.  attribute_buf is
-	 * used in both text and binary modes, but we use line_buf and raw_buf
+	 * Set up variables to avoid per-attribute overhead. attribute_buf and
+	 * raw_buf are used in both text and binary modes, but we use line_buf
 	 * only in text mode.
 	 */
 	initStringInfo(&cstate->attribute_buf);
+	cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
+	cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	if (!cstate->binary)
 	{
 		initStringInfo(&cstate->line_buf);
 		cstate->line_buf_converted = false;
-		cstate->raw_buf = (char *) palloc(RAW_BUF_SIZE + 1);
-		cstate->raw_buf_index = cstate->raw_buf_len = 0;
 	}
 
 	/* Assign range table, we'll need it in CopyFrom. */
@@ -3527,16 +3542,18 @@ BeginCopyFrom(ParseState *pstate,
 		int32		tmp;
 
 		/* Signature */
-		if (CopyGetData(cstate, readSig, 11, 11) != 11 ||
+		if (CopyReadBinaryData(cstate, readSig, 11) != 11 ||
 			memcmp(readSig, BinarySignature, 11) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("COPY file signature not recognized")));
 		/* Flags field */
-		if (!CopyGetInt32(cstate, &tmp))
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp))
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing flags)")));
+		tmp = (int32) pg_ntoh32(tmp);
 		if ((tmp & (1 << 16)) != 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
@@ -3547,15 +3564,15 @@ BeginCopyFrom(ParseState *pstate,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("unrecognized critical flags in COPY file header")));
 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)
 			ereport(ERROR,
 					(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 					 errmsg("invalid COPY file header (missing length)")));
 		/* Skip extension header, if present */
 		while (tmp-- > 0)
 		{
-			if (CopyGetData(cstate, readSig, 1, 1) != 1)
+			if (CopyReadBinaryData(cstate, readSig, 1) != 1)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("invalid COPY file header (wrong length)")));
@@ -3748,12 +3765,14 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 
 		cstate->cur_lineno++;
 
-		if (!CopyGetInt16(cstate, &fld_count))
+		if (CopyReadBinaryData(cstate, (char *) &fld_count,
+							   sizeof(fld_count)) != sizeof(fld_count))
 		{
 			/* EOF detected (end of file, or protocol-level EOF) */
 			return false;
 		}
 
+		fld_count = (int16) pg_ntoh16(fld_count);
 		if (fld_count == -1)
 		{
 			/*
@@ -3771,7 +3790,7 @@ NextCopyFrom(CopyState cstate, ExprContext *econtext,
 			char		dummy;
 
 			if (cstate->copy_dest != COPY_OLD_FE &&
-				CopyGetData(cstate, &dummy, 1, 1) > 0)
+				CopyReadBinaryData(cstate, &dummy, 1) > 0)
 				ereport(ERROR,
 						(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 						 errmsg("received copy data after EOF marker")));
@@ -4726,10 +4745,12 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	int32		fld_size;
 	Datum		result;
 
-	if (!CopyGetInt32(cstate, &fld_size))
+	if (CopyReadBinaryData(cstate, (char *) &fld_size, sizeof(fld_size)) !=
+		sizeof(fld_size))
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
+	fld_size = (int32) pg_ntoh32(fld_size);
 	if (fld_size == -1)
 	{
 		*isnull = true;
@@ -4744,8 +4765,8 @@ CopyReadBinaryAttribute(CopyState cstate, FmgrInfo *flinfo,
 	resetStringInfo(&cstate->attribute_buf);
 
 	enlargeStringInfo(&cstate->attribute_buf, fld_size);
-	if (CopyGetData(cstate, cstate->attribute_buf.data,
-					fld_size, fld_size) != fld_size)
+	if (CopyReadBinaryData(cstate, cstate->attribute_buf.data,
+						   fld_size) != fld_size)
 		ereport(ERROR,
 				(errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
 				 errmsg("unexpected EOF in COPY data")));
-- 
1.8.3.1

#23Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Langote (#22)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Thu, Jul 16, 2020 at 7:44 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Jul 15, 2020 at 1:06 PM vignesh C <vignesh21@gmail.com> wrote:

On Wed, Jul 15, 2020 at 8:03 AM Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Jul 14, 2020 at 10:23 PM vignesh C <vignesh21@gmail.com> wrote:

Rest of the changes looked fine to me. Also I noticed that
commit message was missing in the patch.

Please see the attached v7.

Thanks for fixing the comments.
Patch applies cleanly, make check & make check-world passes.
The changes looks fine to me.

Thanks for checking. Sorry, I hadn't credited Bharath as an author in
the commit message, so here's v7 again.

Patch looks good. It applies on latest commit
932f9fb504a57f296cf698d15bd93462ddfe2776 and make check, make
check-world were run successfully.

I will change the status to "ready for committer" in commitfest
tomorrow. Hope that's fine.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#24vignesh C
vignesh21@gmail.com
In reply to: Bharath Rupireddy (#23)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Thu, Jul 16, 2020 at 8:52 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Jul 16, 2020 at 7:44 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Wed, Jul 15, 2020 at 1:06 PM vignesh C <vignesh21@gmail.com> wrote:

On Wed, Jul 15, 2020 at 8:03 AM Amit Langote <amitlangote09@gmail.com> wrote:

On Tue, Jul 14, 2020 at 10:23 PM vignesh C <vignesh21@gmail.com> wrote:

Rest of the changes looked fine to me. Also I noticed that
commit message was missing in the patch.

Please see the attached v7.

Thanks for fixing the comments.
Patch applies cleanly, make check & make check-world passes.
The changes looks fine to me.

Thanks for checking. Sorry, I hadn't credited Bharath as an author in
the commit message, so here's v7 again.

Patch looks good. It applies on latest commit
932f9fb504a57f296cf698d15bd93462ddfe2776 and make check, make
check-world were run successfully.

I will change the status to "ready for committer" in commitfest
tomorrow. Hope that's fine.

I agree, a committer can have a look at this.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com

#25Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Amit Langote (#4)
Re: [PATCH] Performance Improvement For Copy From Binary Files

I will change the status to "ready for committer" in commitfest
tomorrow. Hope that's fine.

I agree, a committer can have a look at this.

I changed the status in the commit fest to "Ready for Committer".

https://commitfest.postgresql.org/28/2632/

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#22)
Re: [PATCH] Performance Improvement For Copy From Binary Files

Amit Langote <amitlangote09@gmail.com> writes:

[ v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patch ]

Pushed with cosmetic changes.

I'd always supposed that stdio does enough internal buffering that short
fread()s shouldn't be much worse than memcpy(). But I reproduced your
result of ~30% speedup for data with a lot of narrow text columns, using
RHEL 8.2. Thinking this an indictment of glibc, I also tried it on
current macOS, and saw an even bigger speedup, approaching 50%. So
there's definitely something to this. I wonder if we ought to check
other I/O-constrained users of fread and fwrite, like pg_dump/pg_restore.

A point that I did not see addressed in the thread is whether this
has any negative impact on the copy-from-frontend code path, where
there's no fread() to avoid; short reads from CopyGetData() are
already going to be satisfied by memcpy'ing from the fe_msgbuf.
However, a quick check suggests that this patch is still a small
win for that case too --- apparently the control overhead in
CopyGetData() is not negligible.

So the patch seems fine functionally, but there were some cosmetic
things I didn't like:

* Removing CopyGetInt32 and CopyGetInt16 seemed like a pretty bad
idea, because it made the callers much uglier and more error-prone.
This is a particularly bad example:

 		/* Header extension length */
-		if (!CopyGetInt32(cstate, &tmp) ||
-			tmp < 0)
+		if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+			sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)

Putting side-effects into late stages of an if-condition is just
awful coding practice. They're easy for a reader to miss and they
are magnets for bugs, because of the possibility that control doesn't
reach that part of the condition.

You can get the exact same speedup without any of those disadvantages
by marking these two functions "inline", so that's what I did.

* I dropped the DRAIN_COPY_RAW_BUF macro too, as in my estimation it was
a net negative for readability. With only two use-cases, having it made
the code longer not shorter; I was also pretty unconvinced about the
wisdom of having some of the loop's control logic inside the macro and
some outside.

* BTW, the macro definitions weren't particularly per project style
anyway. We generally put at least one space before line-ending
backslashes. I don't think pgindent will fix this for you; IME
it doesn't touch macro definitions at all.

* Did some more work on the comments.

regards, tom lane

#27Amit Langote
amitlangote09@gmail.com
In reply to: Tom Lane (#26)
Re: [PATCH] Performance Improvement For Copy From Binary Files

On Sun, Jul 26, 2020 at 6:06 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

[ v7-0001-Improve-performance-of-binary-COPY-FROM-with-buff.patch ]

Pushed with cosmetic changes.

Thanks for that.

I'd always supposed that stdio does enough internal buffering that short
fread()s shouldn't be much worse than memcpy(). But I reproduced your
result of ~30% speedup for data with a lot of narrow text columns, using
RHEL 8.2. Thinking this an indictment of glibc, I also tried it on
current macOS, and saw an even bigger speedup, approaching 50%. So
there's definitely something to this. I wonder if we ought to check
other I/O-constrained users of fread and fwrite, like pg_dump/pg_restore.

Ah, maybe a good idea to check that.

A point that I did not see addressed in the thread is whether this
has any negative impact on the copy-from-frontend code path, where
there's no fread() to avoid; short reads from CopyGetData() are
already going to be satisfied by memcpy'ing from the fe_msgbuf.
However, a quick check suggests that this patch is still a small
win for that case too --- apparently the control overhead in
CopyGetData() is not negligible.

Indeed.

So the patch seems fine functionally, but there were some cosmetic
things I didn't like:

* Removing CopyGetInt32 and CopyGetInt16 seemed like a pretty bad
idea, because it made the callers much uglier and more error-prone.
This is a particularly bad example:

/* Header extension length */
-               if (!CopyGetInt32(cstate, &tmp) ||
-                       tmp < 0)
+               if (CopyReadBinaryData(cstate, (char *) &tmp, sizeof(tmp)) !=
+                       sizeof(tmp) || (tmp = (int32) pg_ntoh32(tmp)) < 0)

Putting side-effects into late stages of an if-condition is just
awful coding practice. They're easy for a reader to miss and they
are magnets for bugs, because of the possibility that control doesn't
reach that part of the condition.

You can get the exact same speedup without any of those disadvantages
by marking these two functions "inline", so that's what I did.

* I dropped the DRAIN_COPY_RAW_BUF macro too, as in my estimation it was
a net negative for readability. With only two use-cases, having it made
the code longer not shorter; I was also pretty unconvinced about the
wisdom of having some of the loop's control logic inside the macro and
some outside.

* BTW, the macro definitions weren't particularly per project style
anyway. We generally put at least one space before line-ending
backslashes. I don't think pgindent will fix this for you; IME
it doesn't touch macro definitions at all.

* Did some more work on the comments.

Thanks for these changes.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com