Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Started by Ranier Vilelaover 5 years ago12 messages

ranier.vf@gmail.com

over 5 years ago

1 attachment(s)

Hi,

read_binary_file seems a bit complicated when we want to read the rest of
the file (-1 for bytes_to_read).
This version, it seems, has no opposite effects, someone can confirm?

regards,
Ranier Vilela

Attachments:

v1-0001-simplified_read_binary_file.patchapplication/octet-stream; name=v1-0001-simplified_read_binary_file.patchDownload

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..9c06aadb38 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -106,14 +106,11 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 				 bool missing_ok)
 {
 	bytea	   *buf;
-	size_t		nbytes = 0;
 	FILE	   *file;
+	size_t		nbytes;
 
-	/* clamp request size to what we can actually deliver */
-	if (bytes_to_read > (int64) (MaxAllocSize - VARHDRSZ))
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-				 errmsg("requested length too large")));
+    /* Read zero byte? */
+	Assert(bytes_to_read != 0);
 
 	if ((file = AllocateFile(filename, PG_BINARY_R)) == NULL)
 	{
@@ -126,83 +123,38 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 							filename)));
 	}
 
-	if (fseeko(file, (off_t) seek_offset,
-			   (seek_offset >= 0) ? SEEK_SET : SEEK_END) != 0)
+    /* Avoid syscall fseeko if seek_offset is zero */
+	if (seek_offset != 0 &&
+	    fseeko(file, (off_t) seek_offset,
+			   (seek_offset > 0) ? SEEK_SET : SEEK_END) != 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not seek in file \"%s\": %m", filename)));
 
+    /* If passed explicit read size just do it */
 	if (bytes_to_read >= 0)
-	{
-		/* If passed explicit read size just do it */
-		buf = (bytea *) palloc((Size) bytes_to_read + VARHDRSZ);
-
-		nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
-	}
+	    nbytes = bytes_to_read;
 	else
-	{
-		/* Negative read size, read rest of file */
-		StringInfoData sbuf;
+        nbytes = ftello(file);
 
-		initStringInfo(&sbuf);
-		/* Leave room in the buffer for the varlena length word */
-		sbuf.len += VARHDRSZ;
-		Assert(sbuf.len < sbuf.maxlen);
-
-		while (!(feof(file) || ferror(file)))
-		{
-			size_t		rbytes;
-
-			/* Minimum amount to read at a time */
-#define MIN_READ_SIZE 4096
-
-			/*
-			 * If not at end of file, and sbuf.len is equal to
-			 * MaxAllocSize - 1, then either the file is too large, or
-			 * there is nothing left to read. Attempt to read one more
-			 * byte to see if the end of file has been reached. If not,
-			 * the file is too large; we'd rather give the error message
-			 * for that ourselves.
-			 */
-			if (sbuf.len == MaxAllocSize - 1)
-			{
-				char	rbuf[1];
-
-				if (fread(rbuf, 1, 1, file) != 0 || !feof(file))
-					ereport(ERROR,
-							(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-							 errmsg("file length too large")));
-				else
-					break;
-			}
-
-			/* OK, ensure that we can read at least MIN_READ_SIZE */
-			enlargeStringInfo(&sbuf, MIN_READ_SIZE);
-
-			/*
-			 * stringinfo.c likes to allocate in powers of 2, so it's likely
-			 * that much more space is available than we asked for.  Use all
-			 * of it, rather than making more fread calls than necessary.
-			 */
-			rbytes = fread(sbuf.data + sbuf.len, 1,
-						   (size_t) (sbuf.maxlen - sbuf.len - 1), file);
-			sbuf.len += rbytes;
-			nbytes += rbytes;
-		}
+	/* clamp request size to what we can actually deliver */
+	if (nbytes > ((int64) MaxAllocSize - VARHDRSZ))
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("requested length too large")));
 
-		/* Now we can commandeer the stringinfo's buffer as the result */
-		buf = (bytea *) sbuf.data;
-	}
+	buf = (bytea *) palloc((Size) nbytes + VARHDRSZ);
+	nbytes = fread(VARDATA(buf), 1, nbytes, file);
 
 	if (ferror(file))
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not read file \"%s\": %m", filename)));
 
-	SET_VARSIZE(buf, nbytes + VARHDRSZ);
-
 	FreeFile(file);
 
+	SET_VARSIZE(buf, nbytes + VARHDRSZ);
+
 	return buf;
 }

Alvaro Herrera

alvherre@2ndquadrant.com

over 5 years ago

In reply to: Ranier Vilela (#1)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

On 2020-Sep-11, Ranier Vilela wrote:

Hi,

read_binary_file seems a bit complicated when we want to read the rest of
the file (-1 for bytes_to_read).

This code was very recently rewritten in 96d1f423f95d, and I doubt that
taking out half the algorithm without studying how it got that way is a
great idea.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Ranier Vilela

ranier.vf@gmail.com

over 5 years ago

In reply to: Alvaro Herrera (#2)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Em sex., 11 de set. de 2020 às 14:01, Alvaro Herrera <
alvherre@2ndquadrant.com> escreveu:

On 2020-Sep-11, Ranier Vilela wrote:

Hi,

read_binary_file seems a bit complicated when we want to read the rest of
the file (-1 for bytes_to_read).

This code was very recently rewritten in 96d1f423f95d, and I doubt that
taking out half the algorithm without studying how it got that way is a
great idea.

Works with all regress tests (199, same as the HEAD with msvc 2019 64 bits
except partition_prune)
Works with pgbench (pgbench -U postgres -c 50 -j 2 -t 10000 example)
And works with local installation, without adverse effects, for now.

regards,
Ranier Vilela

Andres Freund

andres@anarazel.de

over 5 years ago

In reply to: Ranier Vilela (#3)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

On 2020-09-11 14:10:31 -0300, Ranier Vilela wrote:

Em sex., 11 de set. de 2020 ï¿½s 14:01, Alvaro Herrera <
alvherre@2ndquadrant.com> escreveu:

On 2020-Sep-11, Ranier Vilela wrote:
This code was very recently rewritten in 96d1f423f95d, and I doubt that
taking out half the algorithm without studying how it got that way is a
great idea.

Works with all regress tests (199, same as the HEAD with msvc 2019 64 bits
except partition_prune)
Works with pgbench (pgbench -U postgres -c 50 -j 2 -t 10000 example)
And works with local installation, without adverse effects, for now.

Have your read the commit message for 96d1f423f95d ?

Ranier Vilela

ranier.vf@gmail.com

over 5 years ago

In reply to: Andres Freund (#4)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Em sex., 11 de set. de 2020 às 15:09, Andres Freund <andres@anarazel.de>
escreveu:

On 2020-09-11 14:10:31 -0300, Ranier Vilela wrote:

Em sex., 11 de set. de 2020 às 14:01, Alvaro Herrera <
alvherre@2ndquadrant.com> escreveu:

On 2020-Sep-11, Ranier Vilela wrote:
This code was very recently rewritten in 96d1f423f95d, and I doubt that
taking out half the algorithm without studying how it got that way is a
great idea.

Works with all regress tests (199, same as the HEAD with msvc 2019 64

bits

except partition_prune)
Works with pgbench (pgbench -U postgres -c 50 -j 2 -t 10000 example)
And works with local installation, without adverse effects, for now.

Have your read the commit message for 96d1f423f95d ?

Yead, I read.
He's concerned about virtual file (pipe, FIFO, socket).
The patch ( 96d1f423f95d ) has the same problem.
fseeko with virtual file can fail too.

https://man7.org/linux/man-pages/man3/fseek.3.html
ESPIPE The file descriptor underlying stream is not seekable (e.g.,
it refers to a pipe, FIFO, or socket)

Call read_binary_file with virtual file,will log in:
"could not seek in file"

regards,
Ranier Vilela

Ranier Vilela

ranier.vf@gmail.com

over 5 years ago

In reply to: Ranier Vilela (#5)

1 attachment(s)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

New version, with support to read Virtual File (pipe, FIFO and socket).
With assert, in case, erroneous, of trying to read a pipe, with offset.

regards,
Ranier Vilela

Attachments:

v2-0001-simplified_read_binary_file.patchapplication/octet-stream; name=v2-0001-simplified_read_binary_file.patchDownload

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..8a5d3acc5b 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -106,11 +106,14 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 				 bool missing_ok)
 {
 	bytea	   *buf;
-	size_t		nbytes = 0;
 	FILE	   *file;
+	size_t		nbytes;
+
+    /* Read zero byte? */
+	Assert(bytes_to_read != 0);
 
 	/* clamp request size to what we can actually deliver */
-	if (bytes_to_read > (int64) (MaxAllocSize - VARHDRSZ))
+	if (bytes_to_read > ((int64) MaxAllocSize - VARHDRSZ))
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("requested length too large")));
@@ -126,72 +129,50 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 							filename)));
 	}
 
-	if (fseeko(file, (off_t) seek_offset,
-			   (seek_offset >= 0) ? SEEK_SET : SEEK_END) != 0)
-		ereport(ERROR,
-				(errcode_for_file_access(),
-				 errmsg("could not seek in file \"%s\": %m", filename)));
-
+    /* If passed explicit read size just do it */
 	if (bytes_to_read >= 0)
 	{
-		/* If passed explicit read size just do it */
-		buf = (bytea *) palloc((Size) bytes_to_read + VARHDRSZ);
+	    /* Virtual File? Can not be seekable. */
+	    Assert(seek_offset != 0);
 
-		nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
+		/* Avoid syscall fseeko if seek_offset is zero */
+		if (seek_offset != 0 &&
+			fseeko(file, (off_t) seek_offset,
+			  	   (seek_offset > 0) ? SEEK_SET : SEEK_END) != 0)
+			ereport(ERROR,
+					(errcode_for_file_access(),
+					errmsg("could not seek in file \"%s\": %m", filename)));
+
+		buf = (bytea *) palloc((Size) bytes_to_read + VARHDRSZ);
+	    nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
 	}
-	else
+    else
 	{
-		/* Negative read size, read rest of file */
 		StringInfoData sbuf;
+		size_t rbytes;
 
 		initStringInfo(&sbuf);
+
 		/* Leave room in the buffer for the varlena length word */
 		sbuf.len += VARHDRSZ;
 		Assert(sbuf.len < sbuf.maxlen);
 
-		while (!(feof(file) || ferror(file)))
-		{
-			size_t		rbytes;
-
-			/* Minimum amount to read at a time */
 #define MIN_READ_SIZE 4096
-
-			/*
-			 * If not at end of file, and sbuf.len is equal to
-			 * MaxAllocSize - 1, then either the file is too large, or
-			 * there is nothing left to read. Attempt to read one more
-			 * byte to see if the end of file has been reached. If not,
-			 * the file is too large; we'd rather give the error message
-			 * for that ourselves.
-			 */
-			if (sbuf.len == MaxAllocSize - 1)
-			{
-				char	rbuf[1];
-
-				if (fread(rbuf, 1, 1, file) != 0 || !feof(file))
-					ereport(ERROR,
-							(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-							 errmsg("file length too large")));
-				else
-					break;
-			}
-
-			/* OK, ensure that we can read at least MIN_READ_SIZE */
-			enlargeStringInfo(&sbuf, MIN_READ_SIZE);
-
-			/*
-			 * stringinfo.c likes to allocate in powers of 2, so it's likely
-			 * that much more space is available than we asked for.  Use all
-			 * of it, rather than making more fread calls than necessary.
-			 */
-			rbytes = fread(sbuf.data + sbuf.len, 1,
-						   (size_t) (sbuf.maxlen - sbuf.len - 1), file);
+        nbytes = 0;
+        do {
+            enlargeStringInfo(&sbuf, MIN_READ_SIZE);
+		    rbytes = read(file, sbuf.data + sbuf.len, (size_t) (sbuf.maxlen - sbuf.len - 1));
 			sbuf.len += rbytes;
 			nbytes += rbytes;
-		}
+		} while(rbytes > 0 && sbuf.len < (int64) (MaxAllocSize - VARHDRSZ));
+
+	    if (sbuf.len > ((int64) MaxAllocSize - VARHDRSZ))
+			ereport(ERROR,
+					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+					 errmsg("file length too large")));
 
 		/* Now we can commandeer the stringinfo's buffer as the result */
-		buf = (bytea *) sbuf.data;
+		buf = (bytea *) sbuf.data;
 	}
 
 	if (ferror(file))
@@ -199,10 +180,10 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 				(errcode_for_file_access(),
 				 errmsg("could not read file \"%s\": %m", filename)));
 
-	SET_VARSIZE(buf, nbytes + VARHDRSZ);
-
 	FreeFile(file);
 
+	SET_VARSIZE(buf, nbytes + VARHDRSZ);
+
 	return buf;
 }

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Ranier Vilela (#6)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Ranier Vilela <ranier.vf@gmail.com> writes:

New version, with support to read Virtual File (pipe, FIFO and socket).
With assert, in case, erroneous, of trying to read a pipe, with offset.

Really, could you do a little more thinking and testing of your own,
rather than expecting the rest of us to point out holes in your thinking?

* bytes_to_read == 0 is a legal case.

* "Assert(seek_offset != 0)" is an equally awful idea.

* Removing the seek code from the negative-bytes_to_read path
is just broken.

* The only reason this code is shorter than the previous version is
you took out all the comments explaining what it was doing, and
failed to replace them. This is just as subtle as before, so I
don't find that acceptable.

I do agree that it might be worth skipping the fseeko call in the
probably-common case where seek_offset is zero. Otherwise I don't
see much reason to change what's there.

regards, tom lane

Ranier Vilela

ranier.vf@gmail.com

over 5 years ago

In reply to: Tom Lane (#7)

1 attachment(s)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Em sex., 11 de set. de 2020 às 17:44, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Ranier Vilela <ranier.vf@gmail.com> writes:

New version, with support to read Virtual File (pipe, FIFO and socket).
With assert, in case, erroneous, of trying to read a pipe, with offset.

Really, could you do a little more thinking and testing of your own,
rather than expecting the rest of us to point out holes in your thinking?

Yes, I can.

* bytes_to_read == 0 is a legal case.

Ok. Strange call to read a file with zero bytes.

* "Assert(seek_offset != 0)" is an equally awful idea.

I was trying to predict the case of reading a Virtual File, with
bytes_to_read == -1 and seek_offset == 0,
which is bound to fail in fseeko.
In any case, 96d1f423f95d it will fail with any Virtual File.

* Removing the seek code from the negative-bytes_to_read path
is just broken.

Ok.

* The only reason this code is shorter than the previous version is
you took out all the comments explaining what it was doing, and
failed to replace them. This is just as subtle as before, so I
don't find that acceptable.

It was not my intention.

I do agree that it might be worth skipping the fseeko call in the
probably-common case where seek_offset is zero. Otherwise I don't
see much reason to change what's there.

Well, I think that v3 is a little better, but it’s just my opinion.
v3 try to copy directly in the final buffer, which will certainly be a
little faster.

regards,
Ranier Vilela

Attachments:

v3-0001-simplified_read_binary_file.patchapplication/octet-stream; name=v3-0001-simplified_read_binary_file.patchDownload

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..57f695ca51 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -106,8 +106,8 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 				 bool missing_ok)
 {
 	bytea	   *buf;
-	size_t		nbytes = 0;
 	FILE	   *file;
+	size_t		nbytes;
 
 	/* clamp request size to what we can actually deliver */
 	if (bytes_to_read > (int64) (MaxAllocSize - VARHDRSZ))
@@ -126,69 +126,63 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 							filename)));
 	}
 
-	if (fseeko(file, (off_t) seek_offset,
-			   (seek_offset >= 0) ? SEEK_SET : SEEK_END) != 0)
+	/* 
+     * Virtual File? seek_offset must be zero, otherwise fseeko will fail.
+	 * Avoid syscall fseeko if seek_offset is zero.
+	 */
+	if (seek_offset != 0 &&
+		fseeko(file, (off_t) seek_offset,
+		  	   (seek_offset > 0) ? SEEK_SET : SEEK_END) != 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not seek in file \"%s\": %m", filename)));
 
 	if (bytes_to_read > 0)
 	{
-		/* If passed explicit read size just do it */
+        /* If passed explicit read size just do it */
 		buf = (bytea *) palloc((Size) bytes_to_read + VARHDRSZ);
 
-		nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
+		/* Bypass fread buffer: less copies. */
+		setvbuf(file, VARDATA(buf), _IOFBF, (size_t) bytes_to_read);
+
+	    nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
 	}
-	else
+    else
 	{
-		/* Negative read size, read rest of file */
 		StringInfoData sbuf;
+		ssize_t rbytes;
+		int fd;
 
 		initStringInfo(&sbuf);
+
 		/* Leave room in the buffer for the varlena length word */
 		sbuf.len += VARHDRSZ;
 		Assert(sbuf.len < sbuf.maxlen);
 
-		while (!(feof(file) || ferror(file)))
-		{
-			size_t		rbytes;
-
-			/* Minimum amount to read at a time */
+       /* Minimum amount to read at a time */
 #define MIN_READ_SIZE 4096
 
-			/*
-			 * If not at end of file, and sbuf.len is equal to
-			 * MaxAllocSize - 1, then either the file is too large, or
-			 * there is nothing left to read. Attempt to read one more
-			 * byte to see if the end of file has been reached. If not,
-			 * the file is too large; we'd rather give the error message
-			 * for that ourselves.
-			 */
-			if (sbuf.len == MaxAllocSize - 1)
-			{
-				char	rbuf[1];
-
-				if (fread(rbuf, 1, 1, file) != 0 || !feof(file))
-					ereport(ERROR,
-							(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-							 errmsg("file length too large")));
-				else
-					break;
-			}
-
-			/* OK, ensure that we can read at least MIN_READ_SIZE */
+        fd = fileno(file);
+        nbytes = 0;
+        do {
+            /* OK, ensure that we can read at least MIN_READ_SIZE */
 			enlargeStringInfo(&sbuf, MIN_READ_SIZE);
 
 			/*
 			 * stringinfo.c likes to allocate in powers of 2, so it's likely
 			 * that much more space is available than we asked for.  Use all
-			 * of it, rather than making more fread calls than necessary.
+			 * of it, rather than making more read calls than necessary.
 			 */
-			rbytes = fread(sbuf.data + sbuf.len, 1,
-						   (size_t) (sbuf.maxlen - sbuf.len - 1), file);
+		    rbytes = read(fd, sbuf.data + sbuf.len, (size_t) (sbuf.maxlen - sbuf.len - 1));
+
 			sbuf.len += rbytes;
 			nbytes += rbytes;
-		}
+		} while(rbytes > 0 && sbuf.len < (int64) (MaxAllocSize - VARHDRSZ));
+
+	    if (sbuf.len > ((int64) MaxAllocSize - VARHDRSZ))
+			ereport(ERROR,
+					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+					 errmsg("file length too large")));
 
 		/* Now we can commandeer the stringinfo's buffer as the result */
 		buf = (bytea *) sbuf.data;
@@ -199,10 +193,10 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 				(errcode_for_file_access(),
 				 errmsg("could not read file \"%s\": %m", filename)));
 
-	SET_VARSIZE(buf, nbytes + VARHDRSZ);
-
 	FreeFile(file);
 
+	SET_VARSIZE(buf, nbytes + VARHDRSZ);
+
 	return buf;
 }

Ranier Vilela

ranier.vf@gmail.com

over 5 years ago

In reply to: Tom Lane (#7)

1 attachment(s)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Em sex., 11 de set. de 2020 às 17:44, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Ranier Vilela <ranier.vf@gmail.com> writes:

New version, with support to read Virtual File (pipe, FIFO and socket).
With assert, in case, erroneous, of trying to read a pipe, with offset.

I do agree that it might be worth skipping the fseeko call in the
probably-common case where seek_offset is zero. Otherwise I don't
see much reason to change what's there.

Tom, if you think it's worth it, I agree with only avoiding syscal fseeko.

regards,
Ranier Vilela

Attachments:

v1-0001-avoid_syscall_fseeko_read_binary_file.patchapplication/octet-stream; name=v1-0001-avoid_syscall_fseeko_read_binary_file.patchDownload

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..7ea7ca21b0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -126,8 +126,10 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 							filename)));
 	}
 
-	if (fseeko(file, (off_t) seek_offset,
-			   (seek_offset >= 0) ? SEEK_SET : SEEK_END) != 0)
+    /* Avoid syscall fseeko, if seek_offset is zero */
+	if (seek_offset != 0 &&
+		fseeko(file, (off_t) seek_offset,
+			   (seek_offset > 0) ? SEEK_SET : SEEK_END) != 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not seek in file \"%s\": %m", filename)));

#10

Ranier Vilela

ranier.vf@gmail.com

over 5 years ago

In reply to: Ranier Vilela (#8)

1 attachment(s)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Em sex., 11 de set. de 2020 às 18:43, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em sex., 11 de set. de 2020 às 17:44, Tom Lane <tgl@sss.pgh.pa.us>
escreveu:

Ranier Vilela <ranier.vf@gmail.com> writes:

New version, with support to read Virtual File (pipe, FIFO and socket).
With assert, in case, erroneous, of trying to read a pipe, with offset.

Really, could you do a little more thinking and testing of your own,
rather than expecting the rest of us to point out holes in your thinking?

Yes, I can.

* bytes_to_read == 0 is a legal case.

Ok. Strange call to read a file with zero bytes.

* "Assert(seek_offset != 0)" is an equally awful idea.

I was trying to predict the case of reading a Virtual File, with
bytes_to_read == -1 and seek_offset == 0,
which is bound to fail in fseeko.
In any case, 96d1f423f95d it will fail with any Virtual File.

* Removing the seek code from the negative-bytes_to_read path
is just broken.

Ok.

* The only reason this code is shorter than the previous version is
you took out all the comments explaining what it was doing, and
failed to replace them. This is just as subtle as before, so I
don't find that acceptable.

It was not my intention.

I do agree that it might be worth skipping the fseeko call in the
probably-common case where seek_offset is zero. Otherwise I don't
see much reason to change what's there.

Well, I think that v3 is a little better, but it’s just my opinion.
v3 try to copy directly in the final buffer, which will certainly be a
little faster.

v4 patch, simple benchmark, read binary file with size > 6GB.

PostgreSQL main:
postgres=# \timing on
Timing is on.
postgres=# select
pg_read_file('c:\users\ranier\downloads\macOS_Catalina.7z');
ERROR: file length too large
Time: 3068,459 ms (00:03,068)
postgres=#

PostgreSQL patched (v4 read_binary_file):
postgres=# \timing on
Timing is on.
postgres=# select
pg_read_file('c:\users\ranier\downloads\macOS_Catalina.7z');
ERROR: file length too large
Time: 701,035 ms
postgres=#

4.37x faster, very good.

v4 handles pipes very well.
Tested with
https://docs.microsoft.com/en-us/windows/win32/ipc/multithreaded-pipe-server

Terminal one:
C:\usr\src\tests\pipes>pipe_server

Pipe Server: Main thread awaiting client connection on \\.\pipe\mynamedpipe

Terminal two:
postgres=# \timing on
Timing is on.
postgres=# select pg_read_file('\\.\pipe\mynamedpipe');

Terminal one:
Client connected, creating a processing thread.

Pipe Server: Main thread awaiting client connection on \\.\pipe\mynamedpipe
InstanceThread created, receiving and processing messages.

Pipe Server: Main thread awaiting client connection on \\.\pipe\mynamedpipe
InstanceThread created, receiving and processing messages.
^C
C:\usr\src\tests\pipes>

Terminal two:
postgres=# select pg_read_file('\\.\pipe\mynamedpipe');
pg_read_file
--------------

(1 row)

Time: 77267,481 ms (01:17,267)
postgres=#

regards,
Ranier Vilela

Attachments:

v4-0001-simplified_read_binary_file.patchapplication/octet-stream; name=v4-0001-simplified_read_binary_file.patchDownload

diff --git a/src/backend/utils/adt/genfile.c b/src/backend/utils/adt/genfile.c
index d34182a7b0..250e55ffe0 100644
--- a/src/backend/utils/adt/genfile.c
+++ b/src/backend/utils/adt/genfile.c
@@ -106,11 +106,11 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 				 bool missing_ok)
 {
 	bytea	   *buf;
-	size_t		nbytes = 0;
 	FILE	   *file;
+	size_t		nbytes = 0;
 
 	/* clamp request size to what we can actually deliver */
-	if (bytes_to_read > (int64) (MaxAllocSize - VARHDRSZ))
+	if (bytes_to_read > ((int64) MaxAllocSize - VARHDRSZ))
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("requested length too large")));
@@ -126,69 +126,60 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 							filename)));
 	}
 
-	if (fseeko(file, (off_t) seek_offset,
-			   (seek_offset >= 0) ? SEEK_SET : SEEK_END) != 0)
+    /* Avoid syscall fseeko, if seek_offset is zero */
+	if (seek_offset != 0 &&
+		fseeko(file, (off_t) seek_offset,
+			   (seek_offset > 0) ? SEEK_SET : SEEK_END) != 0)
 		ereport(ERROR,
 				(errcode_for_file_access(),
 				 errmsg("could not seek in file \"%s\": %m", filename)));
 
-	if (bytes_to_read >= 0)
+	if (bytes_to_read > 0)
 	{
 		/* If passed explicit read size just do it */
 		buf = (bytea *) palloc((Size) bytes_to_read + VARHDRSZ);
 
+		/* Bypass fread buffer: less copies. */
+		setvbuf(file, VARDATA(buf), _IOFBF, (size_t) bytes_to_read);
+
 		nbytes = fread(VARDATA(buf), 1, (size_t) bytes_to_read, file);
 	}
 	else
 	{
 		/* Negative read size, read rest of file */
-		StringInfoData sbuf;
+		StringInfoData	sbuf;
+		size_t			rbytes;
+		int				fd;
 
 		initStringInfo(&sbuf);
 		/* Leave room in the buffer for the varlena length word */
 		sbuf.len += VARHDRSZ;
 		Assert(sbuf.len < sbuf.maxlen);
 
-		while (!(feof(file) || ferror(file)))
-		{
-			size_t		rbytes;
-
-			/* Minimum amount to read at a time */
+		/* Minimum amount to read at a time */
 #define MIN_READ_SIZE 4096
 
-			/*
-			 * If not at end of file, and sbuf.len is equal to
-			 * MaxAllocSize - 1, then either the file is too large, or
-			 * there is nothing left to read. Attempt to read one more
-			 * byte to see if the end of file has been reached. If not,
-			 * the file is too large; we'd rather give the error message
-			 * for that ourselves.
-			 */
-			if (sbuf.len == MaxAllocSize - 1)
-			{
-				char	rbuf[1];
-
-				if (fread(rbuf, 1, 1, file) != 0 || !feof(file))
-					ereport(ERROR,
-							(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
-							 errmsg("file length too large")));
-				else
-					break;
-			}
-
-			/* OK, ensure that we can read at least MIN_READ_SIZE */
-			enlargeStringInfo(&sbuf, MIN_READ_SIZE);
-
-			/*
-			 * stringinfo.c likes to allocate in powers of 2, so it's likely
-			 * that much more space is available than we asked for.  Use all
-			 * of it, rather than making more fread calls than necessary.
-			 */
-			rbytes = fread(sbuf.data + sbuf.len, 1,
-						   (size_t) (sbuf.maxlen - sbuf.len - 1), file);
-			sbuf.len += rbytes;
-			nbytes += rbytes;
-		}
+        fd = fileno(file);
+        nbytes = 0;
+        do {
+            /* OK, ensure that we can read at least MIN_READ_SIZE */
+ 			enlargeStringInfo(&sbuf, MIN_READ_SIZE);
+ 
+ 			/*
+ 			 * stringinfo.c likes to allocate in powers of 2, so it's likely
+ 			 * that much more space is available than we asked for.  Use all
+			 * of it, rather than making more read calls than necessary.
+ 			 */
+		    rbytes = read(fd, sbuf.data + sbuf.len, (size_t) (sbuf.maxlen - sbuf.len - 1));
+
+ 			sbuf.len += rbytes;
+ 			nbytes += rbytes;
+		} while(rbytes != 0 && sbuf.len < ((int64) MaxAllocSize - VARHDRSZ));
+
+	    if (sbuf.len > ((int64) MaxAllocSize - VARHDRSZ))
+			ereport(ERROR,
+					(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+					 errmsg("file length too large")));
 
 		/* Now we can commandeer the stringinfo's buffer as the result */
 		buf = (bytea *) sbuf.data;
@@ -199,10 +190,10 @@ read_binary_file(const char *filename, int64 seek_offset, int64 bytes_to_read,
 				(errcode_for_file_access(),
 				 errmsg("could not read file \"%s\": %m", filename)));
 
-	SET_VARSIZE(buf, nbytes + VARHDRSZ);
-
 	FreeFile(file);
 
+	SET_VARSIZE(buf, nbytes + VARHDRSZ);
+
 	return buf;
 }

#11

Alvaro Herrera

alvherre@2ndquadrant.com

over 5 years ago

In reply to: Ranier Vilela (#10)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

I think you meant _IONBF instead of _IOFBF -- otherwise it's at odds
with the comment you add. But what is the justification for that
addition? I don't see us doing that anywhere else.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12

Ranier Vilela

ranier.vf@gmail.com

over 5 years ago

In reply to: Alvaro Herrera (#11)

Re: Simplified version of read_binary_file (src/backend/utils/adt/genfile.c)

Em ter., 15 de set. de 2020 às 14:54, Alvaro Herrera <
alvherre@2ndquadrant.com> escreveu:

I think you meant _IONBF instead of _IOFBF -- otherwise it's at odds
with the comment you add. But what is the justification for that
addition? I don't see us doing that anywhere else.

No.
_IOFBF *Full buffering:* On output, data is written once the buffer is full
(or flushed <http://www.cplusplus.com/fflush>). On Input, the buffer is
filled when an input operation is requested and the buffer is empty.
* _IONBF No buffering:* No buffer is used. Each I/O operation is written as
soon as possible. In this case, the *buffer* and *size* parameters are
ignored.
_IONBF ignores buffer and size.

Without setvbuf, fread uses an internal buffer, default 4096 bytes (OS
dependent).
If fread uses an internal buffer, then it needs a copy to the buffer
provided by the function.
setvbuf, does the same as read function low level, copies directly to the
final buffer.

regards,
Ranier Vilela