pg_verify_checksums vs windows

Started by Amit Kapila over 7 years ago, 7 messages
#1Amit Kapila
amit.kapila16@gmail.com
1 attachment(s)

While trying to debug a recent bug report on hash indexes [1], I
noticed that pg_verify_checksums doesn't work on Windows (or at least in
my environment).

initdb -k ..\..\data
pg_verify_checksums.exe ..\..\Data
pg_verify_checksums: short read of block 0 in file
"..\..\Data/global/1136", got only 15 bytes

I debugged it and found that the code below is the culprit.

scan_file(char *fn, int segmentno)
{
    ..
    f = open(fn, 0);
    ..
        int r = read(f, buf, BLCKSZ);

        if (r == 0)
            break;

        if (r != BLCKSZ)
        {
            fprintf(stderr, _("%s: short read of block %d in file \"%s\", got only %d bytes\n"),
                    progname, blockno, fn, r);
            exit(1);
        }
    ..
}

We open the file in text mode (the flags argument is 0, so PG_BINARY is
not set) and try to read BLCKSZ bytes. However, on Windows a Control-Z
character (0x1A) in a file opened in text mode is treated as EOF, so the
read stops short. This problem is mentioned in a comment in c.h:
/*
* NOTE: this is also used for opening text files.
* WIN32 treats Control-Z as EOF in files opened in text mode.
* Therefore, we open files in binary mode on Win32 so we can read
* literal control-Z. The other affect is that we see CRLF, but
* that is OK because we can already handle those cleanly.
*/

So, I think we need to open the file in binary mode as in other parts
of the code. The attached patch fixes the problem for me.
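
A minimal standalone sketch of the same idea, outside the PostgreSQL tree. The names here are assumptions for illustration: OPEN_BINARY is a hypothetical stand-in for PG_BINARY (O_BINARY on Windows, 0 elsewhere), BLOCK_SIZE is assumed to be 8192 to match the default BLCKSZ, and count_blocks() is not a PostgreSQL function.

```c
#include <fcntl.h>
#include <stdio.h>

#ifdef _WIN32
#include <io.h>
/* Hypothetical stand-in for PG_BINARY: request binary mode on Windows. */
#define OPEN_BINARY O_BINARY
#else
#include <unistd.h>
/* No text/binary distinction on POSIX systems. */
#define OPEN_BINARY 0
#endif

/* Assumed block size; PostgreSQL's default BLCKSZ is 8192. */
#define BLOCK_SIZE 8192

/*
 * Read a file block by block in binary mode.
 * Returns the number of full blocks read, or -1 if the file cannot be
 * opened, a read fails, or a read returns fewer than BLOCK_SIZE bytes.
 */
static long
count_blocks(const char *path)
{
    char buf[BLOCK_SIZE];
    long blocks = 0;
    int  fd = open(path, O_RDONLY | OPEN_BINARY);

    if (fd < 0)
        return -1;

    for (;;)
    {
        int r = (int) read(fd, buf, BLOCK_SIZE);

        if (r == 0)
            break;              /* real end of file */
        if (r != BLOCK_SIZE)
        {
            close(fd);
            return -1;          /* read error or short read */
        }
        blocks++;
    }

    close(fd);
    return blocks;
}
```

With the binary flag set, a 0x1A byte inside a block is returned as ordinary data, so a file whose size is a multiple of BLOCK_SIZE should yield its full block count rather than a short read.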

Thoughts?

[1]: /messages/by-id/5d03686d-727c-dbf8-0064-bf8b97ffe850@2ndquadrant.com
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

pg_verify_checksums_1.patch (application/octet-stream)
diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
index 28c975446e..2392381825 100644
--- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
+++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
@@ -83,7 +83,7 @@ scan_file(char *fn, int segmentno)
 	int			f;
 	int			blockno;
 
-	f = open(fn, 0);
+	f = open(fn, O_RDONLY | PG_BINARY);
 	if (f < 0)
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\": %s\n"),
#2Magnus Hagander
magnus@hagander.net
In reply to: Amit Kapila (#1)
Re: pg_verify_checksums vs windows

On Wed, Aug 29, 2018 at 1:31 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> So, I think we need to open the file in binary mode as in other parts
> of the code. The attached patch fixes the problem for me.
>
> Thoughts?

Yikes. Yes, I believe you are correct, and that looks like the correct fix.

I wonder why this was not caught on the buildfarm. We do have regression
tests for it, AFAIK? Or maybe we just lucked out there because there was no
^Z char in the files there?
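
Whether a given file contains a ^Z byte at all can be checked with a small scan. This is only a sketch: first_ctrl_z() is a hypothetical helper, not part of PostgreSQL, and it assumes the file is small enough for its offsets to fit in a long. A result of 15 for global/1136 would be consistent with the "got only 15 bytes" error in the report, assuming no CRLF translation happened before that point.

```c
#include <stdio.h>

/*
 * Return the byte offset of the first Control-Z (0x1A) in the file,
 * -1 if the file contains none, or -2 if the file cannot be opened.
 * The file is opened with "rb" so that Windows does not stop at 0x1A.
 */
static long
first_ctrl_z(const char *path)
{
    FILE *fp = fopen(path, "rb");
    long  offset = 0;
    int   c;

    if (fp == NULL)
        return -2;

    while ((c = getc(fp)) != EOF)
    {
        if (c == 0x1A)
        {
            fclose(fp);
            return offset;
        }
        offset++;
    }

    fclose(fp);
    return -1;
}
```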

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#3Amit Kapila
amit.kapila16@gmail.com
In reply to: Magnus Hagander (#2)
Re: pg_verify_checksums vs windows

On Wed, Aug 29, 2018 at 5:05 PM Magnus Hagander <magnus@hagander.net> wrote:

> On Wed, Aug 29, 2018 at 1:31 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> So, I think we need to open the file in binary mode as in other parts
>> of the code. The attached patch fixes the problem for me.
>>
>> Thoughts?
>
> Yikes. Yes, I believe you are correct, and that looks like the correct fix.
>
> I wonder why this was not caught on the buildfarm. We do have regression tests for it, AFAIK?

I am not able to find regression tests for it, but maybe I am not
seeing it properly. By any chance, did you remove it during the revert
of "Allow on-line enabling and disabling of data checksums"?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#4Magnus Hagander
magnus@hagander.net
In reply to: Amit Kapila (#3)
Re: pg_verify_checksums vs windows

On Wed, Aug 29, 2018 at 1:44 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Wed, Aug 29, 2018 at 5:05 PM Magnus Hagander <magnus@hagander.net> wrote:
>> I wonder why this was not caught on the buildfarm. We do have regression
>> tests for it, AFAIK?
>
> I am not able to find regression tests for it, but maybe I am not
> seeing it properly. By any chance, did you remove it during the revert
> of "Allow on-line enabling and disabling of data checksums"?

Oh meh. You are right, it's in the reverted patch, I was looking in the
wrong branch :/ Sorry about that. And that certainly explains why we don't
have it.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#5Amit Kapila
amit.kapila16@gmail.com
In reply to: Magnus Hagander (#4)
Re: pg_verify_checksums vs windows

On Wed, Aug 29, 2018 at 5:17 PM Magnus Hagander <magnus@hagander.net> wrote:

> On Wed, Aug 29, 2018 at 1:44 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> I am not able to find regression tests for it, but maybe I am not
>> seeing it properly. By any chance, did you remove it during the revert
>> of "Allow on-line enabling and disabling of data checksums"?
>
> Oh meh. You are right, it's in the reverted patch, I was looking in the wrong branch :/ Sorry about that. And that certainly explains why we don't have it.

Okay. I will commit this in a day or so, after verifying it on
PG11 as well. I think this needs to be backpatched; let me know if
you think otherwise.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#6Magnus Hagander
magnus@hagander.net
In reply to: Amit Kapila (#5)
Re: pg_verify_checksums vs windows

On Thu, Aug 30, 2018 at 1:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> Okay. I will commit this in a day or so, after verifying it on
> PG11 as well. I think this needs to be backpatched; let me know if
> you think otherwise.

Definitely a bug so yes, it needs backpatching.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#7Amit Kapila
amit.kapila16@gmail.com
In reply to: Magnus Hagander (#6)
Re: pg_verify_checksums vs windows

On Thu, Aug 30, 2018 at 5:04 PM Magnus Hagander <magnus@hagander.net> wrote:

> On Thu, Aug 30, 2018 at 1:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Okay. I will commit this in a day or so, after verifying it on
>> PG11 as well. I think this needs to be backpatched; let me know if
>> you think otherwise.
>
> Definitely a bug so yes, it needs backpatching.

Okay, pushed!

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com