pg_basebackup is not checking IDENTIFY_SYSTEM numbre of columns

Started by Jaime Casanovaabout 14 years ago4 messages
#1Jaime Casanova
jaime@2ndquadrant.com

Hi,

I just notice $SUBJECT and this could lead us to a segmentation fault
if by accident we point to a system with a different number of columns
in IDENTIFY_SYSTEM, at least i point pg_receivexlog from current head
to a 9.0 instalation and got that.

Any reason for not checking number of columns?

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

#2Magnus Hagander
magnus@hagander.net
In reply to: Jaime Casanova (#1)
Re: pg_basebackup is not checking IDENTIFY_SYSTEM numbre of columns

On Wed, Jan 11, 2012 at 22:31, Jaime Casanova <jaime@2ndquadrant.com> wrote:

Hi,

I just notice $SUBJECT and this could lead us to a segmentation fault
if by accident we point to a system with a different number of columns
in IDENTIFY_SYSTEM, at least i point pg_receivexlog from current head
to a 9.0 instalation and got that.

Any reason for not checking number of columns?

Hmm. I could've bet it did that. I must've taken that code out at some
point during refactoring.

No, no reason. Adding such a check would be a good idea.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#3Jaime Casanova
jaime@2ndquadrant.com
In reply to: Magnus Hagander (#2)
1 attachment(s)
Re: pg_basebackup is not checking IDENTIFY_SYSTEM numbre of columns

On Wed, Jan 11, 2012 at 5:11 PM, Magnus Hagander <magnus@hagander.net> wrote:

No, no reason. Adding such a check would be a good idea.

ok. patch attached, it also adds a few PQclear() calls before
disconnect_and_exit().
btw, in BaseBackup() in line 1149 (after the patch is applied) there
is an exit instead of disconnect_and_exit() and that is probably a
typo too

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

Attachments:

basebackup_nfields.patchtext/x-patch; charset=US-ASCII; name=basebackup_nfields.patchDownload
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
new file mode 100644
index 4007680..22d906a
*** a/src/bin/pg_basebackup/pg_basebackup.c
--- b/src/bin/pg_basebackup/pg_basebackup.c
*************** BaseBackup(void)
*** 916,927 ****
  	{
  		fprintf(stderr, _("%s: could not identify system: %s\n"),
  				progname, PQerrorMessage(conn));
  		disconnect_and_exit(1);
  	}
! 	if (PQntuples(res) != 1)
  	{
! 		fprintf(stderr, _("%s: could not identify system, got %i rows\n"),
! 				progname, PQntuples(res));
  		disconnect_and_exit(1);
  	}
  	sysidentifier = strdup(PQgetvalue(res, 0, 0));
--- 916,929 ----
  	{
  		fprintf(stderr, _("%s: could not identify system: %s\n"),
  				progname, PQerrorMessage(conn));
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
! 	if (PQntuples(res) != 1 || PQnfields(res) != 3)
  	{
! 		fprintf(stderr, _("%s: could not identify system, got %i rows and %i fields\n"),
! 				progname, PQntuples(res), PQnfields(res));
! 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	sysidentifier = strdup(PQgetvalue(res, 0, 0));
*************** BaseBackup(void)
*** 954,965 ****
--- 956,969 ----
  	{
  		fprintf(stderr, _("%s: could not initiate base backup: %s"),
  				progname, PQerrorMessage(conn));
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	if (PQntuples(res) != 1)
  	{
  		fprintf(stderr, _("%s: no start point returned from server\n"),
  				progname);
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	strcpy(xlogstart, PQgetvalue(res, 0, 0));
*************** BaseBackup(void)
*** 976,986 ****
--- 980,992 ----
  	{
  		fprintf(stderr, _("%s: could not get backup header: %s"),
  				progname, PQerrorMessage(conn));
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	if (PQntuples(res) < 1)
  	{
  		fprintf(stderr, _("%s: no data returned from server\n"), progname);
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  
*************** BaseBackup(void)
*** 1010,1015 ****
--- 1016,1022 ----
  	{
  		fprintf(stderr, _("%s: can only write single tablespace to stdout, database has %d\n"),
  				progname, PQntuples(res));
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  
*************** BaseBackup(void)
*** 1051,1062 ****
--- 1058,1071 ----
  	{
  		fprintf(stderr, _("%s: could not get WAL end position from server\n"),
  				progname);
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	if (PQntuples(res) != 1)
  	{
  		fprintf(stderr, _("%s: no WAL end position returned from server\n"),
  				progname);
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	strcpy(xlogend, PQgetvalue(res, 0, 0));
*************** BaseBackup(void)
*** 1069,1074 ****
--- 1078,1084 ----
  	{
  		fprintf(stderr, _("%s: final receive failed: %s"),
  				progname, PQerrorMessage(conn));
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  
*************** BaseBackup(void)
*** 1089,1094 ****
--- 1099,1105 ----
  		{
  			fprintf(stderr, _("%s: could not send command to background pipe: %s\n"),
  					progname, strerror(errno));
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  
*************** BaseBackup(void)
*** 1098,1121 ****
--- 1109,1136 ----
  		{
  			fprintf(stderr, _("%s: could not wait for child process: %s\n"),
  					progname, strerror(errno));
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  		if (r != bgchild)
  		{
  			fprintf(stderr, _("%s: child %i died, expected %i\n"),
  					progname, r, bgchild);
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  		if (!WIFEXITED(status))
  		{
  			fprintf(stderr, _("%s: child process did not exit normally\n"),
  					progname);
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  		if (WEXITSTATUS(status) != 0)
  		{
  			fprintf(stderr, _("%s: child process exited with error %i\n"),
  					progname, WEXITSTATUS(status));
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  		/* Exited normally, we're happy! */
*************** BaseBackup(void)
*** 1130,1135 ****
--- 1145,1151 ----
  		{
  			fprintf(stderr, _("%s: could not parse xlog end position \"%s\"\n"),
  					progname, xlogend);
+ 			PQclear(res);
  			exit(1);
  		}
  		InterlockedIncrement(&has_xlogendptr);
*************** BaseBackup(void)
*** 1140,1145 ****
--- 1156,1162 ----
  			_dosmaperr(GetLastError());
  			fprintf(stderr, _("%s: could not wait for child thread: %s\n"),
  					progname, strerror(errno));
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  		if (GetExitCodeThread((HANDLE) bgchild, &status) == 0)
*************** BaseBackup(void)
*** 1147,1158 ****
--- 1164,1177 ----
  			_dosmaperr(GetLastError());
  			fprintf(stderr, _("%s: could not get child thread exit status: %s\n"),
  					progname, strerror(errno));
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  		if (status != 0)
  		{
  			fprintf(stderr, _("%s: child thread exited with error %u\n"),
  					progname, (unsigned int) status);
+ 			PQclear(res);
  			disconnect_and_exit(1);
  		}
  		/* Exited normally, we're happy */
*************** BaseBackup(void)
*** 1162,1167 ****
--- 1181,1187 ----
  	/*
  	 * End of copy data. Final result is already checked inside the loop.
  	 */
+ 	PQclear(res);
  	PQfinish(conn);
  
  	if (verbose)
diff --git a/src/bin/pg_basebackup/pg_receivexlog.c b/src/bin/pg_basebackup/pg_receivexlog.c
new file mode 100644
index e698b06..61045e2
*** a/src/bin/pg_basebackup/pg_receivexlog.c
--- b/src/bin/pg_basebackup/pg_receivexlog.c
*************** StreamLog(void)
*** 233,244 ****
  	{
  		fprintf(stderr, _("%s: could not identify system: %s\n"),
  				progname, PQerrorMessage(conn));
  		disconnect_and_exit(1);
  	}
! 	if (PQntuples(res) != 1)
  	{
! 		fprintf(stderr, _("%s: could not identify system, got %i rows\n"),
! 				progname, PQntuples(res));
  		disconnect_and_exit(1);
  	}
  	timeline = atoi(PQgetvalue(res, 0, 1));
--- 233,246 ----
  	{
  		fprintf(stderr, _("%s: could not identify system: %s\n"),
  				progname, PQerrorMessage(conn));
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
! 	if (PQntuples(res) != 1 || PQnfields(res) != 3)
  	{
! 		fprintf(stderr, _("%s: could not identify system, got %i rows and %i fields\n"),
! 				progname, PQntuples(res), PQnfields(res));
! 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	timeline = atoi(PQgetvalue(res, 0, 1));
*************** StreamLog(void)
*** 246,251 ****
--- 248,254 ----
  	{
  		fprintf(stderr, _("%s: could not parse log start position from value \"%s\"\n"),
  				progname, PQgetvalue(res, 0, 2));
+ 		PQclear(res);
  		disconnect_and_exit(1);
  	}
  	PQclear(res);
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
new file mode 100644
index c390cbf..6af252b
*** a/src/bin/pg_basebackup/receivelog.c
--- b/src/bin/pg_basebackup/receivelog.c
*************** ReceiveXlogStream(PGconn *conn, XLogRecP
*** 235,240 ****
--- 235,250 ----
  			PQclear(res);
  			return false;
  		}
+ 		if (PQnfields(res) != 3 || PQntuples(res) != 1)
+ 		{
+ 			int			ntuples = PQntuples(res);
+ 			int			nfields = PQnfields(res);
+ 
+ 			PQclear(res);
+ 			fprintf(stderr, _("%s: Expected 1 tuple with 3 fields, got %d tuples with %d fields."),
+ 							   progname, ntuples, nfields);
+ 			return false;
+ 		}		
  		if (strcmp(sysidentifier, PQgetvalue(res, 0, 0)) != 0)
  		{
  			fprintf(stderr, _("%s: system identifier does not match between base backup and streaming connection\n"), progname);
#4Magnus Hagander
magnus@hagander.net
In reply to: Jaime Casanova (#3)
Re: pg_basebackup is not checking IDENTIFY_SYSTEM numbre of columns

On Sun, Jan 15, 2012 at 22:00, Jaime Casanova <jaime@2ndquadrant.com> wrote:

On Wed, Jan 11, 2012 at 5:11 PM, Magnus Hagander <magnus@hagander.net> wrote:

No, no reason. Adding such a check would be a good idea.

ok. patch attached, it also adds a few PQclear() calls before
disconnect_and_exit().

I don't think we need to care about all those PQclear() - it does an
exit() right after them anyway, so what's the point? (the disconnect
part is important of course, since otherwise we get a message in the
log on the server)

I've applied the patch without the PQclear()s, and changed it around
so that the error message shown is actually the same in all the
different places.

btw, in BaseBackup() in line 1149 (after the patch is applied) there
is an exit instead of disconnect_and_exit() and that is probably a
typo too

It is, indeed. Fixed.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/