pg_upgrade loses security lables and COMMENTs on blobs

Started by Stephen Frostalmost 9 years ago4 messages
#1Stephen Frost
sfrost@snowman.net

Greetings,

When pg_upgrade calls pg_dump, it passes in "--schema-only", which is
generally correct, except that this causes everything having to do with
large objects to be excluded. That's still usually correct, because
pg_upgrade will simply copy the pg_largeobject and
pg_largeobject_metadata tables through to the new cluster as-is
(essentially treating them as if they were user tables).

Unfortunately, those tables aren't, actually, the only places that we
store information about large objects; the general-purpose tables like
pg_seclabel and pg_description can hold information about large objects
too.

What this means is that performing a pg_upgrade will result in any
security labels or comments on large objects being dropped. This
seems to go back at least as far as 9.2, though I found it through the
pg_dump regression testing that I've been working on.

I haven't looked at trying to fix this yet, but I'm thinking the
approach to use will probably be to modify pg_dump to still call
getBlobs() when in binary-upgrade mode (regardless of the schema-only
flag) but then have dumpBlobs(), when in binary-upgrade mode, only
output the security labels and comments. I hope that doesn't end up
causing some kind of chicken-and-egg problem.. Presumably the large
object tables are in place and correct before the dump is restored, so I
think this will work.

Just wanted to get a note out to -hackers about the issue, I'll see
about getting a fix written up for it soon.

Thanks!

Stephen

#2Stephen Frost
sfrost@snowman.net
In reply to: Stephen Frost (#1)
1 attachment(s)
Re: pg_upgrade loses security lables and COMMENTs on blobs

All,

* Stephen Frost (sfrost@snowman.net) wrote:

Just wanted to get a note out to -hackers about the issue, I'll see
about getting a fix written up for it soon.

Attached is a patch which addresses this issue. I'm not terribly
pleased with it, but I also haven't got any great ideas of what else to
do. Suggestions welcome, of course.

Otherwise, I'll plan to start working on the back-branch changes for
this soon.

Thanks!

Stephen

Attachments:

fix_pg_upgrade_blob_comments-master_v5.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
new file mode 100644
index 6480fb8..983a999
*** a/src/bin/pg_dump/pg_backup.h
--- b/src/bin/pg_dump/pg_backup.h
*************** typedef struct _restoreOptions
*** 120,125 ****
--- 120,126 ----
  	int			enable_row_security;
  	int			sequence_data;	/* dump sequence data even in schema-only mode */
  	int			include_subscriptions;
+ 	int			binary_upgrade;
  } RestoreOptions;
  
  typedef struct _dumpOptions
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
new file mode 100644
index 929f1b5..6c340f9
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*************** _tocEntryRequired(TocEntry *te, teSectio
*** 2876,2882 ****
  	/* Mask it if we only want schema */
  	if (ropt->schemaOnly)
  	{
! 		if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0))
  			res = res & REQ_SCHEMA;
  	}
  
--- 2876,2884 ----
  	/* Mask it if we only want schema */
  	if (ropt->schemaOnly)
  	{
! 		if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) &&
! 			!(ropt->binary_upgrade && strcmp(te->desc,"BLOB") == 0) &&
! 			!(ropt->binary_upgrade && strncmp(te->tag,"LARGE OBJECT ", 13) == 0))
  			res = res & REQ_SCHEMA;
  	}
  
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
new file mode 100644
index 89db310..44d4f32
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** main(int argc, char **argv)
*** 775,781 ****
  	if (dopt.schemaOnly && dopt.sequence_data)
  		getTableData(&dopt, tblinfo, numTables, dopt.oids, RELKIND_SEQUENCE);
  
! 	if (dopt.outputBlobs)
  		getBlobs(fout);
  
  	/*
--- 775,789 ----
  	if (dopt.schemaOnly && dopt.sequence_data)
  		getTableData(&dopt, tblinfo, numTables, dopt.oids, RELKIND_SEQUENCE);
  
! 	/*
! 	 * In binary-upgrade mode, we do not have to worry about the actual blob
! 	 * data or the associated metadata that resides in the pg_largeobject and
! 	 * pg_largeobject_metadata tables, respectivly.
! 	 *
! 	 * However, we do need to collect blob information as there may be comments
! 	 * or security labels on blobs and we need to dump those out.
! 	 */
! 	if (dopt.outputBlobs || dopt.binary_upgrade)
  		getBlobs(fout);
  
  	/*
*************** main(int argc, char **argv)
*** 855,860 ****
--- 863,869 ----
  	ropt->enable_row_security = dopt.enable_row_security;
  	ropt->sequence_data = dopt.sequence_data;
  	ropt->include_subscriptions = dopt.include_subscriptions;
+ 	ropt->binary_upgrade = dopt.binary_upgrade;
  
  	if (compressLevel == -1)
  		ropt->compression = 0;
*************** getBlobs(Archive *fout)
*** 2903,2908 ****
--- 2912,2931 ----
  			PQgetisnull(res, i, i_initlomacl) &&
  			PQgetisnull(res, i, i_initrlomacl))
  			binfo[i].dobj.dump &= ~DUMP_COMPONENT_ACL;
+ 
+ 		/*
+ 		 * In binary-upgrade mode for blobs, we do *not* dump out the data or
+ 		 * the ACLs, should any exist.  The data and ACL (if any) will be
+ 		 * copied by pg_upgrade, which simply copies the pg_largeobject and
+ 		 * pg_largeobject_metadata tables.
+ 		 *
+ 		 * We *do* dump out the definition of the blob because we need that
+ 		 * to make the restoration of the comments, and anything else, work
+ 		 * since pg_upgrade copies the files behind pg_largeobject and
+ 		 * pg_largeobject_metadata after the dump is restored.
+ 		 */
+ 		if (dopt->binary_upgrade)
+ 			binfo[i].dobj.dump &= ~(DUMP_COMPONENT_DATA | DUMP_COMPONENT_ACL);
  	}
  
  	/*
*************** dumpComment(Archive *fout, const char *t
*** 8831,8837 ****
  	}
  	else
  	{
! 		if (dopt->schemaOnly)
  			return;
  	}
  
--- 8854,8861 ----
  	}
  	else
  	{
! 		/* We do dump blob comments in binary-upgrade mode */
! 		if (dopt->schemaOnly && !dopt->binary_upgrade)
  			return;
  	}
  
*************** dumpSecLabel(Archive *fout, const char *
*** 14226,14232 ****
  	}
  	else
  	{
! 		if (dopt->schemaOnly)
  			return;
  	}
  
--- 14250,14257 ----
  	}
  	else
  	{
! 		/* We do dump blob security labels in binary-upgrade mode */
! 		if (dopt->schemaOnly && !dopt->binary_upgrade)
  			return;
  	}
  
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
new file mode 100644
index f73bf89..c51088a
*** a/src/bin/pg_dump/t/002_pg_dump.pl
--- b/src/bin/pg_dump/t/002_pg_dump.pl
*************** my %pgdump_runs = (
*** 39,49 ****
  	binary_upgrade => {
  		dump_cmd => [
  			'pg_dump',
! 			"--file=$tempdir/binary_upgrade.sql",
  			'--schema-only',
  			'--binary-upgrade',
  			'-d', 'postgres',    # alternative way to specify database
! 		], },
  	clean => {
  		dump_cmd => [
  			'pg_dump',
--- 39,55 ----
  	binary_upgrade => {
  		dump_cmd => [
  			'pg_dump',
! 			'--format=custom',
! 			"--file=$tempdir/binary_upgrade.dump",
  			'--schema-only',
  			'--binary-upgrade',
  			'-d', 'postgres',    # alternative way to specify database
! 		],
! 		restore_cmd => [
! 			'pg_restore', '-Fc',
! 			'--verbose',
! 			"--file=$tempdir/binary_upgrade.sql",
! 			"$tempdir/binary_upgrade.dump", ], },
  	clean => {
  		dump_cmd => [
  			'pg_dump',
*************** my %tests = (
*** 334,339 ****
--- 340,346 ----
  		all_runs => 1,
  		regexp   => qr/^ALTER LARGE OBJECT \d+ OWNER TO .*;/m,
  		like     => {
+ 			binary_upgrade           => 1,
  			clean                    => 1,
  			clean_if_exists          => 1,
  			column_inserts           => 1,
*************** my %tests = (
*** 348,354 ****
  			section_pre_data         => 1,
  			test_schema_plus_blobs   => 1, },
  		unlike => {
- 			binary_upgrade           => 1,
  			no_blobs                 => 1,
  			no_owner                 => 1,
  			only_dump_test_schema    => 1,
--- 355,360 ----
*************** my %tests = (
*** 666,671 ****
--- 672,678 ----
  'SELECT pg_catalog.lo_from_bytea(0, \'\\x310a320a330a340a350a360a370a380a390a\');',
  		regexp => qr/^SELECT pg_catalog\.lo_create\('\d+'\);/m,
  		like   => {
+ 			binary_upgrade           => 1,
  			clean                    => 1,
  			clean_if_exists          => 1,
  			column_inserts           => 1,
*************** my %tests = (
*** 681,687 ****
  			section_pre_data         => 1,
  			test_schema_plus_blobs   => 1, },
  		unlike => {
- 			binary_upgrade           => 1,
  			no_blobs                 => 1,
  			only_dump_test_schema    => 1,
  			only_dump_test_table     => 1,
--- 688,693 ----
diff --git a/src/test/regress/expected/large_object.out b/src/test/regress/expected/large_object.out
new file mode 100644
index ...b00d47c
*** a/src/test/regress/expected/large_object.out
--- b/src/test/regress/expected/large_object.out
***************
*** 0 ****
--- 1,15 ----
+ -- This is more-or-less DROP IF EXISTS LARGE OBJECT 3001;
+ WITH unlink AS (SELECT lo_unlink(loid) FROM pg_largeobject WHERE loid = 3001) SELECT 1;
+  ?column? 
+ ----------
+         1
+ (1 row)
+ 
+ -- Test creation of a large object and leave it for testing pg_upgrade
+ SELECT lo_create(3001);
+  lo_create 
+ -----------
+       3001
+ (1 row)
+ 
+ COMMENT ON LARGE OBJECT 3001 IS 'testing comments';
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
new file mode 100644
index f66b443..1af66ff
*** a/src/test/regress/expected/privileges.out
--- b/src/test/regress/expected/privileges.out
*************** DROP ROLE IF EXISTS regress_user3;
*** 12,18 ****
  DROP ROLE IF EXISTS regress_user4;
  DROP ROLE IF EXISTS regress_user5;
  DROP ROLE IF EXISTS regress_user6;
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata;
   lo_unlink 
  -----------
  (0 rows)
--- 12,18 ----
  DROP ROLE IF EXISTS regress_user4;
  DROP ROLE IF EXISTS regress_user5;
  DROP ROLE IF EXISTS regress_user6;
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata WHERE oid >= 1000 AND oid < 3000;
   lo_unlink 
  -----------
  (0 rows)
*************** SELECT lo_unlink(2002);
*** 1173,1183 ****
  
  \c -
  -- confirm ACL setting
! SELECT oid, pg_get_userbyid(lomowner) ownername, lomacl FROM pg_largeobject_metadata;
   oid  |   ownername   |                                             lomacl                                             
  ------+---------------+------------------------------------------------------------------------------------------------
-  1002 | regress_user1 | 
   1001 | regress_user1 | {regress_user1=rw/regress_user1,=rw/regress_user1}
   1003 | regress_user1 | {regress_user1=rw/regress_user1,regress_user2=r/regress_user1}
   1004 | regress_user1 | {regress_user1=rw/regress_user1,regress_user2=rw/regress_user1}
   1005 | regress_user1 | {regress_user1=rw/regress_user1,regress_user2=r*w/regress_user1,regress_user3=r/regress_user2}
--- 1173,1183 ----
  
  \c -
  -- confirm ACL setting
! SELECT oid, pg_get_userbyid(lomowner) ownername, lomacl FROM pg_largeobject_metadata WHERE oid >= 1000 AND oid < 3000 ORDER BY oid;
   oid  |   ownername   |                                             lomacl                                             
  ------+---------------+------------------------------------------------------------------------------------------------
   1001 | regress_user1 | {regress_user1=rw/regress_user1,=rw/regress_user1}
+  1002 | regress_user1 | 
   1003 | regress_user1 | {regress_user1=rw/regress_user1,regress_user2=r/regress_user1}
   1004 | regress_user1 | {regress_user1=rw/regress_user1,regress_user2=rw/regress_user1}
   1005 | regress_user1 | {regress_user1=rw/regress_user1,regress_user2=r*w/regress_user1,regress_user3=r/regress_user2}
*************** DROP TABLE atest6;
*** 1546,1552 ****
  DROP TABLE atestc;
  DROP TABLE atestp1;
  DROP TABLE atestp2;
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata;
   lo_unlink 
  -----------
           1
--- 1546,1552 ----
  DROP TABLE atestc;
  DROP TABLE atestp1;
  DROP TABLE atestp2;
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata WHERE oid >= 1000 AND oid < 3000;
   lo_unlink 
  -----------
           1
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
new file mode 100644
index edeb2d6..1f2fb59
*** a/src/test/regress/parallel_schedule
--- b/src/test/regress/parallel_schedule
*************** test: select_into select_distinct select
*** 84,90 ****
  # ----------
  # Another group of parallel tests
  # ----------
! test: brin gin gist spgist privileges init_privs security_label collate matview lock replica_identity rowsecurity object_address tablesample groupingsets drop_operator
  
  # ----------
  # Another group of parallel tests
--- 84,90 ----
  # ----------
  # Another group of parallel tests
  # ----------
! test: brin gin gist spgist privileges init_privs security_label collate matview lock replica_identity rowsecurity object_address tablesample groupingsets drop_operator large_object
  
  # ----------
  # Another group of parallel tests
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
new file mode 100644
index 27a46d7..9ffceff
*** a/src/test/regress/serial_schedule
--- b/src/test/regress/serial_schedule
*************** test: object_address
*** 116,121 ****
--- 116,122 ----
  test: tablesample
  test: groupingsets
  test: drop_operator
+ test: large_object
  test: alter_generic
  test: alter_operator
  test: misc
diff --git a/src/test/regress/sql/large_object.sql b/src/test/regress/sql/large_object.sql
new file mode 100644
index ...a9e18b7
*** a/src/test/regress/sql/large_object.sql
--- b/src/test/regress/sql/large_object.sql
***************
*** 0 ****
--- 1,8 ----
+ 
+ -- This is more-or-less DROP IF EXISTS LARGE OBJECT 3001;
+ WITH unlink AS (SELECT lo_unlink(loid) FROM pg_largeobject WHERE loid = 3001) SELECT 1;
+ 
+ -- Test creation of a large object and leave it for testing pg_upgrade
+ SELECT lo_create(3001);
+ 
+ COMMENT ON LARGE OBJECT 3001 IS 'testing comments';
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
new file mode 100644
index 00dc7bd..82d6751
*** a/src/test/regress/sql/privileges.sql
--- b/src/test/regress/sql/privileges.sql
*************** DROP ROLE IF EXISTS regress_user4;
*** 17,23 ****
  DROP ROLE IF EXISTS regress_user5;
  DROP ROLE IF EXISTS regress_user6;
  
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata;
  
  RESET client_min_messages;
  
--- 17,23 ----
  DROP ROLE IF EXISTS regress_user5;
  DROP ROLE IF EXISTS regress_user6;
  
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata WHERE oid >= 1000 AND oid < 3000;
  
  RESET client_min_messages;
  
*************** SELECT lo_unlink(2002);
*** 729,735 ****
  
  \c -
  -- confirm ACL setting
! SELECT oid, pg_get_userbyid(lomowner) ownername, lomacl FROM pg_largeobject_metadata;
  
  SET SESSION AUTHORIZATION regress_user3;
  
--- 729,735 ----
  
  \c -
  -- confirm ACL setting
! SELECT oid, pg_get_userbyid(lomowner) ownername, lomacl FROM pg_largeobject_metadata WHERE oid >= 1000 AND oid < 3000 ORDER BY oid;
  
  SET SESSION AUTHORIZATION regress_user3;
  
*************** DROP TABLE atestc;
*** 960,966 ****
  DROP TABLE atestp1;
  DROP TABLE atestp2;
  
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata;
  
  DROP GROUP regress_group1;
  DROP GROUP regress_group2;
--- 960,966 ----
  DROP TABLE atestp1;
  DROP TABLE atestp2;
  
! SELECT lo_unlink(oid) FROM pg_largeobject_metadata WHERE oid >= 1000 AND oid < 3000;
  
  DROP GROUP regress_group1;
  DROP GROUP regress_group2;
#3Bruce Momjian
bruce@momjian.us
In reply to: Stephen Frost (#2)
Re: pg_upgrade loses security lables and COMMENTs on blobs

On Thu, Feb 23, 2017 at 10:36:37AM -0500, Stephen Frost wrote:

All,

* Stephen Frost (sfrost@snowman.net) wrote:

Just wanted to get a note out to -hackers about the issue, I'll see
about getting a fix written up for it soon.

Attached is a patch which addresses this issue. I'm not terribly
pleased with it, but I also haven't got any great ideas of what else to
do. Suggestions welcome, of course.

Otherwise, I'll plan to start working on the back-branch changes for
this soon.

Yeah, this is probably the best you can do. Your analysis of how we
used to treat large objects is correct, and was never adjusted for the
changes you outlined.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Stephen Frost
sfrost@snowman.net
In reply to: Bruce Momjian (#3)
Re: pg_upgrade loses security lables and COMMENTs on blobs

Bruce,

* Bruce Momjian (bruce@momjian.us) wrote:

On Thu, Feb 23, 2017 at 10:36:37AM -0500, Stephen Frost wrote:

* Stephen Frost (sfrost@snowman.net) wrote:

Just wanted to get a note out to -hackers about the issue, I'll see
about getting a fix written up for it soon.

Attached is a patch which addresses this issue. I'm not terribly
pleased with it, but I also haven't got any great ideas of what else to
do. Suggestions welcome, of course.

Otherwise, I'll plan to start working on the back-branch changes for
this soon.

Yeah, this is probably the best you can do. Your analysis of how we
used to treat large objects is correct, and was never adjusted for the
changes you outlined.

Great, thanks, I'll be pushing this to all the branches soon, still
testing.

Thanks again!

Stephen