pg_amcheck option to install extension

Started by Andrew Dunstanover 4 years ago33 messages
#1Andrew Dunstan
andrew@dunslane.net

Hi,

Peter Geoghegan suggested that I have the cross version upgrade checker
run pg_amcheck on the upgraded module. This seemed to me like a good
idea, so I tried it, only to find that it refuses to run unless the
amcheck extension is installed. That's fair enough, but it also seems to
me like we should have an option to have pg_amcheck install the
extension is it's not present, by running something like 'create
extension if not exists amcheck'. Maybe in such a case there could also
be an option to drop the extension when pg_amcheck's work is done - I
haven't thought through all the implications.

Given pg_amcheck is a new piece of work I'm not sure if we can sneak
this in under the wire for release 14. I will certainly undertake to
review anything expeditiously. I can work around this issue in the
buildfarm, but it seems like something other users are likely to want.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#2Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Andrew Dunstan (#1)
1 attachment(s)
Re: pg_amcheck option to install extension

On Apr 16, 2021, at 11:06 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Hi,

Peter Geoghegan suggested that I have the cross version upgrade checker
run pg_amcheck on the upgraded module. This seemed to me like a good
idea, so I tried it, only to find that it refuses to run unless the
amcheck extension is installed. That's fair enough, but it also seems to
me like we should have an option to have pg_amcheck install the
extension is it's not present, by running something like 'create
extension if not exists amcheck'. Maybe in such a case there could also
be an option to drop the extension when pg_amcheck's work is done - I
haven't thought through all the implications.

Given pg_amcheck is a new piece of work I'm not sure if we can sneak
this in under the wire for release 14. I will certainly undertake to
review anything expeditiously. I can work around this issue in the
buildfarm, but it seems like something other users are likely to want.

We cannot quite use a "create extension if not exists amcheck" command, as we clear the search path and so must specify the schema to use. Should we instead avoid clearing the search path for this? What are the security implications of using the first schema of the search path?

When called as `pg_amcheck --install-missing`, the implementation in the attached patch runs per database being checked a "create extension if not exists amcheck with schema public". If called as `pg_amcheck --install-missing=foo`, it instead runs "create extension if not exists amcheck with schema foo` having escaped "foo" appropriately for the given database. There is no option to use different schemas for different databases. Nor is there any option to use the search path. If somebody needs that, I think they can manage installing amcheck themselves.

Does this meet your needs for v14? I'd like to get this nailed down quickly, as it is unclear to me that we should even be doing this so late in the development cycle.

I'd also like your impressions on whether we're likely to move contrib/amcheck into core anytime soon. If so, is it worth adding an option that we'll soon need to deprecate?

Attachments:

v1-0001-Adding-install-missing-option-to-pg_amcheck.patchapplication/octet-stream; name=v1-0001-Adding-install-missing-option-to-pg_amcheck.patch; x-unix-mode=0644Download
From b06ec8671cb474309dc0619fe8d4d2658e0820c1 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Sat, 17 Apr 2021 12:40:11 -0700
Subject: [PATCH v1] Adding --install-missing option to pg_amcheck

To ease the checking of databases where amcheck is not yet
installed, adding an option that will create the extension if it
does not yet exist.
---
 doc/src/sgml/ref/pg_amcheck.sgml | 17 ++++++++++++
 src/bin/pg_amcheck/pg_amcheck.c  | 44 ++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/src/sgml/ref/pg_amcheck.sgml b/doc/src/sgml/ref/pg_amcheck.sgml
index d01e26faa8..71e16f20ec 100644
--- a/doc/src/sgml/ref/pg_amcheck.sgml
+++ b/doc/src/sgml/ref/pg_amcheck.sgml
@@ -217,6 +217,23 @@ PostgreSQL documentation
      </listitem>
     </varlistentry>
 
+    <varlistentry>
+     <term><option>--install-missing</option></term>
+     <term><option>--install-missing=<replaceable class="parameter">schema</replaceable></option></term>
+     <listitem>
+      <para>
+	   Install any missing extensions that are required to check the
+	   database(s).  If not yet installed, each extension's objects will be
+	   installed into the given
+	   <replaceable class="parameter">schema</replaceable>, or if not specified
+	   into schema <literal>public</literal>.
+      </para>
+      <para>
+	   At present, the only required extension is <xref linkend="amcheck"/>.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><option>-j <replaceable class="parameter">num</replaceable></option></term>
      <term><option>--jobs=<replaceable class="parameter">num</replaceable></option></term>
diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 587a79a1a6..c3950aba15 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -61,6 +61,13 @@ typedef struct AmcheckOptions
 	bool		show_progress;
 	int			jobs;
 
+	/*
+	 * Whether to install missing extensions, and optionally the name of the
+	 * schema in which to install the extension's objects.
+	 */
+	bool		install_missing;
+	char	   *install_schema;
+
 	/* Objects to check or not to check, as lists of PatternInfo structs. */
 	PatternInfoArray include;
 	PatternInfoArray exclude;
@@ -109,6 +116,8 @@ static AmcheckOptions opts = {
 	.strict_names = true,
 	.show_progress = false,
 	.jobs = 1,
+	.install_missing = false,
+	.install_schema = NULL,
 	.include = {NULL, 0},
 	.exclude = {NULL, 0},
 	.excludetbl = false,
@@ -259,6 +268,7 @@ main(int argc, char *argv[])
 		{"no-strict-names", no_argument, NULL, 10},
 		{"heapallindexed", no_argument, NULL, 11},
 		{"parent-check", no_argument, NULL, 12},
+		{"install-missing", optional_argument, NULL, 13},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -440,6 +450,11 @@ main(int argc, char *argv[])
 			case 12:
 				opts.parent_check = true;
 				break;
+			case 13:
+				opts.install_missing = true;
+				if (optarg)
+					opts.install_schema = pg_strdup(optarg);
+				break;
 			default:
 				fprintf(stderr,
 						_("Try \"%s --help\" for more information.\n"),
@@ -549,6 +564,34 @@ main(int argc, char *argv[])
 			conn = connectDatabase(&cparams, progname, opts.echo, false, true);
 		}
 
+		/*
+		 * Optionally install amcheck if not already installed in this
+		 * database.
+		 */
+		if (opts.install_missing && opts.install_schema != NULL)
+		{
+			char *schema;
+			char *install_sql;
+
+			/*
+			 * Must re-escape the schema name for each database, as the
+			 * escaping rules may change.
+			 */
+			schema = PQescapeIdentifier(conn, opts.install_schema,
+										strlen(opts.install_schema));
+			install_sql = psprintf("CREATE EXTENSION IF NOT EXISTS amcheck WITH SCHEMA %s",
+								   schema);
+
+			executeCommand(conn, install_sql, opts.echo);
+			pfree(install_sql);
+			pfree(schema);
+		}
+		else if (opts.install_missing)
+		{
+			executeCommand(conn, "CREATE EXTENSION IF NOT EXISTS amcheck WITH SCHEMA public",
+			opts.echo);
+		}
+
 		/*
 		 * Verify that amcheck is installed for this next database.  User
 		 * error could result in a database not having amcheck that should
@@ -1159,6 +1202,7 @@ help(const char *progname)
 	printf(_("  -V, --version                  output version information, then exit\n"));
 	printf(_("  -P, --progress                 show progress information\n"));
 	printf(_("  -?, --help                     show this help, then exit\n"));
+	printf(_("      --install-missing          install missing extensions\n"));
 
 	printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
 	printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
-- 
2.21.1 (Apple Git-122.3)

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Mark Dilger (#2)
Re: pg_amcheck option to install extension

On 4/17/21 3:43 PM, Mark Dilger wrote:

On Apr 16, 2021, at 11:06 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

Hi,

Peter Geoghegan suggested that I have the cross version upgrade checker
run pg_amcheck on the upgraded module. This seemed to me like a good
idea, so I tried it, only to find that it refuses to run unless the
amcheck extension is installed. That's fair enough, but it also seems to
me like we should have an option to have pg_amcheck install the
extension is it's not present, by running something like 'create
extension if not exists amcheck'. Maybe in such a case there could also
be an option to drop the extension when pg_amcheck's work is done - I
haven't thought through all the implications.

Given pg_amcheck is a new piece of work I'm not sure if we can sneak
this in under the wire for release 14. I will certainly undertake to
review anything expeditiously. I can work around this issue in the
buildfarm, but it seems like something other users are likely to want.

We cannot quite use a "create extension if not exists amcheck" command, as we clear the search path and so must specify the schema to use. Should we instead avoid clearing the search path for this? What are the security implications of using the first schema of the search path?

When called as `pg_amcheck --install-missing`, the implementation in the attached patch runs per database being checked a "create extension if not exists amcheck with schema public". If called as `pg_amcheck --install-missing=foo`, it instead runs "create extension if not exists amcheck with schema foo` having escaped "foo" appropriately for the given database. There is no option to use different schemas for different databases. Nor is there any option to use the search path. If somebody needs that, I think they can manage installing amcheck themselves.

how about specifying pg_catalog as the schema instead of public?

Does this meet your needs for v14? I'd like to get this nailed down quickly, as it is unclear to me that we should even be doing this so late in the development cycle.

I'm ok with or without - I'll just have the buildfarm client pull a list
of databases and install the extension in all of them.

I'd also like your impressions on whether we're likely to move contrib/amcheck into core anytime soon. If so, is it worth adding an option that we'll soon need to deprecate?

I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#4Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Andrew Dunstan (#3)
1 attachment(s)
Re: pg_amcheck option to install extension

On Apr 18, 2021, at 6:19 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

how about specifying pg_catalog as the schema instead of public?

Done.

Attachments:

v2-0001-Adding-install-missing-option-to-pg_amcheck.patchapplication/octet-stream; name=v2-0001-Adding-install-missing-option-to-pg_amcheck.patch; x-unix-mode=0644Download
From 80aada5822c7cda6f097f08bba98e069fa20ea77 Mon Sep 17 00:00:00 2001
From: Mark Dilger <mark.dilger@enterprisedb.com>
Date: Sat, 17 Apr 2021 12:40:11 -0700
Subject: [PATCH v2] Adding --install-missing option to pg_amcheck

To ease the checking of databases where amcheck is not yet
installed, adding an option that will create the extension if it
does not yet exist.
---
 doc/src/sgml/ref/pg_amcheck.sgml | 17 ++++++++++++
 src/bin/pg_amcheck/pg_amcheck.c  | 44 ++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/doc/src/sgml/ref/pg_amcheck.sgml b/doc/src/sgml/ref/pg_amcheck.sgml
index d01e26faa8..a38e3ce7b3 100644
--- a/doc/src/sgml/ref/pg_amcheck.sgml
+++ b/doc/src/sgml/ref/pg_amcheck.sgml
@@ -217,6 +217,23 @@ PostgreSQL documentation
      </listitem>
     </varlistentry>
 
+    <varlistentry>
+     <term><option>--install-missing</option></term>
+     <term><option>--install-missing=<replaceable class="parameter">schema</replaceable></option></term>
+     <listitem>
+      <para>
+	   Install any missing extensions that are required to check the
+	   database(s).  If not yet installed, each extension's objects will be
+	   installed into the given
+	   <replaceable class="parameter">schema</replaceable>, or if not specified
+	   into schema <literal>pg_catalog</literal>.
+      </para>
+      <para>
+	   At present, the only required extension is <xref linkend="amcheck"/>.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry>
      <term><option>-j <replaceable class="parameter">num</replaceable></option></term>
      <term><option>--jobs=<replaceable class="parameter">num</replaceable></option></term>
diff --git a/src/bin/pg_amcheck/pg_amcheck.c b/src/bin/pg_amcheck/pg_amcheck.c
index 587a79a1a6..f9e5ff0d4f 100644
--- a/src/bin/pg_amcheck/pg_amcheck.c
+++ b/src/bin/pg_amcheck/pg_amcheck.c
@@ -61,6 +61,13 @@ typedef struct AmcheckOptions
 	bool		show_progress;
 	int			jobs;
 
+	/*
+	 * Whether to install missing extensions, and optionally the name of the
+	 * schema in which to install the extension's objects.
+	 */
+	bool		install_missing;
+	char	   *install_schema;
+
 	/* Objects to check or not to check, as lists of PatternInfo structs. */
 	PatternInfoArray include;
 	PatternInfoArray exclude;
@@ -109,6 +116,8 @@ static AmcheckOptions opts = {
 	.strict_names = true,
 	.show_progress = false,
 	.jobs = 1,
+	.install_missing = false,
+	.install_schema = NULL,
 	.include = {NULL, 0},
 	.exclude = {NULL, 0},
 	.excludetbl = false,
@@ -259,6 +268,7 @@ main(int argc, char *argv[])
 		{"no-strict-names", no_argument, NULL, 10},
 		{"heapallindexed", no_argument, NULL, 11},
 		{"parent-check", no_argument, NULL, 12},
+		{"install-missing", optional_argument, NULL, 13},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -440,6 +450,11 @@ main(int argc, char *argv[])
 			case 12:
 				opts.parent_check = true;
 				break;
+			case 13:
+				opts.install_missing = true;
+				if (optarg)
+					opts.install_schema = pg_strdup(optarg);
+				break;
 			default:
 				fprintf(stderr,
 						_("Try \"%s --help\" for more information.\n"),
@@ -549,6 +564,34 @@ main(int argc, char *argv[])
 			conn = connectDatabase(&cparams, progname, opts.echo, false, true);
 		}
 
+		/*
+		 * Optionally install amcheck if not already installed in this
+		 * database.
+		 */
+		if (opts.install_missing && opts.install_schema != NULL)
+		{
+			char *schema;
+			char *install_sql;
+
+			/*
+			 * Must re-escape the schema name for each database, as the
+			 * escaping rules may change.
+			 */
+			schema = PQescapeIdentifier(conn, opts.install_schema,
+										strlen(opts.install_schema));
+			install_sql = psprintf("CREATE EXTENSION IF NOT EXISTS amcheck WITH SCHEMA %s",
+								   schema);
+
+			executeCommand(conn, install_sql, opts.echo);
+			pfree(install_sql);
+			pfree(schema);
+		}
+		else if (opts.install_missing)
+		{
+			executeCommand(conn, "CREATE EXTENSION IF NOT EXISTS amcheck WITH SCHEMA pg_catalog",
+			opts.echo);
+		}
+
 		/*
 		 * Verify that amcheck is installed for this next database.  User
 		 * error could result in a database not having amcheck that should
@@ -1159,6 +1202,7 @@ help(const char *progname)
 	printf(_("  -V, --version                  output version information, then exit\n"));
 	printf(_("  -P, --progress                 show progress information\n"));
 	printf(_("  -?, --help                     show this help, then exit\n"));
+	printf(_("      --install-missing          install missing extensions\n"));
 
 	printf(_("\nReport bugs to <%s>.\n"), PACKAGE_BUGREPORT);
 	printf(_("%s home page: <%s>\n"), PACKAGE_NAME, PACKAGE_URL);
-- 
2.21.1 (Apple Git-122.3)

#5Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andrew Dunstan (#3)
Re: pg_amcheck option to install extension

On 2021-Apr-18, Andrew Dunstan wrote:

On 4/17/21 3:43 PM, Mark Dilger wrote:

I'd also like your impressions on whether we're likely to move
contrib/amcheck into core anytime soon. If so, is it worth adding
an option that we'll soon need to deprecate?

I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.

I've proposed in the past that we should have a way to provide
extensions other than contrib -- specifically src/extensions/ -- and
then have those extensions installed together with the rest of core.
Then it would be perfectly legitimate to have src/bin/pg_amcheck that
depending that extension. I agree that the current situation is not
great.

--
�lvaro Herrera 39�49'30"S 73�17'W
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end." (2nd Commandment for C programmers)

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#5)
Re: pg_amcheck option to install extension

On 4/18/21 7:32 PM, Alvaro Herrera wrote:

On 2021-Apr-18, Andrew Dunstan wrote:

On 4/17/21 3:43 PM, Mark Dilger wrote:

I'd also like your impressions on whether we're likely to move
contrib/amcheck into core anytime soon. If so, is it worth adding
an option that we'll soon need to deprecate?

I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.

I've proposed in the past that we should have a way to provide
extensions other than contrib -- specifically src/extensions/ -- and
then have those extensions installed together with the rest of core.
Then it would be perfectly legitimate to have src/bin/pg_amcheck that
depending that extension. I agree that the current situation is not
great.

OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#7Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Andrew Dunstan (#6)
Re: pg_amcheck option to install extension

On Apr 19, 2021, at 9:32 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

On 4/18/21 7:32 PM, Alvaro Herrera wrote:

On 2021-Apr-18, Andrew Dunstan wrote:

On 4/17/21 3:43 PM, Mark Dilger wrote:

I'd also like your impressions on whether we're likely to move
contrib/amcheck into core anytime soon. If so, is it worth adding
an option that we'll soon need to deprecate?

I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.

I've proposed in the past that we should have a way to provide
extensions other than contrib -- specifically src/extensions/ -- and
then have those extensions installed together with the rest of core.
Then it would be perfectly legitimate to have src/bin/pg_amcheck that
depending that extension. I agree that the current situation is not
great.

OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.

Ah, no. I wrote pg_amcheck in contrib originally, and moved it to src/bin as requested during the v14 development cycle.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Robert Haas
robertmhaas@gmail.com
In reply to: Mark Dilger (#7)
Re: pg_amcheck option to install extension

On Mon, Apr 19, 2021 at 12:37 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.

Ah, no. I wrote pg_amcheck in contrib originally, and moved it to src/bin as requested during the v14 development cycle.

Yeah, I am not that excited about moving this again. I realize it was
never committed anywhere else, but it was moved at least one during
development. And I don't see that moving it to contrib really fixes
anything anyway here, except perhaps conceptually. Maybe inventing
src/extensions is the right idea, but there's no real need to do that
at this point in the release cycle, and it doesn't actually fix
anything either. The only thing that's really needed here is to either
(a) teach the test script to install amcheck as a separate step or (b)
teach pg_amcheck to install amcheck in a user-specified schema. If we
do that, AIUI, this issue is fixed regardless of whether we move any
source code around, and if we don't do that, AIUI, this issue is not
fixed no matter how much source code we move.

--
Robert Haas
EDB: http://www.enterprisedb.com

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#6)
Re: pg_amcheck option to install extension

Andrew Dunstan <andrew@dunslane.net> writes:

OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.

FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.

Either way, though, you'll still need the proposed option to
let the executable issue a CREATE EXTENSION to get the shlib
loaded. Unless somebody is proposing that the extension be
installed-by-default like plpgsql, and that I am unequivocally
not for.

regards, tom lane

#10Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Tom Lane (#9)
Re: pg_amcheck option to install extension

On Apr 19, 2021, at 9:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.

FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.

That was my original thought also, largely from a package management perspective. Just as an example, postgresql-client and postgresql-contrib are separate rpms. There isn't much point to having pg_amcheck installed as part of the postgresql-client package while having amcheck in the postgresql-contrib package which might not be installed.

A counter argument is that amcheck is server side, and pg_amcheck is client side. Having pg_amcheck installed on a system makes sense if you are connecting to a server on a different system.

On Mar 11, 2021, at 12:12 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

I want to register, if we are going to add this, it ought to be in src/bin/. If we think it's a useful tool, it should be there with all the other useful tools.

I realize there is a dependency on a module in contrib, and it's probably now not the time to re-debate reorganizing contrib. But if we ever get to that, this program should be the prime example why the current organization is problematic, and we should be prepared to make the necessary moves then.

This was the request that motivated the move to src/bin.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Mark Dilger (#10)
Re: pg_amcheck option to install extension

On 4/19/21 1:25 PM, Mark Dilger wrote:

On Apr 19, 2021, at 9:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.

FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.

That was my original thought also, largely from a package management perspective. Just as an example, postgresql-client and postgresql-contrib are separate rpms. There isn't much point to having pg_amcheck installed as part of the postgresql-client package while having amcheck in the postgresql-contrib package which might not be installed.

A counter argument is that amcheck is server side, and pg_amcheck is client side. Having pg_amcheck installed on a system makes sense if you are connecting to a server on a different system.

There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.

On Mar 11, 2021, at 12:12 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

I want to register, if we are going to add this, it ought to be in src/bin/. If we think it's a useful tool, it should be there with all the other useful tools.

I realize there is a dependency on a module in contrib, and it's probably now not the time to re-debate reorganizing contrib. But if we ever get to that, this program should be the prime example why the current organization is problematic, and we should be prepared to make the necessary moves then.

This was the request that motivated the move to src/bin.

I missed that, so I guess maybe I can't complain too loudly. But if I'd
seen it I would have disagreed. :-)

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#12Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#9)
Re: pg_amcheck option to install extension

On Mon, Apr 19, 2021 at 12:53:29PM -0400, Tom Lane wrote:

FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.

Either way, though, you'll still need the proposed option to
let the executable issue a CREATE EXTENSION to get the shlib
loaded. Unless somebody is proposing that the extension be
installed-by-default like plpgsql, and that I am unequivocally
not for.

Agreed. Something like src/extensions/ would be a tempting option,
but I don't think that it is a good idea to introduce a new piece of
infrastructure at this stage, so moving both to contrib/ would be the
best balance with the current pieces at hand.
--
Michael

#13Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Michael Paquier (#12)
Re: pg_amcheck option to install extension

On Apr 19, 2021, at 6:41 PM, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Apr 19, 2021 at 12:53:29PM -0400, Tom Lane wrote:

FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.

Either way, though, you'll still need the proposed option to
let the executable issue a CREATE EXTENSION to get the shlib
loaded. Unless somebody is proposing that the extension be
installed-by-default like plpgsql, and that I am unequivocally
not for.

Agreed. Something like src/extensions/ would be a tempting option,
but I don't think that it is a good idea to introduce a new piece of
infrastructure at this stage, so moving both to contrib/ would be the
best balance with the current pieces at hand.

There is another issue to consider. Installing pg_amcheck in no way opens up an avenue of attack that I can see. It is just a client application with no special privileges. But installing amcheck arguably opens a line of attack; not one as significant as installing pageinspect, but of the same sort. Amcheck allows privileged database users to potentially get information from the tables that would otherwise be invisible even to them according to mvcc rules. (Is this already the case via some other functionality? Maybe this security problem already exists?) If the privileged database user has file system access, then this is not at all concerning, since they can already just open the files in a tool of their choice, but I don't see any reason why installations should require that privileged database users also be privileged to access the file system.

If you are not buying my argument here, perhaps a reference to how encryption functions are evaluated might help you see my point of view. You don't ask, "can the attacker recover the plain text from the encrypted text", but rather, "can the attacker tell the difference between encrypted plain text and encrypted random noise." That's because it is incredibly hard to reason about what an attacker might be able to learn. Even though learning about old data using amcheck would be hard, you can't say that an attacker would never be able to recover information about deleted rows. As such, security conscious installations are within reason to refuse to install it.

Since amcheck (and to a much larger extent, pageinspect) open potential data leakage issues, it makes sense for some security conscious sites to refuse to install it. pg_amcheck on the other hand could be installed everywhere. I understand why it might *feel* like pg_amcheck and amcheck have to both be installed to work, but I don't think that point of view makes much sense in reality. The computer running the client and the computer running the server are frequently not the same computer.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Michael Paquier
michael@paquier.xyz
In reply to: Mark Dilger (#13)
Re: pg_amcheck option to install extension

On Mon, Apr 19, 2021 at 07:15:23PM -0700, Mark Dilger wrote:

There is another issue to consider. Installing pg_amcheck in no way
opens up an avenue of attack that I can see. It is just a client
application with no special privileges. But installing amcheck
arguably opens a line of attack; not one as significant as
installing pageinspect, but of the same sort. Amcheck allows
privileged database users to potentially get information from the
tables that would otherwise be invisible even to them according to
mvcc rules. (Is this already the case via some other functionality?
Maybe this security problem already exists?) If the privileged
database user has file system access, then this is not at all
concerning, since they can already just open the files in a tool of
their choice, but I don't see any reason why installations should
require that privileged database users also be privileged to access
the file system.

By default, any functions deployed with amcheck have their execution
rights revoked from public, meaning that only a superuser can run them
with a default installation. A non-superuser could execute them only
once GRANT'd access to them.
--
Michael

#15Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Michael Paquier (#14)
Re: pg_amcheck option to install extension

On Apr 19, 2021, at 8:06 PM, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Apr 19, 2021 at 07:15:23PM -0700, Mark Dilger wrote:

There is another issue to consider. Installing pg_amcheck in no way
opens up an avenue of attack that I can see. It is just a client
application with no special privileges. But installing amcheck
arguably opens a line of attack; not one as significant as
installing pageinspect, but of the same sort. Amcheck allows
privileged database users to potentially get information from the
tables that would otherwise be invisible even to them according to
mvcc rules. (Is this already the case via some other functionality?
Maybe this security problem already exists?) If the privileged
database user has file system access, then this is not at all
concerning, since they can already just open the files in a tool of
their choice, but I don't see any reason why installations should
require that privileged database users also be privileged to access
the file system.

By default, any functions deployed with amcheck have their execution
rights revoked from public, meaning that only a superuser can run them
with a default installation. A non-superuser could execute them only
once GRANT'd access to them.

Correct. So having amcheck installed on the system provides the database superuser with a privilege escalation attack vector. I am assuming here the database superuser is not a privileged system user, and can only log in remotely, has no direct access to the file system, etc.

Alice has a database with sensitive data. She hires Bob to be her new database admin, with superuser privilege, but doesn't want Bob to see the sensitive data, so she deletes it first. The data is dead but still on disk.

Bob discovers a bug in postgres that will corrupt dead rows that some index is still pointing at. This attack requires sufficient privilege to trigger the bug, but presumably he has that much privilege, because he is a database superuser. Let's call this attack C(x) where "C" means the corruption inducing function, and "x" is the indexed key for which dead rows will be corrupted.

Bob runs "CREATE EXTENSION amcheck", and then successively runs C(x) followed by amcheck for each interesting value of x. Bob learns which of these values were in the system before Alice deleted them.

This is a classic privilege escalation attack. Bob has one privilege, and uses it to get another.

Alice might foresee this behavior from Bob and choose not to install contrib/amcheck. This is paranoid on her part, but does not cross the line into insanity.

The postgres community has every reason to keep amcheck in contrib so that users such as Alice can make this decision.

No similar argument has been put forward for why pg_amcheck should be kept in contrib.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Michael Paquier
michael@paquier.xyz
In reply to: Mark Dilger (#15)
Re: pg_amcheck option to install extension

On Mon, Apr 19, 2021 at 08:39:06PM -0700, Mark Dilger wrote:

This is a classic privilege escalation attack. Bob has one
privilege, and uses it to get another.

Bob is a superuser, so it has all the privileges of the world for this
instance. In what is that different from BASE_BACKUP or just COPY
FROM PROGRAM?

I am not following your argument here.
--
Michael

#17Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Michael Paquier (#16)
Re: pg_amcheck option to install extension

On Apr 19, 2021, at 9:22 PM, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Apr 19, 2021 at 08:39:06PM -0700, Mark Dilger wrote:

This is a classic privilege escalation attack. Bob has one
privilege, and uses it to get another.

Bob is a superuser, so it has all the privileges of the world for this
instance. In what is that different from BASE_BACKUP or just COPY
FROM PROGRAM?

I think you are conflating the concept of an operating system adminstrator with the concept of the database superuser/owner. If the operating system user that postgres is running as cannot execute any binaries, then "copy from program" is not a way for a database admistrator to escape the jail. If Bob does not have ssh access to the system, he cannot run pg_basebackup.

I am not following your argument here.

The argument is that the operating system user that postgres is running as, perhaps user "postgres", can read the files in the $PGDATA directory, but Bob can only see the MVCC view of the data, not the raw data. Installing contrib/amcheck allows Bob to get a peak behind the curtain.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Michael Paquier
michael@paquier.xyz
In reply to: Mark Dilger (#17)
Re: pg_amcheck option to install extension

On Mon, Apr 19, 2021 at 10:31:18PM -0700, Mark Dilger wrote:

I think you are conflating the concept of an operating system
adminstrator with the concept of the database superuser/owner. If
the operating system user that postgres is running as cannot execute
any binaries, then "copy from program" is not a way for a database
admistrator to escape the jail. If Bob does not have ssh access to
the system, he cannot run pg_basebackup.

You don't need much to be able to take a base backup once you have a
connection to the backend as long as your have the permissions to do
so. In this case that would be just the replication permissions.

The argument is that the operating system user that postgres is
running as, perhaps user "postgres", can read the files in the
$PGDATA directory, but Bob can only see the MVCC view of the data,
not the raw data. Installing contrib/amcheck allows Bob to get a
peak behind the curtain.

In my world, a superuser is considered as an entity holding the same
rights as the OS user running the PostgreSQL instance, so that's wider
than the definition you are thinking of here. There are many fancy
things one can do in this case, just to name a few that give access to
any files of the data directory or even other paths:
- pg_read_file(), and take the equivalent of a base backup with a
RECURSIVE CTE.
- BASE_BACKUP, doable from a simple psql session with a replication
connection.
- Untrusted languages.

So I don't understand your argument with amcheck here because any of
its features *requires* superuser rights unless granted. Coming back
to your example, Alice actually gave up the control of her database to
Bob the moment she gave him superuser rights. If she really wanted to
protect her privacy, she'd better think about a more restricted set of
ACLs for Bob before letting him manage her data, even if she considers
herself "safe" after she deleted it, but she's really not safe by any
means. I still stand with the point of upthread to put all that in
contrib/ for now, without discarding that this could be moved
somewhere else in the future.
--
Michael

#19Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#11)
Re: pg_amcheck option to install extension

On Mon, Apr 19, 2021 at 2:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:

There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.

I thought that at first, too. But then I realized that those programs
are oid2name and vacuumlo. And oid2name, at least, seems like
something we ought to just consider removing. It's unclear why this is
something that really deserves a command-line utility rather than just
some additional psql options or something. Does anyone really use it?

vacuumlo isn't that impressive either, since it makes the very tenuous
assumption that an oid column is intended to reference a large object,
and the documentation doesn't even acknowledge what a shaky idea that
actually is. But I suspect it has much better chances of being useful
in practice than oid2name. In fact, I've heard of people using it and,
I think, finding it useful, so we probably don't want to just nuke it.

But the point is, as things stand today, almost everything in contrib
is an extension, not a binary. And we might want to view the
exceptions as loose ends to be cleaned up, rather than a pattern to
emulate.

It's a judgement call, though.

--
Robert Haas
EDB: http://www.enterprisedb.com

#20Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#19)
Re: pg_amcheck option to install extension

On Tue, Apr 20, 2021 at 2:47 PM Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Apr 19, 2021 at 2:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:

There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.

I thought that at first, too. But then I realized that those programs
are oid2name and vacuumlo. And oid2name, at least, seems like
something we ought to just consider removing. It's unclear why this is
something that really deserves a command-line utility rather than just
some additional psql options or something. Does anyone really use it?

Yeah, this seems like it could relatively simply just be a SQL query in psql.

vacuumlo isn't that impressive either, since it makes the very tenuous
assumption that an oid column is intended to reference a large object,
and the documentation doesn't even acknowledge what a shaky idea that
actually is. But I suspect it has much better chances of being useful
in practice than oid2name. In fact, I've heard of people using it and,
I think, finding it useful, so we probably don't want to just nuke it.

Yes, I've definitely run into using vacuumlo many times.

But the point is, as things stand today, almost everything in contrib
is an extension, not a binary. And we might want to view the
exceptions as loose ends to be cleaned up, rather than a pattern to
emulate.

I could certainly sign up for moving vacuumlo to bin/ and replacing
oid2name with something in psql for example.

(But yes, I realize this rapidly turns into another instance of the
bikeshedding about the future of contrib..)

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#21Robert Haas
robertmhaas@gmail.com
In reply to: Mark Dilger (#17)
Re: pg_amcheck option to install extension

On Tue, Apr 20, 2021 at 1:31 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

I think you are conflating the concept of an operating system adminstrator with the concept of the database superuser/owner.

You should conflate those things, because there's no meaningful
privilege boundary between them:

http://rhaas.blogspot.com/2020/12/cve-2019-9193.html

If reading the whole thing is too much, scroll down to the part in
fixed-width font and behold me trivially compromising the OS account
using plperlu.

I actually think this is a design error on our part. A lot of people,
apparently including you, feel that there should be a privilege
boundary between the PostgreSQL superuser and the OS user, or want
such a boundary to exist. It would be quite useful if there were a
boundary there, because it's entirely reasonable to want to have a
user who is allowed to do everything with the database except escape
into the OS account, and I can't think of any reason why we couldn't
set things up so that this is possible. We'd have to bar some things
that the superuser can currently do, like directly modify system
tables and use COPY TO/FROM PROGRAM, but there's a lot of things we
could allow too, like reading all the data and creating and deleting
accounts and setting their permissions arbitrarily, except maybe for
any special super-DUPER users who are allowed to do things that escape
the sandbox.

Now it would take a fair amount of work to make that distinction in a
rigorous way and figure out exactly what the design ought to be, and
I'm not volunteering. But I bet a lot of people would like it.

--
Robert Haas
EDB: http://www.enterprisedb.com

#22Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Michael Paquier (#12)
Re: pg_amcheck option to install extension

On 2021-Apr-20, Michael Paquier wrote:

Agreed. Something like src/extensions/ would be a tempting option,
but I don't think that it is a good idea to introduce a new piece of
infrastructure at this stage, so moving both to contrib/ would be the
best balance with the current pieces at hand.

Actually I think the best balance would be to leave things where they
are, and move amcheck to src/extensions/ once the next devel cycle
opens. That way, we avoid the (pretty much pointless) laborious task of
moving pg_amcheck to contrib only to move it back on the next cycle.

What I'm afraid of, if we move pg_amcheck to contrib, is that during the
next cycle people will say that they are both perfectly fine in contrib/
and so we don't need to move anything anywhere. And next time someone
wants to create a new extension that would be perfectly fine in core,
they will not want to have that one be the one that creates
src/extensions/, because then that in itself is a contentious point that
can get the whole patch rejected.

In a sense, what I'm doing is support the idea that "incremental
development" applies to procedure too.

--
�lvaro Herrera 39�49'30"S 73�17'W

#23Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#19)
Re: pg_amcheck option to install extension

On 4/20/21 8:47 AM, Robert Haas wrote:

On Mon, Apr 19, 2021 at 2:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:

There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.

I thought that at first, too. But then I realized that those programs
are oid2name and vacuumlo. And oid2name, at least, seems like
something we ought to just consider removing. It's unclear why this is
something that really deserves a command-line utility rather than just
some additional psql options or something. Does anyone really use it?

vacuumlo isn't that impressive either, since it makes the very tenuous
assumption that an oid column is intended to reference a large object,
and the documentation doesn't even acknowledge what a shaky idea that
actually is. But I suspect it has much better chances of being useful
in practice than oid2name. In fact, I've heard of people using it and,
I think, finding it useful, so we probably don't want to just nuke it.

But the point is, as things stand today, almost everything in contrib
is an extension, not a binary. And we might want to view the
exceptions as loose ends to be cleaned up, rather than a pattern to
emulate.

It's a judgement call, though.

Yeah. I'll go along with Alvaro and say let's let sleeping dogs lie at
this stage of the dev process, and pick the discussion up after we branch.

I will just note one thing: the binaries in contrib have one important
function that hasn't been mentioned, namely to test using pgxs to build
binaries. That doesn't have to live in contrib, but we should have
something that does that somewhere in the build process, so if we
remmove oid2name and vacuumlo from contrib we need to look into that.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#22)
Re: pg_amcheck option to install extension

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Actually I think the best balance would be to leave things where they
are, and move amcheck to src/extensions/ once the next devel cycle
opens. That way, we avoid the (pretty much pointless) laborious task of
moving pg_amcheck to contrib only to move it back on the next cycle.

What I'm afraid of, if we move pg_amcheck to contrib, is that during the
next cycle people will say that they are both perfectly fine in contrib/
and so we don't need to move anything anywhere.

Indeed. But I'm down on this idea of inventing src/extensions,
because then there will constantly be questions about whether FOO
belongs in contrib/ or src/extensions/. Unless we just move
everything there, and then the question becomes why bother. Sure,
"contrib" is kind of a legacy name, but PG is full of legacy names.

regards, tom lane

#25Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Robert Haas (#21)
Re: pg_amcheck option to install extension

On Apr 20, 2021, at 5:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 20, 2021 at 1:31 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

I think you are conflating the concept of an operating system adminstrator with the concept of the database superuser/owner.

You should conflate those things, because there's no meaningful
privilege boundary between them:

This discussion started in response to the idea that pg_amcheck needs to be moved into contrib, presumably because that's where amcheck lives. I am arguing against the move.

The actual use case I have in mind is "Postgres as a service", where a company (Alice) rents space in the cloud and runs postgres databases which can be rented out to a tenant (Bob) who is the owner of his database, but not privileged on the underlying system in any way. Bob has enough privileges to run CREATE EXTENSION, but is limited to the extensions that Alice has made available. Alice evaluates packages and chooses not to install most of them, including amcheck. Or perhaps Alice chooses not to install any contrib modules. Either way, the location of amcheck in contrib is useful to Alice because it makes her choice not to install it very simple.

Bob, however, is connecting to databases provided by Alice, and is not trying to limit himself. He is happy to have the pg_amcheck client installed. If Alice's databases don't allow him to run amcheck, pg_amcheck is not useful relative to those databases, but perhaps Bob is also renting database space from Charlie and Charlie's databases have amcheck installed.

Now, the question is, "In which postgres package does Bob think pg_amcheck should reside?" It would be strange to say that Bob needs to install the postgresql-contrib rpm in order to get the pg_amcheck client. That rpm is mostly a bunch of modules, and may even have a package dependency on postgresql-server. Bob doesn't want either of those. He just wants the clients.

The discussion about using amcheck as a privilege escalation attack was mostly to give some background for why Alice might not want to install amcheck. I think it got a bit out of hand, in no small part because I was being imprecise about Bob's exact privilege levels. I was being imprecise about that part because my argument wasn't "here's how to leverage amcheck to exploit postgres", but rather, "here's what Alice might rationally be concerned about." To run CREATE EXTENSION does not actually require superuser privileges. It depends on the package. At the moment, you can't load amcheck without superuser privileges, but you can load some others, such as intarray:

bob=> create extension amcheck;
2021-04-20 07:40:46.758 PDT [80340] ERROR: permission denied to create extension "amcheck"
2021-04-20 07:40:46.758 PDT [80340] HINT: Must be superuser to create this extension.
2021-04-20 07:40:46.758 PDT [80340] STATEMENT: create extension amcheck;
ERROR: permission denied to create extension "amcheck"
HINT: Must be superuser to create this extension.
bob=> create extension intarray;
CREATE EXTENSION
bob=>

Alice might prefer to avoid installing amcheck altogether, not wanting to have to evaluate on each upgrade whether the privileges necessary to load amcheck have changed.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#26Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#24)
Re: pg_amcheck option to install extension

On 4/20/21 11:09 AM, Tom Lane wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

Actually I think the best balance would be to leave things where they
are, and move amcheck to src/extensions/ once the next devel cycle
opens. That way, we avoid the (pretty much pointless) laborious task of
moving pg_amcheck to contrib only to move it back on the next cycle.
What I'm afraid of, if we move pg_amcheck to contrib, is that during the
next cycle people will say that they are both perfectly fine in contrib/
and so we don't need to move anything anywhere.

Indeed. But I'm down on this idea of inventing src/extensions,
because then there will constantly be questions about whether FOO
belongs in contrib/ or src/extensions/. Unless we just move
everything there, and then the question becomes why bother. Sure,
"contrib" is kind of a legacy name, but PG is full of legacy names.

I think the distinction I would draw is between things we would expect
to be present in every Postgres installation (e.g. pg_stat_statements,
auto_explain, postgres_fdw, hstore) and things we don't for one reason
or another (e.g. pgcrypto, adminpack)

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#26)
Re: pg_amcheck option to install extension

Andrew Dunstan <andrew@dunslane.net> writes:

On 4/20/21 11:09 AM, Tom Lane wrote:

Indeed. But I'm down on this idea of inventing src/extensions,
because then there will constantly be questions about whether FOO
belongs in contrib/ or src/extensions/.

I think the distinction I would draw is between things we would expect
to be present in every Postgres installation (e.g. pg_stat_statements,
auto_explain, postgres_fdw, hstore) and things we don't for one reason
or another (e.g. pgcrypto, adminpack)

I dunno, that division appears quite arbitrary and endlessly
bikesheddable. It's something I'd prefer not to spend time
arguing about, but the only way we won't have such arguments
is if we don't make the distinction in the first place.

regards, tom lane

#28Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#27)
Re: pg_amcheck option to install extension

On Tue, Apr 20, 2021 at 12:05 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think the distinction I would draw is between things we would expect
to be present in every Postgres installation (e.g. pg_stat_statements,
auto_explain, postgres_fdw, hstore) and things we don't for one reason
or another (e.g. pgcrypto, adminpack)

I dunno, that division appears quite arbitrary and endlessly
bikesheddable.

+1. I wouldn't expect those things to be present in every
installation, for sure. I don't know that I've *ever* seen a customer
use hstore. If I have, it wasn't often. There's no way we'll ever get
consensus on which stuff people use, because it's different depending
on what customers you work with.

The stuff I feel bad about is stuff like 'isn' and 'earthdistance' and
'intarray', which are basically useless toys with low code quality.
You'd hate for people to confuse that with stuff like 'dblink' or
'pgcrypto' which might actually be useful. But there's a big, broad
fuzzy area in the middle where everyone is going to have different
opinions. And even things like 'isn' and 'earthdistance' and
'intarray' may well have defenders, either because somebody thinks
it's valuable as a coding example, or because somebody really did use
it in anger and had success.

--
Robert Haas
EDB: http://www.enterprisedb.com

#29Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Robert Haas (#21)
Privilege boundary between sysadmin and database superuser [Was: Re: pg_amcheck option to install extension]

On Apr 20, 2021, at 5:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 20, 2021 at 1:31 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

I think you are conflating the concept of an operating system adminstrator with the concept of the database superuser/owner.

You should conflate those things, because there's no meaningful
privilege boundary between them:

I understand why you say so, but I think the situation is more nuanced than that.

http://rhaas.blogspot.com/2020/12/cve-2019-9193.html

If reading the whole thing is too much, scroll down to the part in
fixed-width font and behold me trivially compromising the OS account
using plperlu.

I think the question here is whether PostgreSQL is inherently insecure, meaning that it cannot function unless installed in a way that would allow the database superuser Bob to compromise the OS administered by Alice.

Magnus seems to object even to this formulation in his blog post, https://blog.hagander.net/when-a-vulnerability-is-not-a-vulnerability-244/, saying "a common setup is to only allow the postgres OS user itself to act as superuser, in which case there is no escalation at all." He seems to view Bob taking over the OS account as nothing more than Alice taking over her own account, since nobody but Alice should ever be able to log in as Bob. At a minimum, I think that means that Alice must trust PostgreSQL to contain zero exploits. If database user Charlie can escalate his privileges to the level of Bob, then Alice has a big problem. Assuming Alice is an average prudent system administrator, she doesn't really want to trust that PostgreSQL is completely exploit free. She just wants to quarantine it enough that she can sleep at night.

I think we have made the situation for Alice a bit difficult. She needs to make sure that whichever user the backend runs as does not have permission to access anything beyond the PGDATA directory and a handful of postgres binaries, otherwise Bob, and perhaps Charlie, can access them. She can do this most easily with containers, or at least it seems so to me. The only binaries that should be executable from within the container are "postgres", "locale", and whichever hardened archive command, recovery command, and restore command Alice wants to allow. The only shell that should be executable from within the container should be minimal, maybe something custom written by Alice that only works to recognize the very limited set of commands Alice wants to allow and then forks/execs those commands without allowing any further shell magic. "Copy to program" and "copy from program" internally call popen, which calls the shell, and if Alice's custom shell doesn't offer to pipe anything to the target program, Bob can't really do anything that way. "locale -a" doesn't seem particularly vulnerable to being fed garbage, and in any event, Alice's custom shell doesn't have to implement the pipe stream logic in that direction. She could make it unidirectional from `locale -a` back to postgres. The archive, recovery, and restore commands are internally invoked using system() which calls those commands indirectly using Alice's shell. Once again, she could write the shell to not pipe anything in either direction, which pretty well prevents Bob from doing anything malicious with them.

Reading and writing postgresql data files seems a much trickier problem. The "copy to file" and "copy from file" implementations don't go through the shell, and Alice can't deny the database reading or writing the data directory, so there doesn't seem to be any quarantine trick that will work. Bob can copy arbitrary malicious content to or from that directory. I don't see how this gets Bob any closer to compromising the OS account, though. All Bob is doing is messing up his own database. Even if corrupting these files convinces the postgres backend to attempt to write somewhere else in the system, the container should be sufficient to prevent it from actually succeeding outside its own data directory.

The issue of the pg_read_file() sql function, and similar functions, would seem to fall into the same category as "copy to file" and "copy from file". Bob can read and write his own data directory, but not anything else, assuming Alice set up the container properly.

I actually think this is a design error on our part. A lot of people,
apparently including you, feel that there should be a privilege
boundary between the PostgreSQL superuser and the OS user, or want
such a boundary to exist.

I'm arguing that the boundary does currently (almost) exist, but is violated by default, easy to further violate without realizing you are doing so, inconvenient and hard to maintain in practice, requires segregating the database superuser from whichever adminstrator(s) execute other tools, requires being paranoid when running such tools against the database because any content found therein could have been maliciously corrupted by the database administrator in a way that you are not expecting, requires a container or chroot jail and a custom shell, and this whole mess should not be made any more difficult.

We could make this incrementally easier by finding individual problems which have solutions generally acceptable to the community and tackling them one at a time. I don't see there will be terribly many such solutions, though, if the community sees no value in putting a boundary between Bob and Alice.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#29)
Re: Privilege boundary between sysadmin and database superuser [Was: Re: pg_amcheck option to install extension]

Mark Dilger <mark.dilger@enterprisedb.com> writes:

On Apr 20, 2021, at 5:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Apr 20, 2021 at 1:31 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

I think you are conflating the concept of an operating system adminstrator with the concept of the database superuser/owner.

You should conflate those things, because there's no meaningful
privilege boundary between them:

I understand why you say so, but I think the situation is more nuanced than that.

Maybe I too am confused, but I understand "operating system administrator"
to mean "somebody who has root, or some elevated OS privilege level, on
the box running Postgres". That is 100% distinct from the operating
system user that runs Postgres, which should generally be a bog-standard
OS user. (In fact, we try to prevent you from running Postgres as root.)

What there is not a meaningful privilege boundary between is that standard
OS user and a within-the-database superuser. Since we allow superusers to
trigger file reads and writes, and indeed execute programs, from within
the DB, a superuser can surely reach anything the OS user can do.

The rest of your analysis seems a bit off-point to me, which is what
makes me think that one of us is confused. If Alice is storing her
data in a Postgres database, she had better trust both the Postgres
superuser and the box's administrators ... otherwise, she should go
get her own box and her own Postgres installation.

regards, tom lane

#31Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Tom Lane (#30)
Re: Privilege boundary between sysadmin and database superuser [Was: Re: pg_amcheck option to install extension]

On Apr 20, 2021, at 3:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Mark Dilger <mark.dilger@enterprisedb.com> writes:

On Apr 20, 2021, at 5:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Apr 20, 2021 at 1:31 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

I think you are conflating the concept of an operating system adminstrator with the concept of the database superuser/owner.

You should conflate those things, because there's no meaningful
privilege boundary between them:

I understand why you say so, but I think the situation is more nuanced than that.

Maybe I too am confused, but I understand "operating system administrator"
to mean "somebody who has root, or some elevated OS privilege level, on
the box running Postgres". That is 100% distinct from the operating
system user that runs Postgres, which should generally be a bog-standard
OS user. (In fact, we try to prevent you from running Postgres as root.)

What there is not a meaningful privilege boundary between is that standard
OS user and a within-the-database superuser. Since we allow superusers to
trigger file reads and writes, and indeed execute programs, from within
the DB, a superuser can surely reach anything the OS user can do.

Right. This is the part that Alice might want to restrict, and we don't have an easy way for her to do so.

The rest of your analysis seems a bit off-point to me, which is what
makes me think that one of us is confused. If Alice is storing her
data in a Postgres database, she had better trust both the Postgres
superuser and the box's administrators ... otherwise, she should go
get her own box and her own Postgres installation.

It is the other way around. Alice is the operating system administrator who doesn't trust Bob. She wants Bob to be able to do any database thing he wants within the PostgreSQL environment, but doesn't want that to leak out as an ability to run arbitrary stuff on the system, not even if it's just stuff running as bog-standard user "postgres". In my view, Alice can accomplish this goal using a very tightly designed container, but it is far from easy to do and to get right.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#32Stephen Frost
sfrost@snowman.net
In reply to: Mark Dilger (#31)
Re: Privilege boundary between sysadmin and database superuser [Was: Re: pg_amcheck option to install extension]

Greetings,

* Mark Dilger (mark.dilger@enterprisedb.com) wrote:

On Apr 20, 2021, at 3:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
The rest of your analysis seems a bit off-point to me, which is what
makes me think that one of us is confused. If Alice is storing her
data in a Postgres database, she had better trust both the Postgres
superuser and the box's administrators ... otherwise, she should go
get her own box and her own Postgres installation.

It is the other way around. Alice is the operating system administrator who doesn't trust Bob. She wants Bob to be able to do any database thing he wants within the PostgreSQL environment, but doesn't want that to leak out as an ability to run arbitrary stuff on the system, not even if it's just stuff running as bog-standard user "postgres". In my view, Alice can accomplish this goal using a very tightly designed container, but it is far from easy to do and to get right.

Then Bob doesn't get to be a superuser.

There's certainly some capabilities that aren't able to be GRANT'd out
today and which are reserved for superusers, but there's been ongoing
work to improve on that situation (pg_read_all_data being one of the
recent improvements in this area, in fact...). Certainly, if others are
interested in that then it'd be great to have more folks working to
improve the situation.

We do need to make it clear when a given capability isn't intended to
allow a user who has that capability to be able to become a superuser
and when the capability itself means that they would be able to. The
predefined role 'pg_execute_server_program' grants out the capability to
execute programs on the server, which both allows a user to become a
superuser if they wished, and goes against your stated requirement that
Bob isn't allowed to do that, so that predefined role shouldn't be
GRANT'd to Bob.

The question is: what do you wish Bob could do, as a non-superuser, that
Bob isn't able to do today? Assuming that there's a set of capabilities
there that both wouldn't allow Bob to become a superuser (which implies
that Bob can't do things like execute arbitrary programs or read/write
arbitrary files on the server) and which would allow Bob to perform the
actions you'd like Bob to be able to do, it's mostly a matter of
programming to make it happen...

Thanks,

Stephen

#33Andrew Dunstan
andrew@dunslane.net
In reply to: Mark Dilger (#4)
Re: pg_amcheck option to install extension

On 4/18/21 5:58 PM, Mark Dilger wrote:

On Apr 18, 2021, at 6:19 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

how about specifying pg_catalog as the schema instead of public?

Done.

Pushed with slight changes.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com