pg_amcheck option to install extension
Hi,
Peter Geoghegan suggested that I have the cross version upgrade checker
run pg_amcheck on the upgraded module. This seemed to me like a good
idea, so I tried it, only to find that it refuses to run unless the
amcheck extension is installed. That's fair enough, but it also seems to
me like we should have an option to have pg_amcheck install the
extension is it's not present, by running something like 'create
extension if not exists amcheck'. Maybe in such a case there could also
be an option to drop the extension when pg_amcheck's work is done - I
haven't thought through all the implications.
Given pg_amcheck is a new piece of work I'm not sure if we can sneak
this in under the wire for release 14. I will certainly undertake to
review anything expeditiously. I can work around this issue in the
buildfarm, but it seems like something other users are likely to want.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Apr 16, 2021, at 11:06 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
Hi,
Peter Geoghegan suggested that I have the cross version upgrade checker
run pg_amcheck on the upgraded module. This seemed to me like a good
idea, so I tried it, only to find that it refuses to run unless the
amcheck extension is installed. That's fair enough, but it also seems to
me like we should have an option to have pg_amcheck install the
extension is it's not present, by running something like 'create
extension if not exists amcheck'. Maybe in such a case there could also
be an option to drop the extension when pg_amcheck's work is done - I
haven't thought through all the implications.Given pg_amcheck is a new piece of work I'm not sure if we can sneak
this in under the wire for release 14. I will certainly undertake to
review anything expeditiously. I can work around this issue in the
buildfarm, but it seems like something other users are likely to want.
We cannot quite use a "create extension if not exists amcheck" command, as we clear the search path and so must specify the schema to use. Should we instead avoid clearing the search path for this? What are the security implications of using the first schema of the search path?
When called as `pg_amcheck --install-missing`, the implementation in the attached patch runs per database being checked a "create extension if not exists amcheck with schema public". If called as `pg_amcheck --install-missing=foo`, it instead runs "create extension if not exists amcheck with schema foo` having escaped "foo" appropriately for the given database. There is no option to use different schemas for different databases. Nor is there any option to use the search path. If somebody needs that, I think they can manage installing amcheck themselves.
Does this meet your needs for v14? I'd like to get this nailed down quickly, as it is unclear to me that we should even be doing this so late in the development cycle.
I'd also like your impressions on whether we're likely to move contrib/amcheck into core anytime soon. If so, is it worth adding an option that we'll soon need to deprecate?
Attachments:
v1-0001-Adding-install-missing-option-to-pg_amcheck.patchapplication/octet-stream; name=v1-0001-Adding-install-missing-option-to-pg_amcheck.patch; x-unix-mode=0644Download+61-1
On 4/17/21 3:43 PM, Mark Dilger wrote:
On Apr 16, 2021, at 11:06 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
Hi,
Peter Geoghegan suggested that I have the cross version upgrade checker
run pg_amcheck on the upgraded module. This seemed to me like a good
idea, so I tried it, only to find that it refuses to run unless the
amcheck extension is installed. That's fair enough, but it also seems to
me like we should have an option to have pg_amcheck install the
extension is it's not present, by running something like 'create
extension if not exists amcheck'. Maybe in such a case there could also
be an option to drop the extension when pg_amcheck's work is done - I
haven't thought through all the implications.Given pg_amcheck is a new piece of work I'm not sure if we can sneak
this in under the wire for release 14. I will certainly undertake to
review anything expeditiously. I can work around this issue in the
buildfarm, but it seems like something other users are likely to want.We cannot quite use a "create extension if not exists amcheck" command, as we clear the search path and so must specify the schema to use. Should we instead avoid clearing the search path for this? What are the security implications of using the first schema of the search path?
When called as `pg_amcheck --install-missing`, the implementation in the attached patch runs per database being checked a "create extension if not exists amcheck with schema public". If called as `pg_amcheck --install-missing=foo`, it instead runs "create extension if not exists amcheck with schema foo` having escaped "foo" appropriately for the given database. There is no option to use different schemas for different databases. Nor is there any option to use the search path. If somebody needs that, I think they can manage installing amcheck themselves.
how about specifying pg_catalog as the schema instead of public?
Does this meet your needs for v14? I'd like to get this nailed down quickly, as it is unclear to me that we should even be doing this so late in the development cycle.
I'm ok with or without - I'll just have the buildfarm client pull a list
of databases and install the extension in all of them.
I'd also like your impressions on whether we're likely to move contrib/amcheck into core anytime soon. If so, is it worth adding an option that we'll soon need to deprecate?
I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Apr 18, 2021, at 6:19 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
how about specifying pg_catalog as the schema instead of public?
Done.
Attachments:
v2-0001-Adding-install-missing-option-to-pg_amcheck.patchapplication/octet-stream; name=v2-0001-Adding-install-missing-option-to-pg_amcheck.patch; x-unix-mode=0644Download+61-1
On 2021-Apr-18, Andrew Dunstan wrote:
On 4/17/21 3:43 PM, Mark Dilger wrote:
I'd also like your impressions on whether we're likely to move
contrib/amcheck into core anytime soon. If so, is it worth adding
an option that we'll soon need to deprecate?I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.
I've proposed in the past that we should have a way to provide
extensions other than contrib -- specifically src/extensions/ -- and
then have those extensions installed together with the rest of core.
Then it would be perfectly legitimate to have src/bin/pg_amcheck that
depending that extension. I agree that the current situation is not
great.
--
�lvaro Herrera 39�49'30"S 73�17'W
"Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end." (2nd Commandment for C programmers)
On 4/18/21 7:32 PM, Alvaro Herrera wrote:
On 2021-Apr-18, Andrew Dunstan wrote:
On 4/17/21 3:43 PM, Mark Dilger wrote:
I'd also like your impressions on whether we're likely to move
contrib/amcheck into core anytime soon. If so, is it worth adding
an option that we'll soon need to deprecate?I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.I've proposed in the past that we should have a way to provide
extensions other than contrib -- specifically src/extensions/ -- and
then have those extensions installed together with the rest of core.
Then it would be perfectly legitimate to have src/bin/pg_amcheck that
depending that extension. I agree that the current situation is not
great.
OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Apr 19, 2021, at 9:32 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
On 4/18/21 7:32 PM, Alvaro Herrera wrote:
On 2021-Apr-18, Andrew Dunstan wrote:
On 4/17/21 3:43 PM, Mark Dilger wrote:
I'd also like your impressions on whether we're likely to move
contrib/amcheck into core anytime soon. If so, is it worth adding
an option that we'll soon need to deprecate?I think if it stays as an extension it will stay in contrib. But it sure
feels very odd to have a core bin program that relies on a contrib
extension. It seems one or the other is misplaced.I've proposed in the past that we should have a way to provide
extensions other than contrib -- specifically src/extensions/ -- and
then have those extensions installed together with the rest of core.
Then it would be perfectly legitimate to have src/bin/pg_amcheck that
depending that extension. I agree that the current situation is not
great.OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.
Ah, no. I wrote pg_amcheck in contrib originally, and moved it to src/bin as requested during the v14 development cycle.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Apr 19, 2021 at 12:37 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.Ah, no. I wrote pg_amcheck in contrib originally, and moved it to src/bin as requested during the v14 development cycle.
Yeah, I am not that excited about moving this again. I realize it was
never committed anywhere else, but it was moved at least one during
development. And I don't see that moving it to contrib really fixes
anything anyway here, except perhaps conceptually. Maybe inventing
src/extensions is the right idea, but there's no real need to do that
at this point in the release cycle, and it doesn't actually fix
anything either. The only thing that's really needed here is to either
(a) teach the test script to install amcheck as a separate step or (b)
teach pg_amcheck to install amcheck in a user-specified schema. If we
do that, AIUI, this issue is fixed regardless of whether we move any
source code around, and if we don't do that, AIUI, this issue is not
fixed no matter how much source code we move.
--
Robert Haas
EDB: http://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.
FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.
Either way, though, you'll still need the proposed option to
let the executable issue a CREATE EXTENSION to get the shlib
loaded. Unless somebody is proposing that the extension be
installed-by-default like plpgsql, and that I am unequivocally
not for.
regards, tom lane
On Apr 19, 2021, at 9:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.
That was my original thought also, largely from a package management perspective. Just as an example, postgresql-client and postgresql-contrib are separate rpms. There isn't much point to having pg_amcheck installed as part of the postgresql-client package while having amcheck in the postgresql-contrib package which might not be installed.
A counter argument is that amcheck is server side, and pg_amcheck is client side. Having pg_amcheck installed on a system makes sense if you are connecting to a server on a different system.
On Mar 11, 2021, at 12:12 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
I want to register, if we are going to add this, it ought to be in src/bin/. If we think it's a useful tool, it should be there with all the other useful tools.
I realize there is a dependency on a module in contrib, and it's probably now not the time to re-debate reorganizing contrib. But if we ever get to that, this program should be the prime example why the current organization is problematic, and we should be prepared to make the necessary moves then.
This was the request that motivated the move to src/bin.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 4/19/21 1:25 PM, Mark Dilger wrote:
On Apr 19, 2021, at 9:53 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
OK, so let's fix it. If amcheck is going to stay in contrib then ISTM
pg_amcheck should move there. I can organize that if there's agreement.
Or else let's move amcheck as Alvaro suggests.FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.That was my original thought also, largely from a package management perspective. Just as an example, postgresql-client and postgresql-contrib are separate rpms. There isn't much point to having pg_amcheck installed as part of the postgresql-client package while having amcheck in the postgresql-contrib package which might not be installed.
A counter argument is that amcheck is server side, and pg_amcheck is client side. Having pg_amcheck installed on a system makes sense if you are connecting to a server on a different system.
There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.
On Mar 11, 2021, at 12:12 AM, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
I want to register, if we are going to add this, it ought to be in src/bin/. If we think it's a useful tool, it should be there with all the other useful tools.
I realize there is a dependency on a module in contrib, and it's probably now not the time to re-debate reorganizing contrib. But if we ever get to that, this program should be the prime example why the current organization is problematic, and we should be prepared to make the necessary moves then.
This was the request that motivated the move to src/bin.
I missed that, so I guess maybe I can't complain too loudly. But if I'd
seen it I would have disagreed. :-)
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Mon, Apr 19, 2021 at 12:53:29PM -0400, Tom Lane wrote:
FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.Either way, though, you'll still need the proposed option to
let the executable issue a CREATE EXTENSION to get the shlib
loaded. Unless somebody is proposing that the extension be
installed-by-default like plpgsql, and that I am unequivocally
not for.
Agreed. Something like src/extensions/ would be a tempting option,
but I don't think that it is a good idea to introduce a new piece of
infrastructure at this stage, so moving both to contrib/ would be the
best balance with the current pieces at hand.
--
Michael
On Apr 19, 2021, at 6:41 PM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Apr 19, 2021 at 12:53:29PM -0400, Tom Lane wrote:
FWIW, I think that putting them both in contrib makes the most
sense from a structural standpoint.Either way, though, you'll still need the proposed option to
let the executable issue a CREATE EXTENSION to get the shlib
loaded. Unless somebody is proposing that the extension be
installed-by-default like plpgsql, and that I am unequivocally
not for.Agreed. Something like src/extensions/ would be a tempting option,
but I don't think that it is a good idea to introduce a new piece of
infrastructure at this stage, so moving both to contrib/ would be the
best balance with the current pieces at hand.
There is another issue to consider. Installing pg_amcheck in no way opens up an avenue of attack that I can see. It is just a client application with no special privileges. But installing amcheck arguably opens a line of attack; not one as significant as installing pageinspect, but of the same sort. Amcheck allows privileged database users to potentially get information from the tables that would otherwise be invisible even to them according to mvcc rules. (Is this already the case via some other functionality? Maybe this security problem already exists?) If the privileged database user has file system access, then this is not at all concerning, since they can already just open the files in a tool of their choice, but I don't see any reason why installations should require that privileged database users also be privileged to access the file system.
If you are not buying my argument here, perhaps a reference to how encryption functions are evaluated might help you see my point of view. You don't ask, "can the attacker recover the plain text from the encrypted text", but rather, "can the attacker tell the difference between encrypted plain text and encrypted random noise." That's because it is incredibly hard to reason about what an attacker might be able to learn. Even though learning about old data using amcheck would be hard, you can't say that an attacker would never be able to recover information about deleted rows. As such, security conscious installations are within reason to refuse to install it.
Since amcheck (and to a much larger extent, pageinspect) open potential data leakage issues, it makes sense for some security conscious sites to refuse to install it. pg_amcheck on the other hand could be installed everywhere. I understand why it might *feel* like pg_amcheck and amcheck have to both be installed to work, but I don't think that point of view makes much sense in reality. The computer running the client and the computer running the server are frequently not the same computer.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Apr 19, 2021 at 07:15:23PM -0700, Mark Dilger wrote:
There is another issue to consider. Installing pg_amcheck in no way
opens up an avenue of attack that I can see. It is just a client
application with no special privileges. But installing amcheck
arguably opens a line of attack; not one as significant as
installing pageinspect, but of the same sort. Amcheck allows
privileged database users to potentially get information from the
tables that would otherwise be invisible even to them according to
mvcc rules. (Is this already the case via some other functionality?
Maybe this security problem already exists?) If the privileged
database user has file system access, then this is not at all
concerning, since they can already just open the files in a tool of
their choice, but I don't see any reason why installations should
require that privileged database users also be privileged to access
the file system.
By default, any functions deployed with amcheck have their execution
rights revoked from public, meaning that only a superuser can run them
with a default installation. A non-superuser could execute them only
once GRANT'd access to them.
--
Michael
On Apr 19, 2021, at 8:06 PM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Apr 19, 2021 at 07:15:23PM -0700, Mark Dilger wrote:
There is another issue to consider. Installing pg_amcheck in no way
opens up an avenue of attack that I can see. It is just a client
application with no special privileges. But installing amcheck
arguably opens a line of attack; not one as significant as
installing pageinspect, but of the same sort. Amcheck allows
privileged database users to potentially get information from the
tables that would otherwise be invisible even to them according to
mvcc rules. (Is this already the case via some other functionality?
Maybe this security problem already exists?) If the privileged
database user has file system access, then this is not at all
concerning, since they can already just open the files in a tool of
their choice, but I don't see any reason why installations should
require that privileged database users also be privileged to access
the file system.By default, any functions deployed with amcheck have their execution
rights revoked from public, meaning that only a superuser can run them
with a default installation. A non-superuser could execute them only
once GRANT'd access to them.
Correct. So having amcheck installed on the system provides the database superuser with a privilege escalation attack vector. I am assuming here the database superuser is not a privileged system user, and can only log in remotely, has no direct access to the file system, etc.
Alice has a database with sensitive data. She hires Bob to be her new database admin, with superuser privilege, but doesn't want Bob to see the sensitive data, so she deletes it first. The data is dead but still on disk.
Bob discovers a bug in postgres that will corrupt dead rows that some index is still pointing at. This attack requires sufficient privilege to trigger the bug, but presumably he has that much privilege, because he is a database superuser. Let's call this attack C(x) where "C" means the corruption inducing function, and "x" is the indexed key for which dead rows will be corrupted.
Bob runs "CREATE EXTENSION amcheck", and then successively runs C(x) followed by amcheck for each interesting value of x. Bob learns which of these values were in the system before Alice deleted them.
This is a classic privilege escalation attack. Bob has one privilege, and uses it to get another.
Alice might foresee this behavior from Bob and choose not to install contrib/amcheck. This is paranoid on her part, but does not cross the line into insanity.
The postgres community has every reason to keep amcheck in contrib so that users such as Alice can make this decision.
No similar argument has been put forward for why pg_amcheck should be kept in contrib.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Apr 19, 2021 at 08:39:06PM -0700, Mark Dilger wrote:
This is a classic privilege escalation attack. Bob has one
privilege, and uses it to get another.
Bob is a superuser, so it has all the privileges of the world for this
instance. In what is that different from BASE_BACKUP or just COPY
FROM PROGRAM?
I am not following your argument here.
--
Michael
On Apr 19, 2021, at 9:22 PM, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Apr 19, 2021 at 08:39:06PM -0700, Mark Dilger wrote:
This is a classic privilege escalation attack. Bob has one
privilege, and uses it to get another.Bob is a superuser, so it has all the privileges of the world for this
instance. In what is that different from BASE_BACKUP or just COPY
FROM PROGRAM?
I think you are conflating the concept of an operating system adminstrator with the concept of the database superuser/owner. If the operating system user that postgres is running as cannot execute any binaries, then "copy from program" is not a way for a database admistrator to escape the jail. If Bob does not have ssh access to the system, he cannot run pg_basebackup.
I am not following your argument here.
The argument is that the operating system user that postgres is running as, perhaps user "postgres", can read the files in the $PGDATA directory, but Bob can only see the MVCC view of the data, not the raw data. Installing contrib/amcheck allows Bob to get a peak behind the curtain.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Apr 19, 2021 at 10:31:18PM -0700, Mark Dilger wrote:
I think you are conflating the concept of an operating system
adminstrator with the concept of the database superuser/owner. If
the operating system user that postgres is running as cannot execute
any binaries, then "copy from program" is not a way for a database
admistrator to escape the jail. If Bob does not have ssh access to
the system, he cannot run pg_basebackup.
You don't need much to be able to take a base backup once you have a
connection to the backend as long as your have the permissions to do
so. In this case that would be just the replication permissions.
The argument is that the operating system user that postgres is
running as, perhaps user "postgres", can read the files in the
$PGDATA directory, but Bob can only see the MVCC view of the data,
not the raw data. Installing contrib/amcheck allows Bob to get a
peak behind the curtain.
In my world, a superuser is considered as an entity holding the same
rights as the OS user running the PostgreSQL instance, so that's wider
than the definition you are thinking of here. There are many fancy
things one can do in this case, just to name a few that give access to
any files of the data directory or even other paths:
- pg_read_file(), and take the equivalent of a base backup with a
RECURSIVE CTE.
- BASE_BACKUP, doable from a simple psql session with a replication
connection.
- Untrusted languages.
So I don't understand your argument with amcheck here because any of
its features *requires* superuser rights unless granted. Coming back
to your example, Alice actually gave up the control of her database to
Bob the moment she gave him superuser rights. If she really wanted to
protect her privacy, she'd better think about a more restricted set of
ACLs for Bob before letting him manage her data, even if she considers
herself "safe" after she deleted it, but she's really not safe by any
means. I still stand with the point of upthread to put all that in
contrib/ for now, without discarding that this could be moved
somewhere else in the future.
--
Michael
On Mon, Apr 19, 2021 at 2:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:
There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.
I thought that at first, too. But then I realized that those programs
are oid2name and vacuumlo. And oid2name, at least, seems like
something we ought to just consider removing. It's unclear why this is
something that really deserves a command-line utility rather than just
some additional psql options or something. Does anyone really use it?
vacuumlo isn't that impressive either, since it makes the very tenuous
assumption that an oid column is intended to reference a large object,
and the documentation doesn't even acknowledge what a shaky idea that
actually is. But I suspect it has much better chances of being useful
in practice than oid2name. In fact, I've heard of people using it and,
I think, finding it useful, so we probably don't want to just nuke it.
But the point is, as things stand today, almost everything in contrib
is an extension, not a binary. And we might want to view the
exceptions as loose ends to be cleaned up, rather than a pattern to
emulate.
It's a judgement call, though.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, Apr 20, 2021 at 2:47 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Apr 19, 2021 at 2:55 PM Andrew Dunstan <andrew@dunslane.net> wrote:
There are at least two other client side programs in contrib. So this
argument doesn't quite hold water from a consistency POV.I thought that at first, too. But then I realized that those programs
are oid2name and vacuumlo. And oid2name, at least, seems like
something we ought to just consider removing. It's unclear why this is
something that really deserves a command-line utility rather than just
some additional psql options or something. Does anyone really use it?
Yeah, this seems like it could relatively simply just be a SQL query in psql.
vacuumlo isn't that impressive either, since it makes the very tenuous
assumption that an oid column is intended to reference a large object,
and the documentation doesn't even acknowledge what a shaky idea that
actually is. But I suspect it has much better chances of being useful
in practice than oid2name. In fact, I've heard of people using it and,
I think, finding it useful, so we probably don't want to just nuke it.
Yes, I've definitely run into using vacuumlo many times.
But the point is, as things stand today, almost everything in contrib
is an extension, not a binary. And we might want to view the
exceptions as loose ends to be cleaned up, rather than a pattern to
emulate.
I could certainly sign up for moving vacuumlo to bin/ and replacing
oid2name with something in psql for example.
(But yes, I realize this rapidly turns into another instance of the
bikeshedding about the future of contrib..)
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/