Converting contrib SQL functions to new style
Attached are some draft patches to convert almost all of the
contrib modules' SQL functions to use SQL-standard function bodies.
The point of this is to remove the residual search_path security
hazards that we couldn't fix in commits 7eeb1d986 et al. Since
a SQL-style function body is fully parsed at creation time,
its object references are not subject to capture by the run-time
search path. Possibly there are small performance benefits too,
though I've not tried to measure that.
I've not touched the documentation yet. I suppose that we can
tone down the warnings added by 7eeb1d986 quite a bit, maybe
replacing them with just "be sure to use version x.y or later".
However I think we may still need an assumption that earthdistance
and cube are in the same schema --- any comments on that?
I'd like to propose squeezing these changes into v14, even though
we're past feature freeze. Reason one is that this is less a
new feature than a security fix; reason two is that this provides
some non-artificial test coverage for the SQL-function-body feature.
BTW, there still remain a couple of old-style SQL functions in
contrib/adminpack and contrib/lo. AFAICS those are unconditionally
secure, so I didn't bother with them.
Thoughts?
regards, tom lane
Attachments:
0001-citext-sql-functions.patchtext/x-diff; charset=us-ascii; name=0001-citext-sql-functions.patchDownload+67-1
0002-earthdistance-sql-functions.patchtext/x-diff; charset=us-ascii; name=0002-earthdistance-sql-functions.patchDownload+60-2
0003-pageinspect-sql-functions.patchtext/x-diff; charset=us-ascii; name=0003-pageinspect-sql-functions.patchDownload+70-0
0004-pg_freespacemap-sql-functions.patchtext/x-diff; charset=us-ascii; name=0004-pg_freespacemap-sql-functions.patchDownload+17-2
0005-xml2-sql-functions.patchtext/x-diff; charset=us-ascii; name=0005-xml2-sql-functions.patchDownload+22-2
On Tue, Apr 13, 2021 at 06:26:34PM -0400, Tom Lane wrote:
Attached are some draft patches to convert almost all of the
contrib modules' SQL functions to use SQL-standard function bodies.
The point of this is to remove the residual search_path security
hazards that we couldn't fix in commits 7eeb1d986 et al. Since
a SQL-style function body is fully parsed at creation time,
its object references are not subject to capture by the run-time
search path.
Are there any inexact matches in those function/operator calls? Will that
matter more or less than it does today?
However I think we may still need an assumption that earthdistance
and cube are in the same schema --- any comments on that?
That part doesn't change, indeed.
I'd like to propose squeezing these changes into v14, even though
we're past feature freeze. Reason one is that this is less a
new feature than a security fix; reason two is that this provides
some non-artificial test coverage for the SQL-function-body feature.
Dogfooding like this is good. What about the SQL-language functions that
initdb creates?
Noah Misch <noah@leadboat.com> writes:
On Tue, Apr 13, 2021 at 06:26:34PM -0400, Tom Lane wrote:
Attached are some draft patches to convert almost all of the
contrib modules' SQL functions to use SQL-standard function bodies.
The point of this is to remove the residual search_path security
hazards that we couldn't fix in commits 7eeb1d986 et al. Since
a SQL-style function body is fully parsed at creation time,
its object references are not subject to capture by the run-time
search path.
Are there any inexact matches in those function/operator calls? Will that
matter more or less than it does today?
I can't claim to have looked closely for inexact matches. It should
matter less than today, since there's a hazard only during creation
(with a somewhat-controlled search path) and not during use. But
that doesn't automatically eliminate the issue.
I'd like to propose squeezing these changes into v14, even though
we're past feature freeze. Reason one is that this is less a
new feature than a security fix; reason two is that this provides
some non-artificial test coverage for the SQL-function-body feature.
Dogfooding like this is good. What about the SQL-language functions that
initdb creates?
Hadn't thought about those, but converting them seems like a good idea.
regards, tom lane
On Tue, Apr 13, 2021 at 11:11:13PM -0400, Tom Lane wrote:
Noah Misch <noah@leadboat.com> writes:
On Tue, Apr 13, 2021 at 06:26:34PM -0400, Tom Lane wrote:
Attached are some draft patches to convert almost all of the
contrib modules' SQL functions to use SQL-standard function bodies.
The point of this is to remove the residual search_path security
hazards that we couldn't fix in commits 7eeb1d986 et al. Since
a SQL-style function body is fully parsed at creation time,
its object references are not subject to capture by the run-time
search path.Are there any inexact matches in those function/operator calls? Will that
matter more or less than it does today?I can't claim to have looked closely for inexact matches. It should
matter less than today, since there's a hazard only during creation
(with a somewhat-controlled search path) and not during use. But
that doesn't automatically eliminate the issue.
Once CREATE EXTENSION is over, things are a great deal safer under this
proposal, as you say. I suspect it makes CREATE EXTENSION more hazardous.
Today, typical SQL commands in extension creation scripts don't activate
inexact argument type matching. You were careful to make each script clear
the search_path around commands deviating from that (commit 7eeb1d9). I think
"CREATE FUNCTION plus1dot1(int) RETURNS numeric LANGUAGE SQL RETURN $1 + 1.1;"
in a trusted extension script would constitute a security vulnerability, since
it can lock in the wrong operator.
On Wed, Apr 14, 2021 at 8:58 AM Noah Misch <noah@leadboat.com> wrote:
Once CREATE EXTENSION is over, things are a great deal safer under this
proposal, as you say. I suspect it makes CREATE EXTENSION more hazardous.
Today, typical SQL commands in extension creation scripts don't activate
inexact argument type matching. You were careful to make each script clear
the search_path around commands deviating from that (commit 7eeb1d9). I think
"CREATE FUNCTION plus1dot1(int) RETURNS numeric LANGUAGE SQL RETURN $1 + 1.1;"
in a trusted extension script would constitute a security vulnerability, since
it can lock in the wrong operator.
I don't understand how that can happen, unless we've failed to secure
the search_path. And, if we've failed to secure the search_path, I
think we are in a lot of trouble no matter what else we do.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 14, 2021 at 8:58 AM Noah Misch <noah@leadboat.com> wrote:
Once CREATE EXTENSION is over, things are a great deal safer under this
proposal, as you say. I suspect it makes CREATE EXTENSION more hazardous.
Today, typical SQL commands in extension creation scripts don't activate
inexact argument type matching. You were careful to make each script clear
the search_path around commands deviating from that (commit 7eeb1d9). I think
"CREATE FUNCTION plus1dot1(int) RETURNS numeric LANGUAGE SQL RETURN $1 + 1.1;"
in a trusted extension script would constitute a security vulnerability, since
it can lock in the wrong operator.
I don't understand how that can happen, unless we've failed to secure
the search_path. And, if we've failed to secure the search_path, I
think we are in a lot of trouble no matter what else we do.
The situation of interest is where you are trying to install an extension
into a schema that also contains malicious objects. We've managed to make
most of the commands you might use in an extension script secure against
that situation, and Noah wants to hold SQL-function creation to that same
standard.
My concern in this patch is rendering SQL functions safe against untrusted
search_path at *time of use*, which is really an independent security
concern.
If you're willing to assume there's nothing untrustworthy in your
search_path, then there's no issue and nothing to fix. Unfortunately,
that seems like a rather head-in-the-sand standpoint.
regards, tom lane
On Apr 13, 2021, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
However I think we may still need an assumption that earthdistance
and cube are in the same schema --- any comments on that?
This is probably not worth doing, and we are already past feature freeze, but adding syntax to look up the namespace of an extension might help. The problem seems to be that we can't syntactically refer to the schema of an extension. We have to instead query pg_catalog.pg_extension joined against pg_catalog.pg_namespace and then interpolate the namespace name into strings that get executed, which is ugly.
This syntax is perhaps a non-starter, but conceptually something like:
-CREATE DOMAIN earth AS cube
+CREATE DOMAIN earthdistance::->earth AS cube::->cube
Then we'd perhaps extend RangeVar with an extensionname field and have either a schemaname or an extensionname be looked up in places where we currently lookup schemas, adding a catcache for extensions. (Like I said, probably not worth doing.)
We could get something like this working just inside the CREATE EXTENSION command if we expanded on the @extschema@ idea a bit. At first I thought this idea would suffer race conditions with concurrent modifications of pg_extension or pg_namespace, but it looks like we already have a snapshot when processing the script file, so:
-CREATE DOMAIN earth AS cube
+CREATE DOMAIN @@earthdistance@@::earth AS @@cube@@::cube
or such, with @@foo@@ being parsed out, looked up in pg_extension join pg_namespace, and substituted back in.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Apr 14, 2021 at 10:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The situation of interest is where you are trying to install an extension
into a schema that also contains malicious objects. We've managed to make
most of the commands you might use in an extension script secure against
that situation, and Noah wants to hold SQL-function creation to that same
standard.
Oh, I was forgetting that the creation schema has to be first in your
search path. :-(
Does the idea of allowing the creation schema to be set separately
have any legs? Because it seems like that would help here.
--
Robert Haas
EDB: http://www.enterprisedb.com
Mark Dilger <mark.dilger@enterprisedb.com> writes:
On Apr 13, 2021, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
However I think we may still need an assumption that earthdistance
and cube are in the same schema --- any comments on that?
This is probably not worth doing, and we are already past feature
freeze, but adding syntax to look up the namespace of an extension might
help.
Yeah, that idea was discussed before (perhaps only in private
security-team threads, though). We didn't do anything about it because
at the time there didn't seem to be pressing need, but in the context
of SQL function bodies there's an obvious use-case.
We could get something like this working just inside the CREATE EXTENSION command if we expanded on the @extschema@ idea a bit. At first I thought this idea would suffer race conditions with concurrent modifications of pg_extension or pg_namespace, but it looks like we already have a snapshot when processing the script file, so:
-CREATE DOMAIN earth AS cube +CREATE DOMAIN @@earthdistance@@::earth AS @@cube@@::cube
Right, extending the @extschema@ mechanism is what was discussed,
though I think I'd lean towards something like @extschema:cube@
to denote the schema of a referenced extension "cube".
I'm not sure this is useful enough to break feature freeze for,
but I'm +1 for investigating it for v15.
regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 14, 2021 at 10:49 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The situation of interest is where you are trying to install an extension
into a schema that also contains malicious objects. We've managed to make
most of the commands you might use in an extension script secure against
that situation, and Noah wants to hold SQL-function creation to that same
standard.
Oh, I was forgetting that the creation schema has to be first in your
search path. :-(
Does the idea of allowing the creation schema to be set separately
have any legs? Because it seems like that would help here.
Doesn't help that much, because you still have to reference objects
already created by your own extension, so it's hard to see how the
target schema won't need to be in the path.
[ thinks for awhile ... ]
Could we hack things so that extension scripts are only allowed to
reference objects created (a) by the system, (b) earlier in the
same script, or (c) owned by one of the declared prerequisite
extensions? Seems like that might provide a pretty bulletproof
defense against trojan-horse objects, though I'm not sure how much
of a pain it'd be to implement.
regards, tom lane
On Wed, Apr 14, 2021 at 1:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Doesn't help that much, because you still have to reference objects
already created by your own extension, so it's hard to see how the
target schema won't need to be in the path.
Oh, woops.
Could we hack things so that extension scripts are only allowed to
reference objects created (a) by the system, (b) earlier in the
same script, or (c) owned by one of the declared prerequisite
extensions? Seems like that might provide a pretty bulletproof
defense against trojan-horse objects, though I'm not sure how much
of a pain it'd be to implement.
That doesn't seem like a crazy idea, but the previous idea of having
some magic syntax that means "the schema where extension FOO is" seems
like it might be easier to implement and more generally useful. If we
taught the core system that %!!**&^%?(earthdistance) means "the schema
where the earthdistance is located" that syntax might get some use
even outside of extension creation scripts, which seems like it could
be a good thing, just because code that is used more widely is more
likely to have been debugged to the point where it actually works.
--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 14, 2021 at 1:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Could we hack things so that extension scripts are only allowed to
reference objects created (a) by the system, (b) earlier in the
same script, or (c) owned by one of the declared prerequisite
extensions? Seems like that might provide a pretty bulletproof
defense against trojan-horse objects, though I'm not sure how much
of a pain it'd be to implement.
That doesn't seem like a crazy idea, but the previous idea of having
some magic syntax that means "the schema where extension FOO is" seems
like it might be easier to implement and more generally useful.
I think that's definitely useful, but it's not a fix for the
reference-capture problem unless you care to assume that the other
extension's schema is free of trojan-horse objects. So I'm thinking
that we really ought to pursue both ideas.
This may mean that squeezing these contrib changes into v14 is a lost
cause. We certainly shouldn't try to do what I suggest above for
v14; but without it, these changes are just moving the security
issue to a different place rather than eradicating it completely.
regards, tom lane
On 4/14/21 2:03 PM, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 14, 2021 at 1:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Could we hack things so that extension scripts are only allowed to
reference objects created (a) by the system, (b) earlier in the
same script, or (c) owned by one of the declared prerequisite
extensions? Seems like that might provide a pretty bulletproof
defense against trojan-horse objects, though I'm not sure how much
of a pain it'd be to implement.That doesn't seem like a crazy idea, but the previous idea of having
some magic syntax that means "the schema where extension FOO is" seems
like it might be easier to implement and more generally useful.I think that's definitely useful, but it's not a fix for the
reference-capture problem unless you care to assume that the other
extension's schema is free of trojan-horse objects. So I'm thinking
that we really ought to pursue both ideas.This may mean that squeezing these contrib changes into v14 is a lost
cause. We certainly shouldn't try to do what I suggest above for
v14; but without it, these changes are just moving the security
issue to a different place rather than eradicating it completely.
Is there anything else we should be doing along the eat your own dogfood
line that don't have these security implications?
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
On 4/14/21 2:03 PM, Tom Lane wrote:
This may mean that squeezing these contrib changes into v14 is a lost
cause. We certainly shouldn't try to do what I suggest above for
v14; but without it, these changes are just moving the security
issue to a different place rather than eradicating it completely.
Is there anything else we should be doing along the eat your own dogfood
line that don't have these security implications?
We can still convert the initdb-created SQL functions to new style,
since there's no security threat during initdb. I'll make a patch
for that soon.
regards, tom lane
On 4/14/21 7:36 PM, Tom Lane wrote:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
On Apr 13, 2021, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
However I think we may still need an assumption that earthdistance
and cube are in the same schema --- any comments on that?This is probably not worth doing, and we are already past feature
freeze, but adding syntax to look up the namespace of an extension might
help.Yeah, that idea was discussed before (perhaps only in private
security-team threads, though). We didn't do anything about it because
at the time there didn't seem to be pressing need, but in the context
of SQL function bodies there's an obvious use-case.We could get something like this working just inside the CREATE EXTENSION command if we expanded on the @extschema@ idea a bit. At first I thought this idea would suffer race conditions with concurrent modifications of pg_extension or pg_namespace, but it looks like we already have a snapshot when processing the script file, so:
-CREATE DOMAIN earth AS cube +CREATE DOMAIN @@earthdistance@@::earth AS @@cube@@::cubeRight, extending the @extschema@ mechanism is what was discussed,
though I think I'd lean towards something like @extschema:cube@
to denote the schema of a referenced extension "cube".I'm not sure this is useful enough to break feature freeze for,
but I'm +1 for investigating it for v15.
Just like we have a pseudo "$user" schema, could we have a pseudo
"$extension" catalog? That should avoid changing grammar rules too much.
CREATE TABLE unaccented_words (
word "$extension".citext.citext,
CHECK (word = "$extension".unaccent.unaccent(word)
);
--
Vik Fearing
On Apr 14, 2021, at 2:47 PM, Vik Fearing <vik@postgresfriends.org> wrote:
On 4/14/21 7:36 PM, Tom Lane wrote:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
On Apr 13, 2021, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
However I think we may still need an assumption that earthdistance
and cube are in the same schema --- any comments on that?This is probably not worth doing, and we are already past feature
freeze, but adding syntax to look up the namespace of an extension might
help.Yeah, that idea was discussed before (perhaps only in private
security-team threads, though). We didn't do anything about it because
at the time there didn't seem to be pressing need, but in the context
of SQL function bodies there's an obvious use-case.We could get something like this working just inside the CREATE EXTENSION command if we expanded on the @extschema@ idea a bit. At first I thought this idea would suffer race conditions with concurrent modifications of pg_extension or pg_namespace, but it looks like we already have a snapshot when processing the script file, so:
-CREATE DOMAIN earth AS cube +CREATE DOMAIN @@earthdistance@@::earth AS @@cube@@::cubeRight, extending the @extschema@ mechanism is what was discussed,
though I think I'd lean towards something like @extschema:cube@
to denote the schema of a referenced extension "cube".I'm not sure this is useful enough to break feature freeze for,
but I'm +1 for investigating it for v15.Just like we have a pseudo "$user" schema, could we have a pseudo
"$extension" catalog? That should avoid changing grammar rules too much.CREATE TABLE unaccented_words (
word "$extension".citext.citext,
CHECK (word = "$extension".unaccent.unaccent(word)
);
Having a single variable $extension might help in many cases, but I don't see how to use it to handle the remaining cross-extension references, such as earthdistance needing to reference cube.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 4/15/21 12:18 AM, Mark Dilger wrote:
On Apr 14, 2021, at 2:47 PM, Vik Fearing <vik@postgresfriends.org> wrote:
On 4/14/21 7:36 PM, Tom Lane wrote:
Mark Dilger <mark.dilger@enterprisedb.com> writes:
On Apr 13, 2021, at 3:26 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
However I think we may still need an assumption that earthdistance
and cube are in the same schema --- any comments on that?This is probably not worth doing, and we are already past feature
freeze, but adding syntax to look up the namespace of an extension might
help.Yeah, that idea was discussed before (perhaps only in private
security-team threads, though). We didn't do anything about it because
at the time there didn't seem to be pressing need, but in the context
of SQL function bodies there's an obvious use-case.We could get something like this working just inside the CREATE EXTENSION command if we expanded on the @extschema@ idea a bit. At first I thought this idea would suffer race conditions with concurrent modifications of pg_extension or pg_namespace, but it looks like we already have a snapshot when processing the script file, so:
-CREATE DOMAIN earth AS cube +CREATE DOMAIN @@earthdistance@@::earth AS @@cube@@::cubeRight, extending the @extschema@ mechanism is what was discussed,
though I think I'd lean towards something like @extschema:cube@
to denote the schema of a referenced extension "cube".I'm not sure this is useful enough to break feature freeze for,
but I'm +1 for investigating it for v15.Just like we have a pseudo "$user" schema, could we have a pseudo
"$extension" catalog? That should avoid changing grammar rules too much.CREATE TABLE unaccented_words (
word "$extension".citext.citext,
CHECK (word = "$extension".unaccent.unaccent(word)
);Having a single variable $extension might help in many cases, but I don't see how to use it to handle the remaining cross-extension references, such as earthdistance needing to reference cube.
Sorry, I hadn't realized that was a real example so I made up my own.
Basically my idea is to use the fully qualified catalog.schema.object
syntax where the catalog is a special "$extension" value (meaning we
would have to forbid that as an actual database name) and the schema is
the name of the extension whose schema we want. The object is then just
the object.
CREATE DOMAIN earth AS "$extension".cube.cube
CONSTRAINT not_point check("$extension".cube.cube_is_point(value))
CONSTRAINT not_3d check("$extension".cube.cube_dim(value <= 3)
...;
CREATE FUNCTION earth_box(earth, float8)
RETURNS "$extension".cube.cube
LANGUAGE sql
IMMUTABLE PARALLEL SAFE STRICT
RETURN "$extension".cube.cube_enlarge($1, gc_to_sec($2), 3);
If I had my druthers, we would spell it pg_extension instead of
"$extension" because I hate double-quoting identifiers, but that's just
bikeshedding and has little to do with the concept itself.
--
Vik Fearing
On 2021-Apr-15, Vik Fearing wrote:
CREATE DOMAIN earth AS "$extension".cube.cube
CONSTRAINT not_point check("$extension".cube.cube_is_point(value))
CONSTRAINT not_3d check("$extension".cube.cube_dim(value <= 3)
...;
I find this syntax pretty weird -- here, the ".cube." part of the
identifier is acting as an argument of sorts for the preceding
$extension thingy. This looks very surprising.
Something similar to OPERATOR() syntax may be more palatable:
CREATE DOMAIN earth AS PG_EXTENSION_SCHEMA(cube).cube
CONSTRAINT not_point check(PG_EXTENSION_SCHEMA(cube).cube_is_point(value))
CONSTRAINT not_3d check(PG_EXTENSION_SCHEMA(cube).cube_dim(value <= 3)
...;
Here, the PG_EXTENSION_SCHEMA() construct expands into the schema of the
given extension. This looks more natural to me, since the extension
that acts as argument to PG_EXTENSION_SCHEMA() does look like an
argument.
I don't know if the parser would like this, though.
--
�lvaro Herrera Valdivia, Chile
On Wed, Apr 14, 2021 at 02:03:56PM -0400, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Wed, Apr 14, 2021 at 1:41 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Could we hack things so that extension scripts are only allowed to
reference objects created (a) by the system, (b) earlier in the
same script, or (c) owned by one of the declared prerequisite
extensions? Seems like that might provide a pretty bulletproof
defense against trojan-horse objects, though I'm not sure how much
of a pain it'd be to implement.
Good idea.
That doesn't seem like a crazy idea, but the previous idea of having
some magic syntax that means "the schema where extension FOO is" seems
like it might be easier to implement and more generally useful.I think that's definitely useful, but it's not a fix for the
reference-capture problem unless you care to assume that the other
extension's schema is free of trojan-horse objects.
I could see using that, perhaps in a non-SQL-language function. I agree it
solves different problems.
On 14.04.21 00:26, Tom Lane wrote:
Attached are some draft patches to convert almost all of the
contrib modules' SQL functions to use SQL-standard function bodies.
This first patch is still the patch of record in CF 2021-09, but from
the subsequent discussion, it seems more work is being contemplated.