pg_execute_from_file, patch v10

Started by Dimitri Fontaine over 15 years ago (27 messages, hackers)
#1Dimitri Fontaine
dimitri@2ndQuadrant.fr

Hi,

The other infrastructure patch that was marked ready for commit and then
commented on further by Tom is $subject, even though the function provided
has been renamed to pg_execute_sql_file().

Please find attached the newer version, which addresses Tom's concerns by
removing the VARIADIC forms of the functions (that placeholders idea).

The git tree already contains fixed extension code, but the next patch
for that one will have to wait some more (psql refactoring).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

Attachments:

pg_execute_from_file.v10.patch (text/x-patch, +416 -197)
#2Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Dimitri Fontaine (#1)
Re: pg_execute_from_file, patch v10

On Sun, Dec 12, 2010 at 06:08, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

The other infrastructure patch that was marked ready for commit and then
commented on further by Tom is $subject, even though the function provided
has been renamed to pg_execute_sql_file().

Please find attached the newer version, which addresses Tom's concerns by
removing the VARIADIC forms of the functions (that placeholders idea).

I think the version is almost OK, but I have a couple of comments:
- Why do you need directory_fctx in genfile.h?
- It might be reasonable to have 3-argument and 1-argument versions of
  pg_read_file, i.e., (path, offset, size) and (path). The two-argument
  version (path, offset) doesn't seem so useful. In addition, CREATE
  EXTENSION will always call it with offset=0, no?
- We no longer need some of the added #include "utils/array.h" lines.

--
Itagaki Takahiro

#3Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Itagaki Takahiro (#2)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

I think the version is almost OK, but I have a couple of comments:
- Why do you need directory_fctx in genfile.h ?

I then use it in extension.c, this way:

typedef struct extension_fctx
{
    directory_fctx  dir;
    ExtensionList  *installed;
} extension_fctx;

- It might be reasonable to have 3-argument and 1-argument versions of
  pg_read_file, i.e., (path, offset, size) and (path). The two-argument
  version (path, offset) doesn't seem so useful. In addition, CREATE
  EXTENSION will always call it with offset=0, no?

Depending on the 'relocatable' property, we now do either of those calls:

execute_sql_file(get_extension_absolute_path(control->script),
                 pg_encoding_to_char(encoding));

read_text_file_with_endoding(filename,
                             pg_encoding_to_char(encoding));

So we're using the internal forms only here, and we can propose whatever
API we find best. Reading through the end of the file seems common
enough, but I agree I would prefer reading the whole file here if I had
to pick only one.

- We no longer need some of the added #include "utils/array.h" lines.

Ah yes, true.

Do you want another patch version from me?
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#4Robert Haas
robertmhaas@gmail.com
In reply to: Dimitri Fontaine (#3)
Re: pg_execute_from_file, patch v10

On Mon, Dec 13, 2010 at 9:36 AM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:

Do you want another patch version from me?

I'm looking at this patch and I'm confused. Why do we need this at
all? pg_read_binary_file() seems like it might be useful to somebody,
but I don't see what it has to do with extensions. And the rest of
this doesn't appear to provide any new functionality. The extension
mechanism hardly needs SQL-callable functions.

As a side note, this comment almost makes sense, but not quite:

+ /* Abuse knowledge that we're bytea and text are both varlena */

...but my real question is why any of this is necessary at all and
what it has to do with extensions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#4)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 10:53, Robert Haas <robertmhaas@gmail.com> wrote:

I'm looking at this patch and I'm confused.  Why do we need this at
all?  pg_read_binary_file() seems like it might be useful to somebody,
but I don't see what it has to do with extensions.  And the rest of
this doesn't appear to provide any new functionality.  The extension
mechanism hardly needs SQL-callable functions.

Hmm, I had expected that the EXTENSION patch would use the SQL functions,
as in SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but
it actually uses internal functions and nested DirectFunctionCalls.
So, the most important part of this patch is allowing any file in the
server file system to be read. The current pg_read_file() allows
reading only files under $PGDATA and pg_log.

However, the interface of the current pg_read_file() is badly suited
to reading files in multi-byte encodings because
1. The file encoding must be the same as the server encoding.
2. Users need to specify a correct offset into the file
   so as not to split multi-byte characters.
So, it'd be better to improve pg_read_file() aside from EXTENSION anyway.
I think pg_read_whole_binary_file() is one of the solutions for the issue.
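
To make the offset problem concrete, here is a minimal sketch in plain C, assuming the file is UTF-8. The helper name utf8_offset_is_boundary() is hypothetical and is not part of the patch or of PostgreSQL; it only shows why an arbitrary byte offset can land inside a character.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if byte offset 'off' in a UTF-8 buffer falls on a character
 * boundary, i.e. it does not point into the middle of a multi-byte sequence.
 * UTF-8 continuation bytes always have the bit pattern 10xxxxxx.
 *
 * Assumption: the buffer holds valid UTF-8. Other multi-byte encodings
 * need encoding-specific logic, which is what makes a byte offset a poor
 * interface for text.
 */
bool
utf8_offset_is_boundary(const unsigned char *buf, size_t len, size_t off)
{
    if (off > len)
        return false;           /* past the end of the buffer */
    if (off == len)
        return true;            /* end of buffer is always a boundary */
    return (buf[off] & 0xC0) != 0x80;
}
```

For the four bytes 'a', 0xC3, 0xA9, 'b' (the text "aéb"), offset 2 points at the second byte of "é", so a read starting there would begin with an invalid byte sequence. Reading the whole file as bytes and converting it afterwards sidesteps the question.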

--
Itagaki Takahiro

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Itagaki Takahiro (#5)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

On Tue, Dec 14, 2010 at 10:53, Robert Haas <robertmhaas@gmail.com> wrote:

I'm looking at this patch and I'm confused.  Why do we need this at
all?  pg_read_binary_file() seems like it might be useful to somebody,
but I don't see what it has to do with extensions.  And the rest of
this doesn't appear to provide any new functionality.  The extension
mechanism hardly needs SQL-callable functions.

Hmm, I had expected that the EXTENSION patch would use the SQL functions,
as in SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but
it actually uses internal functions and nested DirectFunctionCalls.
So, the most important part of this patch is allowing any file in the
server file system to be read. The current pg_read_file() allows
reading only files under $PGDATA and pg_log.

Has anyone thought twice about the security implications of that?
Not to mention that in most cases, the very last thing we want is to
have to specify an exact full path?

I think we'd be better off insisting that the extension files be under
sharedir or some such place.

In any case, I concur with what I gather Robert is thinking, which is
that there is no good reason to be exposing any of this at the SQL level.

regards, tom lane

#7Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#5)
Re: pg_execute_from_file, patch v10

On Mon, Dec 13, 2010 at 9:41 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

Hmm, I had expected that the EXTENSION patch would use the SQL functions,
as in SPI_exec("SELECT pg_execute_sql(pg_read_file($1))", ...), but
it actually uses internal functions and nested DirectFunctionCalls.
So, the most important part of this patch is allowing any file in the
server file system to be read. The current pg_read_file() allows
reading only files under $PGDATA and pg_log.

As Tom says, this is clearly not going to fly on security grounds.

However, the interface of the current pg_read_file() is badly suited
to reading files in multi-byte encodings because
 1. The file encoding must be the same as the server encoding.
 2. Users need to specify a correct offset into the file
    so as not to split multi-byte characters.
So, it'd be better to improve pg_read_file() aside from EXTENSION anyway.
I think pg_read_whole_binary_file() is one of the solutions for the issue.

I don't have any problem with a separate patch to try to improve some
of these issues, but this is supposedly part of the extensions work,
yet (1) most of what's here has little to do with extensions and (2)
extensions don't need this stuff exposed at the SQL level anyway. I'm
inclined to mark this patch as Returned with Feedback. The portions
of this patch that are trying to fix multi-byte encoding issues can be
submitted as a separate patch that just does that. Whatever material
here is relevant to extensions can either be resubmitted with all of
these other things taken out, or just get absorbed into the main
extensions patch if (as I suspect) there isn't enough there to warrant
a separate patch.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#7)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 12:02, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Dec 13, 2010 at 9:41 PM, Itagaki Takahiro

So, the most important part of this patch is allowing any file in the
server file system to be read. The current pg_read_file() allows
reading only files under $PGDATA and pg_log.

As Tom says, this is clearly not going to fly on security grounds.

If it's a security hole, lo_import() should also be a hole,
because we can use lo_import() and SELECT * FROM pg_largeobject
for the same purpose...

I don't have any problem with a separate patch to try to improve some
of these issues, but this is supposedly part of the extensions work,
yet (1) most of what's here has little to do with extensions and (2)
extensions don't need this stuff exposed at the SQL level anyway. I'm
inclined to mark this patch as Returned with Feedback.

If so, I'm not sure why we needed to split the EXTENSION patch into sub-pieces.
In my understanding, we did it because the sub-pieces are also useful
standalone. The requirements for the pieces were changed and extended in
discussions, but I hope that change will not be the reason to reject the patch.

--
Itagaki Takahiro

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Itagaki Takahiro (#8)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

On Tue, Dec 14, 2010 at 12:02, Robert Haas <robertmhaas@gmail.com> wrote:

As Tom says, this is clearly not going to fly on security grounds.

If it's a security hole, lo_import() should also be a hole,
because we can use lo_import() and SELECT * FROM pg_largeobject
for the same purpose...

lo_import is superuser-only. If we design this feature so that it will
forever have to be superuser-only, to get a behavior that I think we
don't even *want*, I believe we're making a serious error.

regards, tom lane

#10Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#8)
Re: pg_execute_from_file, patch v10

On Mon, Dec 13, 2010 at 10:21 PM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

I don't have any problem with a separate patch to try to improve some
of these issues, but this is supposedly part of the extensions work,
yet (1) most of what's here has little to do with extensions and (2)
extensions don't need this stuff exposed at the SQL level anyway.  I'm
inclined to mark this patch as Returned with Feedback.

If so, I'm not sure why we needed to split the EXTENSION patch into sub-pieces.
In my understanding, we did it because the sub-pieces are also useful
standalone. The requirements for the pieces were changed and extended in
discussions, but I hope that change will not be the reason to reject the patch.

Well, I think it is best when a patch has just one purpose. This
seems to be sort of an odd hodge-podge of things.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#11Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Tom Lane (#9)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 12:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:

lo_import is superuser-only.  If we design this feature so that it will
forever have to be superuser-only, to get a behavior that I think we
don't even *want*, I believe we're making a serious error.

CREATE EXTENSION and pg_read_file() are also superuser-only, no?

--
Itagaki Takahiro

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Itagaki Takahiro (#11)
Re: pg_execute_from_file, patch v10

Itagaki Takahiro <itagaki.takahiro@gmail.com> writes:

On Tue, Dec 14, 2010 at 12:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:

lo_import is superuser-only.  If we design this feature so that it will
forever have to be superuser-only, to get a behavior that I think we
don't even *want*, I believe we're making a serious error.

CREATE EXTENSION and pg_read_file() are also superuser-only, no?

CREATE EXTENSION will be superuser to start with, no doubt, but I think
we'll someday want to allow it to database owners, just as happened with
CREATE LANGUAGE. Let's not build it on top of operations that
inherently involve security problems, especially when there's no need
to.

regards, tom lane

#13Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#6)
Re: pg_execute_from_file, patch v10

Tom Lane <tgl@sss.pgh.pa.us> writes:

Has anyone thought twice about the security implications of that?
Not to mention that in most cases, the very last thing we want is to
have to specify an exact full path?

Well, the security is left the same as before: superuser only. And Itagaki
showed that superusers are already allowed to read any file anywhere, so
we didn't change anything here.

I think we'd be better off insisting that the extension files be under
sharedir or some such place.

That's the case, but the rework of genfile.c is more general than just
support for extension, or I wouldn't have been asked for a separate
patch, would I?

In any case, I concur with what I gather Robert is thinking, which is
that there is no good reason to be exposing any of this at the SQL level.

That used to be done this way, you know, in versions 0 through 6 of
the patch. Starting with version 7, the underlying facilities have been
split and exposed, because of the file encoding and server encoding
issues reported by Itagaki.

I propose that more than 2 of you guys get in agreement on what the good
specs are and wake me up after that so that I spawn the right version of
the patch, and if necessary, revise it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#14Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#12)
Re: pg_execute_from_file, patch v10

Tom Lane <tgl@sss.pgh.pa.us> writes:

CREATE EXTENSION will be superuser to start with, no doubt, but I think
we'll someday want to allow it to database owners, just as happened with
CREATE LANGUAGE. Let's not build it on top of operations that
inherently involve security problems, especially when there's no need
to.

That boils down to moving the superuser() test into the right functions;
it's now in the innermost facility that reads files. If you have something
precise enough for me to work on, please say so, but I guess you'd spend
less time making the copy/paste in the code rather than in the mail.
That schedule optimisation is for you to make, though.
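
As a rough illustration of what moving the test outward could mean, the sketch below keeps the innermost reader free of any permission check and puts the check in the caller-facing wrapper. This is plain C, not backend code: read_whole_file_internal() and read_file_checked() are hypothetical names, and the boolean parameter stands in for a real superuser() call.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Innermost facility: reads a whole file, no permission check.
 * Trusted C callers (hypothetically, the extension code) would call this
 * directly. Returns a malloc'd, NUL-terminated buffer, or NULL on error.
 */
char *
read_whole_file_internal(const char *path, size_t *len)
{
    FILE   *fp = fopen(path, "rb");
    long    size;
    char   *buf;

    if (fp == NULL)
        return NULL;

    if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 0)
    {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    buf = malloc((size_t) size + 1);
    if (buf == NULL || fread(buf, 1, (size_t) size, fp) != (size_t) size)
    {
        free(buf);
        fclose(fp);
        return NULL;
    }
    fclose(fp);

    buf[size] = '\0';
    *len = (size_t) size;
    return buf;
}

/*
 * User-facing wrapper: the permission test lives here, so relaxing it
 * later only touches this function.
 */
char *
read_file_checked(bool caller_is_superuser, const char *path, size_t *len)
{
    if (!caller_is_superuser)
        return NULL;            /* stand-in for raising a permission error */
    return read_whole_file_internal(path, len);
}
```

With this split, the decision about who may call the SQL-level function is independent of how the file is read.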

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#15Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Robert Haas (#10)
Re: pg_execute_from_file, patch v10

Robert Haas <robertmhaas@gmail.com> writes:

Well, I think it is best when a patch has just one purpose. This
seems to be sort of an odd hodge-podge of things.

The purpose here is to clean up the existing pg_read_file() facility so
that it's easy to build pg_execute_sql_file() on top of it.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#16Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Dimitri Fontaine (#13)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 18:01, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

In any case, I concur with what I gather Robert is thinking, which is
that there is no good reason to be exposing any of this at the SQL level.

That used to be done this way, you know, in versions 0 through 6 of
the patch. Starting with version 7, the underlying facilities have been
split and exposed, because of the file encoding and server encoding
issues reported by Itagaki.

I'm confused about which part of the patch is the point of the discussion.
1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string/file()

As I pointed out, 1 is reasonable as long as we restrict its usage
to superusers. If we think it is a security hole, the same issue
exists in lo_import() and COPY FROM by superuser.

2 is a *fix* for the badly-designed pg_read_file() interface.
It should have returned bytea rather than text.

3 could simplify later EXTENSION patches, but it might not be
a large help because we can just use SPI_exec() instead
if we write the code in C. I think the most useful part of the
patch is reading a whole file with encoding, i.e., 1 and 2.

--
Itagaki Takahiro

#17Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#16)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 11:48 AM, Itagaki Takahiro
<itagaki.takahiro@gmail.com> wrote:

I'm confused about which part of the patch is the point of the discussion.
 1. Relax pg_read_file() to be able to read any files.
 2. pg_read_binary_file()
 3. pg_execute_sql_string/file()

As I pointed out, 1 is reasonable as long as we restrict its usage
to superusers. If we think it is a security hole, the same issue
exists in lo_import() and COPY FROM by superuser.

2 is a *fix* for the badly-designed pg_read_file() interface.
It should have returned bytea rather than text.

3 could simplify later EXTENSION patches, but it might not be
a large help because we can just use SPI_exec() instead
if we write the code in C. I think the most useful part of the
patch is reading a whole file with encoding, i.e., 1 and 2.

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already. We have no similar check for COPY, and both are
superuser-only. I also see that this is useful for the extensions
work, if that code wants to internally DirectFunctionCall to
pg_read_file.

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

I don't see why we need #3.

I think #4 is useful. I am not clear whether it is needed for the
extension stuff or not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#17)
Re: pg_execute_from_file, patch v10

Robert Haas <robertmhaas@gmail.com> writes:

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already.

That function would never have been accepted into core at all without a
locked-down range of how much of the filesystem it would let you get at.
There is nothing whatsoever in the extensions proposal that justifies
dropping that restriction. If you want to put it up as a separately
proposed, separately justified patch, go ahead ... but I'll vote against
it even then. (I will also point out that on SELinux-based systems,
relaxing the restriction would be completely useless anyway.)

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed. There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

BTW, it appears to me that pg_read_file expects server encoding not
client encoding. Minor detail only, but let's be clear what it is
we're talking about.

regards, tom lane

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
Re: pg_execute_from_file, patch v10

On Tue, Dec 14, 2010 at 1:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already.

That function would never have been accepted into core at all without a
locked-down range of how much of the filesystem it would let you get at.

I have some angst about opening it up wide, but I'm also having a hard
time seeing what problem it creates that you can't already create with
COPY FROM or lo_import().

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed.  There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

BTW, it appears to me that pg_read_file expects server encoding not
client encoding.  Minor detail only, but let's be clear what it is
we're talking about.

OK.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Tom Lane (#18)
Re: pg_execute_from_file, patch v10

Tom Lane <tgl@sss.pgh.pa.us> writes:

Robert Haas <robertmhaas@gmail.com> writes:

So there are really four changes in here, right?

1. Relax pg_read_file() to be able to read any files.
2. pg_read_binary_file()
3. pg_execute_sql_string()/file()
4. ability to read a file in a given encoding (rather than the client encoding)

I think I agree that #1 doesn't open any security hole that doesn't
exist already.

That function would never have been accepted into core at all without a
locked-down range of how much of the filesystem it would let you get at.

Ok. Until now, pg_read_file() has only allowed absolute file names that
point into DataDir or Log_directory. That did not work with the first
versions of the extension patch, but with the current code the check
passes on a development install here: extension.c only gives
genfile.c absolute file names.

Please note that Debian by default puts DataDir in a different
place than the sharepath:

http://packages.debian.org/sid/amd64/postgresql-contrib-9.0/filelist

PGDATA: /var/lib/postgresql/9.1/main
sharepath: /usr/share/postgresql/9.1/contrib
libdir: /usr/lib/postgresql/9.1/lib

So I'm not sure whether it will play nice with such installs, or whether
there's already some genfile.c patching on Debian.
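
The concern can be sketched with a simple prefix test in plain C. The helper path_is_under() is hypothetical and is not the check genfile.c actually performs; the file name hstore.sql is likewise only an example. It merely shows that a script under the Debian sharepath cannot pass a "must be inside PGDATA" rule.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if 'path' names 'dir' itself or something below it.
 *
 * Assumptions: both arguments are absolute, already-canonical paths.
 * A real check would have to deal with ".." components and symlinks
 * before comparing; this sketch does not.
 */
bool
path_is_under(const char *dir, const char *path)
{
    size_t dir_len = strlen(dir);

    /* Ignore trailing slashes on the directory name. */
    while (dir_len > 0 && dir[dir_len - 1] == '/')
        dir_len--;

    if (strncmp(dir, path, dir_len) != 0)
        return false;

    /* Require a component boundary so "/data" does not match "/database". */
    return path[dir_len] == '\0' || path[dir_len] == '/';
}
```

Given the layout above, path_is_under("/var/lib/postgresql/9.1/main", "/usr/share/postgresql/9.1/contrib/hstore.sql") is false, so a reader restricted to DataDir and Log_directory would refuse the extension script unless sharepath were added to the allowed locations.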

I think #2 might be a nice thing to have, but I'm not sure what it has
to do with extensions.

Agreed. There might be some use for #4 in connection with extensions,
but I don't see that #2 is related.

Well, in fact, the extension code uses either execute_sql_file(), or
read_text_file_with_endoding() followed by @extschema@ replacement and
then execute_sql_string(), with all those functions called directly thanks
to #include "utils/genfile.h". There is no DirectFunctionCall'ing, so we
can easily remove the SQL-callable forms.
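
The @extschema@ step sits between reading the script and executing it. A minimal sketch of such a substitution in plain C follows; replace_all() is a hypothetical helper written for illustration and is not the code in the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly malloc'd copy of 'src' with every occurrence of 'token'
 * replaced by 'value'. Returns NULL on allocation failure or if 'token'
 * is empty. The caller frees the result.
 */
char *
replace_all(const char *src, const char *token, const char *value)
{
    size_t      token_len = strlen(token);
    size_t      value_len = strlen(value);
    size_t      count = 0;
    const char *p;
    char       *result;
    char       *out;

    if (token_len == 0)
        return NULL;

    /* First pass: count occurrences so the result can be sized exactly. */
    for (p = src; (p = strstr(p, token)) != NULL; p += token_len)
        count++;

    result = malloc(strlen(src) - count * token_len + count * value_len + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: copy text between occurrences, inserting 'value'. */
    out = result;
    while ((p = strstr(src, token)) != NULL)
    {
        size_t chunk = (size_t) (p - src);

        memcpy(out, src, chunk);
        out += chunk;
        memcpy(out, value, value_len);
        out += value_len;
        src = p + token_len;
    }
    strcpy(out, src);           /* remaining tail, including the NUL */

    return result;
}
```

For example, replace_all("CREATE FUNCTION @extschema@.f() ...", "@extschema@", "myschema") yields "CREATE FUNCTION myschema.f() ...", which is the string that would then be handed to execute_sql_string().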

So what we need is 2, 3 and 4 (because 4 builds on 2).

BTW, it appears to me that pg_read_file expects server encoding not
client encoding. Minor detail only, but let's be clear what it is
we're talking about.

Hence the refactoring in the patch. Ask Itagaki for details about funny
environments using some file encoding that does not exist in the server
yet isn't client_encoding and can't be. I didn't follow the use case in
detail, but he was happy with the current way of doing things and not
with any previous one.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#21Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Dimitri Fontaine (#20)
#22Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#19)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#22)
#24Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#23)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Itagaki Takahiro (#24)
#26Itagaki Takahiro
itagaki.takahiro@gmail.com
In reply to: Robert Haas (#25)
#27Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Itagaki Takahiro (#26)