pg_dump new feature: exporting functions only. Bad or good idea ?
Hey hackers,
I had this idea, that I raised and cherished like my baby to add a switch
in `pg_dump` to allow exporting stored functions (and procedures) only.
However, when I finally got the time to look at it in detail, I found out
there was no way to solve the dependencies in the functions and procedures,
so that the exported file, when re-played could lead to invalid objects.
So, I decided this would not make Postgres better and decide to walk off
this patch.
Anyhow, when sharing my thoughts, several people told me to ask the
community about adding this feature because this could be useful in some
use cases. Another argument is that should you have all your functions in
one schema and your tables in another, exporting only the function schema
will lead to the same kind of file that could lead to invalid objects
created when the file is re-played against a database that does not have
the tables.
Of course, the documentation would add a warning against object invalidity
should only the stored functions/procedures be exported.
So, my question is: what do you think about such a feature? is it worth it?
Have a nice day,
Lætitia
On 3/17/21 6:00 PM, Lætitia Avrot wrote:
Hey hackers,
I had this idea, that I raised and cherished like my baby to add a switch
in `pg_dump` to allow exporting stored functions (and procedures) only.However, when I finally got the time to look at it in detail, I found out
there was no way to solve the dependencies in the functions and procedures,
so that the exported file, when re-played could lead to invalid objects.So, I decided this would not make Postgres better and decide to walk off
this patch.Anyhow, when sharing my thoughts, several people told me to ask the
community about adding this feature because this could be useful in some
use cases. Another argument is that should you have all your functions in
one schema and your tables in another, exporting only the function schema
will lead to the same kind of file that could lead to invalid objects
created when the file is re-played against a database that does not have
the tables.Of course, the documentation would add a warning against object invalidity
should only the stored functions/procedures be exported.So, my question is: what do you think about such a feature? is it worth it?
Yes, it is absolutely worth it to be able to extract just the functions
from a database. I have wanted this several times.
--
Vik Fearing
On Wed, Mar 17, 2021 at 2:00 PM Lætitia Avrot <laetitia.avrot@gmail.com>
wrote:
Hey hackers,
I had this idea, that I raised and cherished like my baby to add a switch
in `pg_dump` to allow exporting stored functions (and procedures) only.
However, when I finally got the time to look at it in detail, I found out
there was no way to solve the dependencies in the functions and procedures,
so that the exported file, when re-played could lead to invalid objects.
So, I decided this would not make Postgres better and decide to walk off
this patch.
Anyhow, when sharing my thoughts, several people told me to ask the
community about adding this feature because this could be useful in some
use cases. Another argument is that should you have all your functions in
one schema and your tables in another, exporting only the function schema
will lead to the same kind of file that could lead to invalid objects
created when the file is re-played against a database that does not have
the tables.
Of course, the documentation would add a warning against object
invalidity should only the stored functions/procedures be exported.
So, my question is: what do you think about such a feature? is it worth
it?
Make total sense since we already have --function=NAME(args) on pg_restore
and it doesn't solve dependencies... so we can add it to also export
function/procedure contents.
+1 for general idea.
--
Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Vik Fearing <vik@postgresfriends.org> writes:
On 3/17/21 6:00 PM, Lætitia Avrot wrote:
However, when I finally got the time to look at it in detail, I found out
there was no way to solve the dependencies in the functions and procedures,
so that the exported file, when re-played could lead to invalid objects.
...
So, my question is: what do you think about such a feature? is it worth it?
Yes, it is absolutely worth it to be able to extract just the functions
from a database. I have wanted this several times.
Selective dumps always have a risk of not being restorable on their
own; I don't see that "functions only" is noticeably less safe than
"just these tables", or other cases that we support already.
What I'm wondering about is how this might interact with the
discussion at [1]/messages/by-id/CAFj8pRB10wvW0CC9Xq=1XDs=zCQxer3cbLcNZa+qiX4cUH-G_A@mail.gmail.com.
regards, tom lane
[1]: /messages/by-id/CAFj8pRB10wvW0CC9Xq=1XDs=zCQxer3cbLcNZa+qiX4cUH-G_A@mail.gmail.com
Hello,
You'll find enclosed the first version of my patch. I did not include the
possibility of using a file to list tables to be exported as Tom suggested
because I genuinely think it is a totally different matter. It does not
mean I'm not open to the possibility, it just felt weird.
The patch allows using a `--functions-only` flag in `pg_dump` to export
only functions and stored procedures. My code was build and passed tests on
the last master branch of the PostgreSQL project. I added regression tests.
Documentation has been updated too and generation of the documentation
(HTML, man page, pdf in A4 and letter US format) has been tested
successfully.
I did not add a warning in the documentation that the file provided might
end up in a not restorable file or in a file restoring broken functions or
procedures. Do you think I should?
I don't know if this patch has any impact on performance. I guess that
adding 4 if statements will slow down `pg_dump` a little bit.
Have a nice day,
Lætitia
Le mer. 17 mars 2021 à 18:16, Tom Lane <tgl@sss.pgh.pa.us> a écrit :
Show quoted text
Vik Fearing <vik@postgresfriends.org> writes:
On 3/17/21 6:00 PM, Lætitia Avrot wrote:
However, when I finally got the time to look at it in detail, I found
out
there was no way to solve the dependencies in the functions and
procedures,
so that the exported file, when re-played could lead to invalid objects.
...
So, my question is: what do you think about such a feature? is it worthit?
Yes, it is absolutely worth it to be able to extract just the functions
from a database. I have wanted this several times.Selective dumps always have a risk of not being restorable on their
own; I don't see that "functions only" is noticeably less safe than
"just these tables", or other cases that we support already.What I'm wondering about is how this might interact with the
discussion at [1].regards, tom lane
[1]
/messages/by-id/CAFj8pRB10wvW0CC9Xq=1XDs=zCQxer3cbLcNZa+qiX4cUH-G_A@mail.gmail.com
Attachments:
pg_dump_functions_only_v0_1.patchapplication/octet-stream; name=pg_dump_functions_only_v0_1.patchDownload+80-8
On 3/27/21 8:22 AM, Lætitia Avrot wrote:
Hello,
You'll find enclosed the first version of my patch. I did not include
the possibility of using a file to list tables to be exported as Tom
suggested because I genuinely think it is a totally different matter.
It does not mean I'm not open to the possibility, it just felt weird.The patch allows using a `--functions-only` flag in `pg_dump` to
export only functions and stored procedures. My code was build and
passed tests on the last master branch of the PostgreSQL project. I
added regression tests. Documentation has been updated too and
generation of the documentation (HTML, man page, pdf in A4 and letter
US format) has been tested successfully.
We can bikeshed the name of the flag at some stage. --procedures-only
might also make sense
I did not add a warning in the documentation that the file provided
might end up in a not restorable file or in a file restoring broken
functions or procedures. Do you think I should?
No, I don't think it's any different from any of the other similar switches.
I don't know if this patch has any impact on performance. I guess that
adding 4 if statements will slow down `pg_dump` a little bit.
Not likely to be noticeable.
Please add this to the next commitfest.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Sat, Mar 27, 2021 at 6:23 AM Lætitia Avrot <laetitia.avrot@gmail.com>
wrote:
Hello,
You'll find enclosed the first version of my patch.
I tested a couple simple use cases. This is great, Thank you!
I did not include the possibility of using a file to list tables to be
exported as Tom suggested because I genuinely think it is a totally
different matter. It does not mean I'm not open to the possibility, it just
felt weird.The patch allows using a `--functions-only` flag in `pg_dump` to export
only functions and stored procedures. My code was build and passed tests on
the last master branch of the PostgreSQL project. I added regression tests.
Documentation has been updated too and generation of the documentation
(HTML, man page, pdf in A4 and letter US format) has been tested
successfully.I did not add a warning in the documentation that the file provided might
end up in a not restorable file or in a file restoring broken functions or
procedures. Do you think I should?
The docs for both the --table and --schema options do warn about this. On
the other hand, --data-only has no such warning. I'd lean towards matching
--data-only for this.
I don't know if this patch has any impact on performance. I guess that
adding 4 if statements will slow down `pg_dump` a little bit.Have a nice day,
Lætitia
Using --functions-only along with --table=<name> does not error out and
warn the user, instead it creates a dump containing only the SET commands.
An error similar to using --functions-only along with --data-only seems
like a good idea.
Cheers,
*Ryan Lambert*
RustProof Labs
Using --functions-only along with --table=<name> does not error out and
warn the user, instead it creates a dump containing only the SET commands.
An error similar to using --functions-only along with --data-only seems
like a good idea.
Thank you for giving my patch a try.
I added the new if statement, so that the program will error should the
`--functions-only` be used alongside the `--table` option.
The patch has been added to the next commifest.
Have a nice day,
Lætitia
Attachments:
pg_dump_functions_only_v0_2.patchapplication/octet-stream; name=pg_dump_functions_only_v0_2.patchDownload+86-8
The following review has been posted through the commitfest application:
make installcheck-world: tested, failed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
Few minor comments :
- The latest patch has some hunk failures
- Regression with master has many failures with/without the patch, it is difficult to tell if the patch is causing any failures.
- This is probably intended behaviour that --functions-only switch is also dumping stored procedures?
- If i create a procedure with
LANGUAGE plpgsql
SECURITY INVOKER
It is not including "SECURITY INVOKER" in the dump. That's probably because INVOKER is default security rights.
Hi,
I took a quick look at the patch today. There was some minor bitrot
requiring a rebase, so I attach the rebased patch as v3.
The separate 0002 part contains some minor fixes - a couple
typos/rewording in the docs (would be good if a native speaker looked at
it, thought), and a slightly reworked chunk of code from pg_dump.c. The
change is more a matter of personal preference than correctness - it
just seems simpler this way, but ymmv. And I disliked that the comment
said "If we have to export only the functions .." but the if condition
was actually the exact opposite of that.
The main question I have is whether this should include procedures. I'd
probably argue procedures should be considered different from functions
(i.e. requiring a separate --procedures-only option), because it pretty
much is meant to be a separate object type. We don't allow calling DROP
FUNCTION on a procedure, etc. It'd be silly to introduce an unnecessary
ambiguity in pg_dump and have to deal with it sometime later.
I wonder if we should allow naming a function to dump, similarly to how
--table works for tables, for example.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
The main question I have is whether this should include procedures.
I feel a bit uncomfortable about sticking this sort of limited-purpose
selectivity mechanism into pg_dump. I'd rather see a general filter
method that can select object(s) of any type. Pavel was doing some
work towards that awhile ago, though I think he got frustrated about
the lack of consensus on details. Which is a problem, but I don't
think the solution is to accrue more and more independently-designed-
and-implemented features that each solve some subset of the big problem.
regards, tom lane
On Fri, Jul 09, 2021 at 07:43:02PM -0400, Tom Lane wrote:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
The main question I have is whether this should include procedures.
I feel a bit uncomfortable about sticking this sort of limited-purpose
selectivity mechanism into pg_dump. I'd rather see a general filter
method that can select object(s) of any type. Pavel was doing some
work towards that awhile ago, though I think he got frustrated about
the lack of consensus on details.
That's this: https://commitfest.postgresql.org/33/2573/
--
Justin
On 7/10/21 1:43 AM, Tom Lane wrote:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
The main question I have is whether this should include procedures.
I feel a bit uncomfortable about sticking this sort of limited-purpose
selectivity mechanism into pg_dump. I'd rather see a general filter
method that can select object(s) of any type. Pavel was doing some
work towards that awhile ago, though I think he got frustrated about
the lack of consensus on details. Which is a problem, but I don't
think the solution is to accrue more and more independently-designed-
and-implemented features that each solve some subset of the big problem.
I'm not against introducing such general filter mechanism, but why
should it block this patch? I'd understand it the patch was adding a lot
of code, but that's not the case - it's tiny. And we already have
multiple filter options (to pick tables, schemas, extensions, ...).
And if there's no consensus on details of Pavel's patch after multiple
commitfests, how likely is it it'll start moving forward?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Jul 10, 2021 at 5:06 AM Tomas Vondra <
tomas.vondra@enterprisedb.com> wrote:
On 7/10/21 1:43 AM, Tom Lane wrote:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
The main question I have is whether this should include procedures.
I feel a bit uncomfortable about sticking this sort of limited-purpose
selectivity mechanism into pg_dump. I'd rather see a general filter
method that can select object(s) of any type. Pavel was doing some
work towards that awhile ago, though I think he got frustrated about
the lack of consensus on details. Which is a problem, but I don't
think the solution is to accrue more and more independently-designed-
and-implemented features that each solve some subset of the big problem.I'm not against introducing such general filter mechanism, but why
should it block this patch? I'd understand it the patch was adding a lot
of code, but that's not the case - it's tiny. And we already have
multiple filter options (to pick tables, schemas, extensions, ...).
And if there's no consensus on details of Pavel's patch after multiple
commitfests, how likely is it it'll start moving forward?
It seems to me that pg_dump already has plenty of limited-purpose options
for selectivity, adding this patch seems to fit in with the norm. I'm in
favor of this patch because it works in the same way the community is
already used to and meets the need. The other patch referenced upstream
appears to be intended to solve a specific problem encountered with very
long commands. It is also introducing a new way of working with pg_dump
via a config file, and honestly I've never wanted that type of feature. Not
saying that wouldn't be useful, but that has not been a pain point for me
and seems overly complex. For the use cases I imagine this patch will help
with, being required to go through a config file seems excessive vs pg_dump
--functions-only.
On Fri, Jul 9, 2021 at 4:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com>
wrote:
The main question I have is whether this should include procedures. I'd
probably argue procedures should be considered different from functions
(i.e. requiring a separate --procedures-only option), because it pretty
much is meant to be a separate object type. We don't allow calling DROP
FUNCTION on a procedure, etc. It'd be silly to introduce an unnecessary
ambiguity in pg_dump and have to deal with it sometime later.
+1
*Ryan Lambert*
RustProof Labs
On Fri, Jul 9, 2021 at 4:43 PM Tomas Vondra <
tomas.vondra@enterprisedb.com> wrote:
The main question I have is whether this should include procedures. I'd
probably argue procedures should be considered different from functions
(i.e. requiring a separate --procedures-only option), because it pretty
much is meant to be a separate object type. We don't allow calling DROP
FUNCTION on a procedure, etc. It'd be silly to introduce an unnecessary
ambiguity in pg_dump and have to deal with it sometime later.
I respectfully disagree. In psql, the `\ef` and `\df` metacommands will
also list procedures, not just functions. So at one point we agreed to
consider for this client that functions were close enough to procedures to
use a simple metacommand to list/display without distinction. Why should it
be different for `pg_dump` ?
Have a nice day,
Lætitia
On 30 Jul 2021, at 12:55, Lætitia Avrot <laetitia.avrot@gmail.com> wrote:
On Fri, Jul 9, 2021 at 4:43 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:
The main question I have is whether this should include procedures. I'd
probably argue procedures should be considered different from functions
(i.e. requiring a separate --procedures-only option), because it pretty
much is meant to be a separate object type. We don't allow calling DROP
FUNCTION on a procedure, etc. It'd be silly to introduce an unnecessary
ambiguity in pg_dump and have to deal with it sometime later.I respectfully disagree. In psql, the `\ef` and `\df` metacommands will also list procedures, not just functions.
I tend to agree that we should include both, while they are clearly different I
don't think it would be helpful in this case to distinguish.
Looking at this thread I think it makes sense to go ahead with this patch. The
filter functionality worked on in another thread is dealing with cherry-picking
certain objects where this is an all-or-nothing switch, so I don't think they
are at odds with each other.
--
Daniel Gustafsson https://vmware.com/
On Tue, Nov 09, 2021 at 03:23:07PM +0100, Daniel Gustafsson wrote:
Looking at this thread I think it makes sense to go ahead with this patch. The
filter functionality worked on in another thread is dealing with cherry-picking
certain objects where this is an all-or-nothing switch, so I don't think they
are at odds with each other.
Including both procedures and functions sounds natural from here. Now
I have a different question, something that has not been discussed in
this thread at all. What about patterns? Switches like --table or
--extension are able to digest a psql-like pattern to decide which
objects to dump. Is there a reason not to have this capability for
this new switch with procedure names? I mean to handle the case
without the function arguments, even if the same name is used by
multiple functions with different arguments.
--
Michael
It looks like this discussion has reached a bit of an impasse with Tom
being against this approach and Michael and Daniel being for it. It
doesn't look like it's going to get committed this commitfest, shall
we move it forward or mark it returned with feedback?
On 24 Mar 2022, at 23:38, Greg Stark <stark@mit.edu> wrote:
It looks like this discussion has reached a bit of an impasse with Tom
being against this approach and Michael and Daniel being for it. It
doesn't look like it's going to get committed this commitfest, shall
we move it forward or mark it returned with feedback?
Lætitia mentioned the other day off-list that she was going to try and update
this patch with the pattern support proposed, so hopefully we will hear from
her shortly on that.
--
Daniel Gustafsson https://vmware.com/
Daniel Gustafsson <daniel@yesql.se> writes:
On 24 Mar 2022, at 23:38, Greg Stark <stark@mit.edu> wrote:
It looks like this discussion has reached a bit of an impasse with Tom
being against this approach and Michael and Daniel being for it. It
doesn't look like it's going to get committed this commitfest, shall
we move it forward or mark it returned with feedback?
Lætitia mentioned the other day off-list that she was going to try and update
this patch with the pattern support proposed, so hopefully we will hear from
her shortly on that.
To clarify: I'm not against having an easy way to dump all-and-only
functions. What concerns me is having such a feature that's designed
in isolation, without a plan for anything else. I'd like to create
some sort of road map for future selective-dumping options, and then
we can make sure that this feature fits into the bigger picture.
Otherwise we're going to end up with an accumulation of warts, with
inconsistent naming and syntax, and who knows what other sources of
confusion.
regards, tom lane