Dumping roles improvements?

Started by Josh Berkusover 14 years ago21 messages
#1Josh Berkus
josh@agliodbs.com

It occurs to me that we could really use two things to make it easier to
move copies of database stuff around:

pg_dump -r, which would include a CREATE ROLE for all roles needed to
restore the database (probably without passwords), and

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements. This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't what to copy the md5s
of passwords for possible cracking.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Josh Berkus (#1)
Re: Dumping roles improvements?

Hello

2011/10/11 Josh Berkus <josh@agliodbs.com>:

It occurs to me that we could really use two things to make it easier to
move copies of database stuff around:

pg_dump -r, which would include a CREATE ROLE for all roles needed to
restore the database (probably without passwords), and

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements.  This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't what to copy the md5s
of passwords for possible cracking.

maybe some switch for "complete" dump of one database with related roles.

Pavel

Show quoted text

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#1)
Re: Dumping roles improvements?

Josh Berkus wrote:

It occurs to me that we could really use two things to make it easier to
move copies of database stuff around:

pg_dump -r, which would include a CREATE ROLE for all roles needed to
restore the database (probably without passwords), and

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements. This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't what to copy the md5s
of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Bruce Momjian (#3)
Re: Dumping roles improvements?

On 10/11/2011 12:40 PM, Bruce Momjian wrote:

Josh Berkus wrote:

It occurs to me that we could really use two things to make it easier to
move copies of database stuff around:

pg_dump -r, which would include a CREATE ROLE for all roles needed to
restore the database (probably without passwords), and

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements. This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't what to copy the md5s
of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?

As stated, it would not export the passwords.

cheers

andrew

#5Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#4)
Re: Dumping roles improvements?

Andrew Dunstan wrote:

On 10/11/2011 12:40 PM, Bruce Momjian wrote:

Josh Berkus wrote:

It occurs to me that we could really use two things to make it easier to
move copies of database stuff around:

pg_dump -r, which would include a CREATE ROLE for all roles needed to
restore the database (probably without passwords), and

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements. This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't what to copy the md5s
of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?

As stated, it would not export the passwords.

What is the logic for not dumping passwords but the CREATE ROLE
statement? I don't see how anyone would recognize that behavior as
logical. If you want to add a --no-passwords option to pg_dumpall, that
seems more logical to me.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#6Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#5)
Re: Dumping roles improvements?

What is the logic for not dumping passwords but the CREATE ROLE
statement? I don't see how anyone would recognize that behavior as
logical. If you want to add a --no-passwords option to pg_dumpall, that
seems more logical to me.

That's what I'm suggesting.

Incidentally, what's the difference between -g and -r in terms of actual
output, anyway?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#7Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#6)
Re: Dumping roles improvements?

Josh Berkus wrote:

What is the logic for not dumping passwords but the CREATE ROLE
statement? I don't see how anyone would recognize that behavior as
logical. If you want to add a --no-passwords option to pg_dumpall, that
seems more logical to me.

That's what I'm suggesting.

Incidentally, what's the difference between -g and -r in terms of actual
output, anyway?

Acording the docs, I assume -r is only roles, while -g includes
tablespace, so what you want is already available in pg_dumpall.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#8Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#7)
Re: Dumping roles improvements?

Acording the docs, I assume -r is only roles, while -g includes
tablespace, so what you want is already available in pg_dumpall.

No, it's not. You don't seem to be actually reading any of my proposals.

(1) I cannot produce a single file in custom dump format which includes
both a single database and all of the roles I need to build that database.

(2) I cannot dump a set of roles without md5 passwords.

Both of these are things I need to support dev/stage/testing integration
at multiple sites.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: Dumping roles improvements?

Andrew Dunstan <andrew@dunslane.net> writes:

On 10/11/2011 12:40 PM, Bruce Momjian wrote:

Josh Berkus wrote:

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements. This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't what to copy the md5s
of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?

As stated, it would not export the passwords.

I can see some possible use in a --no-passwords option that's orthogonal
to all else. The rest of this seems rather confused...

regards, tom lane

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#9)
Re: Dumping roles improvements?

On 10/11/2011 04:19 PM, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 10/11/2011 12:40 PM, Bruce Momjian wrote:

Josh Berkus wrote:

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements. This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't what to copy the md5s
of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?

As stated, it would not export the passwords.

I can see some possible use in a --no-passwords option that's orthogonal
to all else. The rest of this seems rather confused...

Yes, I also think that will meet the case.

cheers

andrew

#11Bruce Momjian
bruce@momjian.us
In reply to: Josh Berkus (#8)
Re: Dumping roles improvements?

Josh Berkus wrote:

Acording the docs, I assume -r is only roles, while -g includes
tablespace, so what you want is already available in pg_dumpall.

No, it's not. You don't seem to be actually reading any of my proposals.

(1) I cannot produce a single file in custom dump format which includes
both a single database and all of the roles I need to build that database.

(2) I cannot dump a set of roles without md5 passwords.

Both of these are things I need to support dev/stage/testing integration
at multiple sites.

We are not writing this software for you. Please submit a clear
proposal. I am sure you have 10k customers who want this. :-|

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#12Andrew Dunstan
andrew@dunslane.net
In reply to: Josh Berkus (#8)
Re: Dumping roles improvements?

On 10/11/2011 03:50 PM, Josh Berkus wrote:

Acording the docs, I assume -r is only roles, while -g includes
tablespace, so what you want is already available in pg_dumpall.

No, it's not. You don't seem to be actually reading any of my proposals.

(1) I cannot produce a single file in custom dump format which includes
both a single database and all of the roles I need to build that database.

(2) I cannot dump a set of roles without md5 passwords.

Both of these are things I need to support dev/stage/testing integration
at multiple sites.

There seems to be agreement on something for (2), and it won't be hard.
(1) would probably be much more complex. Essentially we'd need to add a
new object type for roles, I think. But won't (2) give you most of what
you need for (1) anyway? AIUI, your problem is that the roles might not
exist, and so some or all of the dump will fail. But if you have (2) and
dump the roles without passwords and restore them before running
pg_restore that wouldn't happen. It won't be one command but it will be
two or three pretty easy commands.

cheers

andrew

#13Josh Berkus
josh@agliodbs.com
In reply to: Andrew Dunstan (#12)
Re: Dumping roles improvements?

There seems to be agreement on something for (2), and it won't be hard.
(1) would probably be much more complex. Essentially we'd need to add a
new object type for roles, I think. But won't (2) give you most of what
you need for (1) anyway? AIUI, your problem is that the roles might not
exist, and so some or all of the dump will fail. But if you have (2) and
dump the roles without passwords and restore them before running
pg_restore that wouldn't happen. It won't be one command but it will be
two or three pretty easy commands.

These serve two different purposes.

The reason I want to have the dependant roles created as part of a
database dump is so that we can ship around dump files as a single file,
and restore them with a single command. This is considerably simpler
than the current requirements, which are:

1. pg_dumpall the roles
2. pg_dump the database
3. tar both files
4. ship file
5. untar both files
6. psql the role file
7. pg_restore the database file

Since the above is something I'm doing on around 11 different machines
between once a day and once a week, eliminating the 4 extra steps would
be really nice.

However, we'd also need CREATE OR REPLACE ROLE to really integrate
shipping database copies. Without that, including roles in the database
dump doesn't help that much.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#13)
Re: Dumping roles improvements?

Josh Berkus <josh@agliodbs.com> writes:

The reason I want to have the dependant roles created as part of a
database dump is so that we can ship around dump files as a single file,
and restore them with a single command. This is considerably simpler
than the current requirements, which are:

1. pg_dumpall the roles
2. pg_dump the database
3. tar both files
4. ship file
5. untar both files
6. psql the role file
7. pg_restore the database file

I don't find this terribly convincing. I can see the rationales for two
endpoint cases: (1) restore these objects into exactly the same
ownership/permissions environment that existed before, and (2) restore
these objects with the absolute minimum of ownership/permissions
assumptions. The latter case seems to me to be covered already by
--no-owner --no-privileges. Cases in between those endpoints seem
pretty special-purpose, and I don't want to buy into the assumption that
we should fix them by creating a plethora of --do-it-joshs-way switches.
Can we invent something comparable to the --list/--use-list mechanism,
that can handle a range of use cases with a bit more manual effort?

regards, tom lane

#15Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#14)
Re: Dumping roles improvements?

On 10/11/11 9:43 PM, Tom Lane wrote:

I don't find this terribly convincing. I can see the rationales for two
endpoint cases: (1) restore these objects into exactly the same
ownership/permissions environment that existed before, and (2) restore
these objects with the absolute minimum of ownership/permissions
assumptions. The latter case seems to me to be covered already by
--no-owner --no-privileges.

But what I'm asking for is (1). The problem is that the roles don't
ship in the per-database pgdump file.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#16Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#14)
Re: Dumping roles improvements?

On 10/12/2011 12:43 AM, Tom Lane wrote:

Josh Berkus<josh@agliodbs.com> writes:

The reason I want to have the dependant roles created as part of a
database dump is so that we can ship around dump files as a single file,
and restore them with a single command. This is considerably simpler
than the current requirements, which are:
1. pg_dumpall the roles
2. pg_dump the database
3. tar both files
4. ship file
5. untar both files
6. psql the role file
7. pg_restore the database file

I don't find this terribly convincing. I can see the rationales for two
endpoint cases: (1) restore these objects into exactly the same
ownership/permissions environment that existed before, and (2) restore
these objects with the absolute minimum of ownership/permissions
assumptions. The latter case seems to me to be covered already by
--no-owner --no-privileges. Cases in between those endpoints seem
pretty special-purpose, and I don't want to buy into the assumption that
we should fix them by creating a plethora of --do-it-joshs-way switches.
Can we invent something comparable to the --list/--use-list mechanism,
that can handle a range of use cases with a bit more manual effort?

Not easily, that I can think of. The cleanest way I can imagine would be
to have explicit ROLE objects in the TOC. TWe can easily get a list of
object owners and turn that into a set of "create role" statements,
because owner names are in the metadata, but getting a list of roles
mentioned in ACL items can only be done by textually analysing them -
the information just isn't kept anywhere else currently.

I do think there's a case for doing "create if not exists role foo" (I
know we don't have that right now) for owners and roles mentioned in
ACLs. The hair in the ointment here comes when we consider how far to go
with that. In particular, would we follow role membership recursively?

OTOH, notwithstanding Josh's reasonable need, I'm not sure the ROI here
is high enough.

cheers

andrew

#17Andrew Dunstan
andrew@dunslane.net
In reply to: Josh Berkus (#15)
Re: Dumping roles improvements?

On 10/12/2011 03:16 PM, Josh Berkus wrote:

On 10/11/11 9:43 PM, Tom Lane wrote:

I don't find this terribly convincing. I can see the rationales for two
endpoint cases: (1) restore these objects into exactly the same
ownership/permissions environment that existed before, and (2) restore
these objects with the absolute minimum of ownership/permissions
assumptions. The latter case seems to me to be covered already by
--no-owner --no-privileges.

But what I'm asking for is (1). The problem is that the roles don't
ship in the per-database pgdump file.

I think Tom's (1) assumes you already have that environment, not that it
will be created on the fly by pg_restore.

cheers

andrew

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#15)
Re: Dumping roles improvements?

Josh Berkus <josh@agliodbs.com> writes:

On 10/11/11 9:43 PM, Tom Lane wrote:

I don't find this terribly convincing. I can see the rationales for two
endpoint cases: (1) restore these objects into exactly the same
ownership/permissions environment that existed before, and (2) restore
these objects with the absolute minimum of ownership/permissions
assumptions. The latter case seems to me to be covered already by
--no-owner --no-privileges.

But what I'm asking for is (1). The problem is that the roles don't
ship in the per-database pgdump file.

In that case you do "pg_dumpall -r" first and then pg_dump your
individual database. I thought you were looking for something that
would dump only roles referenced in the particular database, which
is why it sounded like an intermediate case.

I know that the division of labor between pg_dumpall and pg_dump could
use rethinking, but it needs to be actually rethought, in toto, not
hacked one minor feature at a time.

regards, tom lane

#19Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#18)
Re: Dumping roles improvements?

In that case you do "pg_dumpall -r" first and then pg_dump your
individual database. I thought you were looking for something that
would dump only roles referenced in the particular database, which
is why it sounded like an intermediate case.

I know that the division of labor between pg_dumpall and pg_dump could
use rethinking, but it needs to be actually rethought, in toto, not
hacked one minor feature at a time.

Sure. Maybe I should start a wiki page for that?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#20Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Josh Berkus (#8)
Re: Dumping roles improvements?

Josh Berkus <josh@agliodbs.com> writes:

(1) I cannot produce a single file in custom dump format which includes
both a single database and all of the roles I need to build that database.

I would see addressing this with the proposal to have pg_dumpall able to
issue an archive of -Fc dumps, that pg_restore would know how to handle.
Then we could have some option to list or exclude databases you want in
this pg_dumpall -Fc format.

It's been said that extending custom format to handle several databases
would be complex, but I don't think it's necessary. A simple archive,
tar formatted for example, containing the custom format file of all
selected databases, would be user friendly enough. All the more if you
add a -j option to control how many databases you dump at the same time.

The main drawback is that you need to prepare a directory with the
globals.sql file and a custom format file per database, and only when
all of those are finished can you pack them into the tar format.

Again, good enough for me.

(2) I cannot dump a set of roles without md5 passwords.

Both of these are things I need to support dev/stage/testing integration
at multiple sites.

I don't handle that in pg_staging, but the other features you need are
all implemented into that tool. It knows how to fetch the per cluster
settings then init your staging cluster with that and only then restore
your database (with a pgbouncer layer in between so that you can have
more than one available in parallel and an easy switch from one to the
other, like db -> db_20111013 rather than db_20111012).

https://github.com/dimitri/pg_staging

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#21Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Bruce Momjian (#11)
Re: Dumping roles improvements?

Bruce Momjian <bruce@momjian.us> writes:

We are not writing this software for you. Please submit a clear
proposal. I am sure you have 10k customers who want this. :-|

I think I did (write this software).

https://github.com/dimitri/pg_staging
http://tapoueh.org/blog/2011/03/29-towards-pg_staging-10.html

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support