pg_restore restores privileges differently from psql

Started by Sherrylyn Branchawabout 7 years ago8 messagesgeneral
Jump to latest
#1Sherrylyn Branchaw
sbranchaw@gmail.com

Hi,

I'm running two Postgres 9.6.11 databases on RHEL 6.9.

I'm restoring the schema from one database (prod) to another (dev). There
are users that exist in prod that don't exist in dev. When the restore job
tries to grant privileges to nonexistent users in dev, I would like it to
generate an error, which is safe to ignore, but still correctly grant
privileges to any user that does exist in dev.

That's the behavior I see when I dump to a plain file and restore it
using *psql
-f*, but not the behavior I see when I do a *pg_dump -Fc* followed by
*pg_restore.* *pg_restore *seems to treat all the *GRANT* statements for a
single object as a single statement, and when one errors out, they all
error out, meaning I'm left with no privileges on the object in question.

For instance, when this appears in my plaintext file:
GRANT ALL ON SCHEMA test TO user1;
GRANT USAGE ON SCHEMA test TO user2;
GRANT USAGE ON SCHEMA test TO user3;

and user1 doesn't exist on the target database, user2 and user3 get the
expected privileges when restoring from *psql* but not from *pg_restore*.

Here's a reproducible test case.

CREATE DATABASE prod_db;
CREATE DATABASE dev_db_pg_restore;
CREATE DATABASE dev_db_psql;
CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

-- in prod_db
CREATE SCHEMA test;
GRANT ALL ON SCHEMA test TO prod_user;
GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

pg_dump -Fc prod_db &> prod_dump.bin
pg_dump prod_db &> prod_dump.sql

-- On database instance containing the dev dbs.
DROP ROLE prod_user;

pg_restore prod_dump.bin -d dev_db_pg_restore
psql -d dev_db_psql -f prod_dump.sql

-- In dev_db_psql
SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
Expected result: true
Actual result: true

-- In dev_db_pg_restore
SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
Expected result: true
Actual result: false

The behavior seems to be related to the fact that *pg_restore* reports the
failed command as containing all the semicolon-delimited privilege command,
which get executed separately when restoring from plaintext:

pg_restore: [archiver (db)] could not execute query: ERROR: role
"prod_user" does not exist
Command was: GRANT ALL ON SCHEMA test TO prod_user;
GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

As a workaround, I've created a unprivileged dummy user by this name on the
dev database, but my question is, is this a bug or feature? If a feature,
is the behavior documented? I didn't find any documentation, but that
doesn't mean it doesn't exist.

Thanks,
Sherrylyn

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Sherrylyn Branchaw (#1)
Re: pg_restore restores privileges differently from psql

On 1/14/19 10:15 AM, Sherrylyn Branchaw wrote:

Hi,

I'm running two Postgres 9.6.11 databases on RHEL 6.9.

I'm restoring the schema from one database (prod) to another (dev).
There are users that exist in prod that don't exist in dev. When the
restore job tries to grant privileges to nonexistent users in dev, I
would like it to generate an error, which is safe to ignore, but still
correctly grant privileges to any user that does exist in dev.

That's the behavior I see when I dump to a plain file and restore it
using /psql -f/, but not the behavior I see when I do a /pg_dump
-Fc/ followed by /pg_restore./ /pg_restore /seems to treat all the
/GRANT/ statements for a single object as a single statement, and when
one errors out, they all error out, meaning I'm left with no privileges
on the object in question.

For instance, when this appears in my plaintext file:
GRANT ALL ON SCHEMA test TO user1;
GRANT USAGE ON SCHEMA test TO user2;
GRANT USAGE ON SCHEMA test TO user3;

and user1 doesn't exist on the target database, user2 and user3 get the
expected privileges when restoring from /psql/ but not from /pg_restore/.

Here's a reproducible test case.

CREATE DATABASE prod_db;
CREATE DATABASE dev_db_pg_restore;
CREATE DATABASE dev_db_psql;
CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

-- in prod_db
CREATE SCHEMA test;
GRANT ALL ON SCHEMA test TO prod_user;
GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

pg_dump -Fc prod_db &> prod_dump.bin
pg_dump prod_db &> prod_dump.sql

-- On database instance containing the dev dbs.
DROP ROLE prod_user;

The above needs more information:

1) Are the dev_* databases on a different cluster?

2) If so did you run:

CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

on that cluster first?

Also if so:

In the restores below are you sure you are pointed at the same cluster
in each case?

pg_restore prod_dump.bin -d dev_db_pg_restore
psql -d dev_db_psql -f prod_dump.sqlu

What do you see if you do:

pg_restore -f prod_dump_restore.sql prod_dump.bin

and look in prod_dump_res?tore.sql?

-- In dev_db_psql
SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
Expected result: true
Actual result: true

-- In dev_db_pg_restore
SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
Expected result: true
Actual result: false

The behavior seems to be related to the fact that /pg_restore/ reports
the failed command as containing all the semicolon-delimited privilege
command, which get executed separately when restoring from plaintext:

pg_restore: [archiver (db)] could not execute query: ERROR:  role
"prod_user" does not exist
    Command was: GRANT ALL ON SCHEMA test TO prod_user;
GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

As a workaround, I've created a unprivileged dummy user by this name on
the dev database, but my question is, is this a bug or feature? If a
feature, is the behavior documented? I didn't find any documentation,
but that doesn't mean it doesn't exist.

Thanks,
Sherrylyn

--
Adrian Klaver
adrian.klaver@aklaver.com

#3Sherrylyn Branchaw
sbranchaw@gmail.com
In reply to: Adrian Klaver (#2)
Re: pg_restore restores privileges differently from psql

The above needs more information:

1) Are the dev_* databases on a different cluster?

2) If so did you run:

CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

on that cluster first?

I happened to put them all on the same cluster for my test case, in order
to reproduce the unexpected behavior I encountered in the wild, where the
prod and dev dbs happened to live on different clusters. In short, as long
as you make sure the *prod_user* exists on the source cluster at the time
when the dump is taken, and doesn't exist on the target cluster when the
restore is carried out, you get the behavior I saw.

Also if so:

In the restores below are you sure you are pointed at the same cluster
in each case?

Yes, I am sure. Both for the test case I was creating for the mailing list,
and for the script where I first encountered this in the wild. Worked like
a charm when I used *psql*, didn't do what I expected when I used
*pg_restore*.

What do you see if you do:

pg_restore -f prod_dump_restore.sql prod_dump.bin

and look in prod_dump_res?tore.sql?

This is exactly what I did when I was first trying to figure out what was
going on. I see

GRANT USAGE ON SCHEMA test TO prod_and_dev_user;
GRANT ALL ON SCHEMA test TO prod_user;

If I then use *psql* to load *prod_dump_restore.sql* to a cluster that
doesn't have the *prod_user *role, I get the expected behavior (
*prod_and_dev_user* has usage on the schema *test*), because *psql* treats
each of those statements as a separate command. *pg_restore* seems to treat
them as a single command, judging by the error message and the behavior.

Best,
Sherrylyn

#4Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Sherrylyn Branchaw (#3)
Re: pg_restore restores privileges differently from psql

On 1/14/19 12:04 PM, Sherrylyn Branchaw wrote:

The above needs more information:

1) Are the dev_* databases on a different cluster?

2) If so did you run:

CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

on that cluster first?

I happened to put them all on the same cluster for my test case, in
order to reproduce the unexpected behavior I encountered in the wild,
where the prod and dev dbs happened to live on different clusters. In

I don't see how that can work:

test=# \c prod_db
You are now connected to database "prod_db" as user "postgres".
prod_db=# CREATE SCHEMA test;
CREATE SCHEMA
prod_db=# GRANT ALL ON SCHEMA test TO prod_user;

GRANT

prod_db=# GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

GRANT

prod_db=# \c dev_db_psql
You are now connected to database "dev_db_psql" as user "postgres".

dev_db_psql=# DROP ROLE prod_user;
ERROR: role "prod_user" cannot be dropped because some objects depend
on it

DETAIL: 1 object in database prod_db

short, as long as you make sure the /prod_user/ exists on

the source

cluster at the time when the dump is taken, and doesn't exist on the
target cluster when the restore is carried out, you get the behavior I saw.

Also if so:

In the restores below are you sure you are pointed at the same cluster
in each case?

Yes, I am sure. Both for the test case I was creating for the mailing
list, and for the script where I first encountered this in the wild.
Worked like a charm when I used /psql/, didn't do what I expected when I
used /pg_restore/.

What do you see if you do:

pg_restore -f prod_dump_restore.sql prod_dump.bin

and look in prod_dump_res?tore.sql?

This is exactly what I did when I was first trying to figure out what
was going on. I see

GRANT USAGE ON SCHEMA test TO prod_and_dev_user;
GRANT ALL ON SCHEMA test TO prod_user;

If I then use /psql/ to load /prod_dump_restore.sql/ to a cluster that
doesn't have the /prod_user /role, I get the expected behavior
(/prod_and_dev_user/ has usage on the schema /test/), because /psql/
treats each of those statements as a separate command. /pg_restore/
seems to treat them as a single command, judging by the error message
and the behavior.

Best,
Sherrylyn

--
Adrian Klaver
adrian.klaver@aklaver.com

#5Sherrylyn Branchaw
sbranchaw@gmail.com
In reply to: Adrian Klaver (#4)
Re: pg_restore restores privileges differently from psql

I don't see how that can work:

test=# \c prod_db
You are now connected to database "prod_db" as user "postgres".
prod_db=# CREATE SCHEMA test;
CREATE SCHEMA
prod_db=# GRANT ALL ON SCHEMA test TO prod_user;

GRANT

prod_db=# GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

GRANT

prod_db=# \c dev_db_psql
You are now connected to database "dev_db_psql" as user "postgres".

dev_db_psql=# DROP ROLE prod_user;
ERROR: role "prod_user" cannot be dropped because some objects depend
on it

DETAIL: 1 object in database prod_db
Yes, if you're going to put all your databases on the same cluster, you
first have to remove dependent objects before dropping the role. There are
multiple ways of going about that: dropping the database, revoking the
privileges on the objects in question, etc. If you put the databases on
different clusters and make sure you create only the prod_and_dev_user on
the second cluster, you won't run into this issue.

The goal is to make sure the prod_user role exists when the dump is taken
and doesn't exist when the restore is done. You can do this by putting the
databases on separate clusters and creating the appropriate roles, or by
dropping the user on the single cluster.

Sorry, I considered spelling all this out in the original post, because
there are two different ways of going about making sure the user isn't
present for the restore, but it seemed unnecessarily complicated, and I
thought I would let people decide what makes sense in their own environment
for testing. Sorry if that led to more confusion in the end. Just make sure
the user exists when you need it to exist and doesn't exist when you need
it not to exist, and test both the pg_restore and psql methods, and I
expect you'll see the same behavior I did (and if not, I'll be very
curious).

Best,
Sherrylyn

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sherrylyn Branchaw (#1)
Re: pg_restore restores privileges differently from psql

Sherrylyn Branchaw <sbranchaw@gmail.com> writes:

I'm restoring the schema from one database (prod) to another (dev). There
are users that exist in prod that don't exist in dev. When the restore job
tries to grant privileges to nonexistent users in dev, I would like it to
generate an error, which is safe to ignore, but still correctly grant
privileges to any user that does exist in dev.

That's the behavior I see when I dump to a plain file and restore it
using *psql
-f*, but not the behavior I see when I do a *pg_dump -Fc* followed by
*pg_restore.* *pg_restore *seems to treat all the *GRANT* statements for a
single object as a single statement, and when one errors out, they all
error out, meaning I'm left with no privileges on the object in question.

Yeah, this is a known issue --- the various GRANTs for a specific object
are stored in a single "TOC entry" in the archive, which pg_restore will
send to the server in a single PQexec call, causing them to be effectively
one transaction. The easiest way to deal with it is to not send
pg_restore's output directly to the target server, but feed it through
psql, something like

pg_restore ... | psql [connection parameters]

There's been some discussion of a real fix, but it seems messy.
pg_restore doesn't have a parser that would be adequate to separate
out multiple SQL commands in a TOC entry, and we'd rather not try
to give it one (mainly because of fear of cross-version compatibility
issues).

regards, tom lane

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Sherrylyn Branchaw (#5)
Re: pg_restore restores privileges differently from psql

On 1/14/19 12:57 PM, Sherrylyn Branchaw wrote:

I don't see how that can work:

test=# \c prod_db
You are now connected to database "prod_db" as user "postgres".
prod_db=# CREATE SCHEMA test;
CREATE SCHEMA
prod_db=# GRANT ALL ON SCHEMA test TO prod_user;

GRANT

prod_db=# GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

GRANT

prod_db=# \c dev_db_psql
You are now connected to database "dev_db_psql" as user "postgres".

dev_db_psql=# DROP ROLE prod_user;
ERROR:  role "prod_user" cannot be dropped because some objects depend
on it

DETAIL:  1 object in database prod_db
Yes, if you're going to put all your databases on the same cluster, you
first have to remove dependent objects before dropping the role. There
are multiple ways of going about that: dropping the database, revoking
the privileges on the objects in question, etc. If you put the databases
on different clusters and make sure you create only the
prod_and_dev_user on the second cluster, you won't run into this issue.

The goal is to make sure the prod_user role exists when the dump is
taken and doesn't exist when the restore is done. You can do this by
putting the databases on separate clusters and creating the appropriate
roles, or by dropping the user on the single cluster.

Sorry, I considered spelling all this out in the original post, because
there are two different ways of going about making sure the user isn't

True. The thing is that there are a lot of moving parts to this and the
more detail you give the better the chance that someone trying to
replicate actually does replicate your setup. Removes some of the
uncertainty when the results diverge.

present for the restore, but it seemed unnecessarily complicated, and I
thought I would let people decide what makes sense in their own
environment for testing. Sorry if that led to more confusion in the end.
Just make sure the user exists when you need it to exist and doesn't
exist when you need it not to exist, and test both the pg_restore and
psql methods, and I expect you'll see the same behavior I did (and if
not, I'll be very curious).'

Well if I use two clusters and:

DROP ROLE prod_user;

on the one that has the dev_* databases then I can replicate.

I am interested in this as I have noted another divergence between a
text dump and a binary dump, which as yet remains unanswered:

/messages/by-id/b7a24043-1c9c-1876-f06a-8f916293c142@aklaver.com

Best,
Sherrylyn

--
Adrian Klaver
adrian.klaver@aklaver.com

#8Sherrylyn Branchaw
sbranchaw@gmail.com
In reply to: Tom Lane (#6)
Re: pg_restore restores privileges differently from psql

Yeah, this is a known issue --- the various GRANTs for a specific object
are stored in a single "TOC entry" in the archive, which pg_restore will
send to the server in a single PQexec call, causing them to be effectively
one transaction. The easiest way to deal with it is to not send
pg_restore's output directly to the target server, but feed it through
psql, something like

pg_restore ... | psql [connection parameters]

There's been some discussion of a real fix, but it seems messy.
pg_restore doesn't have a parser that would be adequate to separate
out multiple SQL commands in a TOC entry, and we'd rather not try
to give it one (mainly because of fear of cross-version compatibility
issues).

Okay, thank you. I thought that might be the case: undesirable behavior
where the costs outweigh the benefits of fixing. Given that, would it be
worth making it more obvious in the pg_restore documentation that
pg_restore and its psql output don't always give the same results in the
target database?

Best,
Sherrylyn