Thread: pg_restore restores privileges differently from psql

pg_restore restores privileges differently from psql

From
Sherrylyn Branchaw
Date:
Hi,

I'm running two Postgres 9.6.11 databases on RHEL 6.9.

I'm restoring the schema from one database (prod) to another (dev). There are users that exist in prod that don't exist in dev. When the restore job tries to grant privileges to nonexistent users in dev, I would like it to generate an error, which is safe to ignore, but still correctly grant privileges to any user that does exist in dev.

That's the behavior I see when I dump to a plain file and restore it using psql -f, but not the behavior I see when I do a pg_dump -Fc followed by pg_restore. pg_restore seems to treat all the GRANT statements for a single object as a single statement, and when one errors out, they all error out, meaning I'm left with no privileges on the object in question.

For instance, when this appears in my plaintext file:
GRANT ALL ON SCHEMA test TO user1;
GRANT USAGE ON SCHEMA test TO user2;
GRANT USAGE ON SCHEMA test TO user3;

and user1 doesn't exist on the target database, user2 and user3 get the expected privileges when restoring from psql but not from pg_restore.

Here's a reproducible test case.

CREATE DATABASE prod_db;
CREATE DATABASE dev_db_pg_restore;
CREATE DATABASE dev_db_psql;
CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

-- in prod_db
CREATE SCHEMA test;
GRANT ALL ON SCHEMA test TO prod_user;
GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

pg_dump -Fc prod_db &> prod_dump.bin
pg_dump prod_db &> prod_dump.sql

-- On database instance containing the dev dbs.
DROP ROLE prod_user;

pg_restore prod_dump.bin -d dev_db_pg_restore
psql -d dev_db_psql -f prod_dump.sql

-- In dev_db_psql
SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
Expected result: true
Actual result: true

-- In dev_db_pg_restore
SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
Expected result: true
Actual result: false

The behavior seems to be related to the fact that pg_restore reports the failed command as containing all the semicolon-delimited privilege command, which get executed separately when restoring from plaintext:

pg_restore: [archiver (db)] could not execute query: ERROR:  role "prod_user" does not exist
    Command was: GRANT ALL ON SCHEMA test TO prod_user;
GRANT USAGE ON SCHEMA test TO prod_and_dev_user;

As a workaround, I've created a unprivileged dummy user by this name on the dev database, but my question is, is this a bug or feature? If a feature, is the behavior documented? I didn't find any documentation, but that doesn't mean it doesn't exist.

Thanks,
Sherrylyn

Re: pg_restore restores privileges differently from psql

From
Adrian Klaver
Date:
On 1/14/19 10:15 AM, Sherrylyn Branchaw wrote:
> Hi,
> 
> I'm running two Postgres 9.6.11 databases on RHEL 6.9.
> 
> I'm restoring the schema from one database (prod) to another (dev). 
> There are users that exist in prod that don't exist in dev. When the 
> restore job tries to grant privileges to nonexistent users in dev, I 
> would like it to generate an error, which is safe to ignore, but still 
> correctly grant privileges to any user that does exist in dev.
> 
> That's the behavior I see when I dump to a plain file and restore it 
> using /psql -f/, but not the behavior I see when I do a /pg_dump 
> -Fc/ followed by /pg_restore./ /pg_restore /seems to treat all the 
> /GRANT/ statements for a single object as a single statement, and when 
> one errors out, they all error out, meaning I'm left with no privileges 
> on the object in question.
> 
> For instance, when this appears in my plaintext file:
> GRANT ALL ON SCHEMA test TO user1;
> GRANT USAGE ON SCHEMA test TO user2;
> GRANT USAGE ON SCHEMA test TO user3;
> 
> and user1 doesn't exist on the target database, user2 and user3 get the 
> expected privileges when restoring from /psql/ but not from /pg_restore/.
> 
> Here's a reproducible test case.
> 
> CREATE DATABASE prod_db;
> CREATE DATABASE dev_db_pg_restore;
> CREATE DATABASE dev_db_psql;
> CREATE ROLE prod_user;
> CREATE ROLE prod_and_dev_user;
> 
> -- in prod_db
> CREATE SCHEMA test;
> GRANT ALL ON SCHEMA test TO prod_user;
> GRANT USAGE ON SCHEMA test TO prod_and_dev_user;
> 
> pg_dump -Fc prod_db &> prod_dump.bin
> pg_dump prod_db &> prod_dump.sql
> 
> -- On database instance containing the dev dbs.
> DROP ROLE prod_user;

The above needs more information:

1) Are the dev_* databases on a different cluster?

2) If so did you run:

CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

on that cluster first?

Also if so:

In the restores below are you sure you are pointed at the same cluster 
in each case?

> 
> pg_restore prod_dump.bin -d dev_db_pg_restore
> psql -d dev_db_psql -f prod_dump.sqlu 

What do you see if you do:

pg_restore -f prod_dump_restore.sql prod_dump.bin

and look in prod_dump_res?tore.sql?

> 
> -- In dev_db_psql
> SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
> Expected result: true
> Actual result: true
> 
> -- In dev_db_pg_restore
> SELECT has_schema_privilege('prod_and_dev_user', 'test', 'usage');
> Expected result: true
> Actual result: false
> 
> The behavior seems to be related to the fact that /pg_restore/ reports 
> the failed command as containing all the semicolon-delimited privilege 
> command, which get executed separately when restoring from plaintext:
> 
> pg_restore: [archiver (db)] could not execute query: ERROR:  role 
> "prod_user" does not exist
>      Command was: GRANT ALL ON SCHEMA test TO prod_user;
> GRANT USAGE ON SCHEMA test TO prod_and_dev_user;
> 
> As a workaround, I've created a unprivileged dummy user by this name on 
> the dev database, but my question is, is this a bug or feature? If a 
> feature, is the behavior documented? I didn't find any documentation, 
> but that doesn't mean it doesn't exist.
> 
> Thanks,
> Sherrylyn


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: pg_restore restores privileges differently from psql

From
Sherrylyn Branchaw
Date:
The above needs more information:

1) Are the dev_* databases on a different cluster?

2) If so did you run:

CREATE ROLE prod_user;
CREATE ROLE prod_and_dev_user;

on that cluster first?

I happened to put them all on the same cluster for my test case, in order to reproduce the unexpected behavior I encountered in the wild, where the prod and dev dbs happened to live on different clusters. In short, as long as you make sure the prod_user exists on the source cluster at the time when the dump is taken, and doesn't exist on the target cluster when the restore is carried out, you get the behavior I saw.

Also if so:

In the restores below are you sure you are pointed at the same cluster 
in each case?

Yes, I am sure. Both for the test case I was creating for the mailing list, and for the script where I first encountered this in the wild. Worked like a charm when I used psql, didn't do what I expected when I used pg_restore.

What do you see if you do:

pg_restore -f prod_dump_restore.sql prod_dump.bin

and look in prod_dump_res?tore.sql?

This is exactly what I did when I was first trying to figure out what was going on. I see

GRANT USAGE ON SCHEMA test TO prod_and_dev_user;
GRANT ALL ON SCHEMA test TO prod_user;

If I then use psql to load prod_dump_restore.sql to a cluster that doesn't have the prod_user role, I get the expected behavior (prod_and_dev_user has usage on the schema test), because psql treats each of those statements as a separate command. pg_restore seems to treat them as a single command, judging by the error message and the behavior.

Best,
Sherrylyn

Re: pg_restore restores privileges differently from psql

From
Adrian Klaver
Date:
On 1/14/19 12:04 PM, Sherrylyn Branchaw wrote:
> The above needs more information:
> 
> 1) Are the dev_* databases on a different cluster?
> 
> 2) If so did you run:
> 
> CREATE ROLE prod_user;
> CREATE ROLE prod_and_dev_user;
> 
> on that cluster first?
> 
> I happened to put them all on the same cluster for my test case, in 
> order to reproduce the unexpected behavior I encountered in the wild, 
> where the prod and dev dbs happened to live on different clusters. In 

I don't see how that can work:

test=# \c prod_db
You are now connected to database "prod_db" as user "postgres".
prod_db=# CREATE SCHEMA test;
CREATE SCHEMA
prod_db=# GRANT ALL ON SCHEMA test TO prod_user; 
 

GRANT 
 

prod_db=# GRANT USAGE ON SCHEMA test TO prod_and_dev_user; 
 

GRANT

prod_db=# \c dev_db_psql
You are now connected to database "dev_db_psql" as user "postgres". 
 

dev_db_psql=# DROP ROLE prod_user;
ERROR:  role "prod_user" cannot be dropped because some objects depend 
on it 

DETAIL:  1 object in database prod_db

> short, as long as you make sure the /prod_user/ exists on

  the source
> cluster at the time when the dump is taken, and doesn't exist on the 
> target cluster when the restore is carried out, you get the behavior I saw.
> 
> Also if so:
> 
> In the restores below are you sure you are pointed at the same cluster
> in each case?
> 
> Yes, I am sure. Both for the test case I was creating for the mailing 
> list, and for the script where I first encountered this in the wild. 
> Worked like a charm when I used /psql/, didn't do what I expected when I 
> used /pg_restore/.
> 
> What do you see if you do:
> 
> pg_restore -f prod_dump_restore.sql prod_dump.bin
> 
> and look in prod_dump_res?tore.sql?
> 
> This is exactly what I did when I was first trying to figure out what 
> was going on. I see
> 
> GRANT USAGE ON SCHEMA test TO prod_and_dev_user;
> GRANT ALL ON SCHEMA test TO prod_user;
> 
> If I then use /psql/ to load /prod_dump_restore.sql/ to a cluster that 
> doesn't have the /prod_user /role, I get the expected behavior 
> (/prod_and_dev_user/ has usage on the schema /test/), because /psql/ 
> treats each of those statements as a separate command. /pg_restore/ 
> seems to treat them as a single command, judging by the error message 
> and the behavior.
> 
> Best,
> Sherrylyn


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: pg_restore restores privileges differently from psql

From
Sherrylyn Branchaw
Date:
I don't see how that can work:

test=# \c prod_db
You are now connected to database "prod_db" as user "postgres".
prod_db=# CREATE SCHEMA test;
CREATE SCHEMA
prod_db=# GRANT ALL ON SCHEMA test TO prod_user; 


GRANT 


prod_db=# GRANT USAGE ON SCHEMA test TO prod_and_dev_user; 


GRANT

prod_db=# \c dev_db_psql
You are now connected to database "dev_db_psql" as user "postgres". 


dev_db_psql=# DROP ROLE prod_user;
ERROR:  role "prod_user" cannot be dropped because some objects depend 
on it 

DETAIL:  1 object in database prod_db
Yes, if you're going to put all your databases on the same cluster, you first have to remove dependent objects before dropping the role. There are multiple ways of going about that: dropping the database, revoking the privileges on the objects in question, etc. If you put the databases on different clusters and make sure you create only the prod_and_dev_user on the second cluster, you won't run into this issue.

The goal is to make sure the prod_user role exists when the dump is taken and doesn't exist when the restore is done. You can do this by putting the databases on separate clusters and creating the appropriate roles, or by dropping the user on the single cluster.

Sorry, I considered spelling all this out in the original post, because there are two different ways of going about making sure the user isn't present for the restore, but it seemed unnecessarily complicated, and I thought I would let people decide what makes sense in their own environment for testing. Sorry if that led to more confusion in the end. Just make sure the user exists when you need it to exist and doesn't exist when you need it not to exist, and test both the pg_restore and psql methods, and I expect you'll see the same behavior I did (and if not, I'll be very curious).

Best,
Sherrylyn

Re: pg_restore restores privileges differently from psql

From
Tom Lane
Date:
Sherrylyn Branchaw <sbranchaw@gmail.com> writes:
> I'm restoring the schema from one database (prod) to another (dev). There
> are users that exist in prod that don't exist in dev. When the restore job
> tries to grant privileges to nonexistent users in dev, I would like it to
> generate an error, which is safe to ignore, but still correctly grant
> privileges to any user that does exist in dev.

> That's the behavior I see when I dump to a plain file and restore it
> using *psql
> -f*, but not the behavior I see when I do a *pg_dump -Fc* followed by
> *pg_restore.* *pg_restore *seems to treat all the *GRANT* statements for a
> single object as a single statement, and when one errors out, they all
> error out, meaning I'm left with no privileges on the object in question.

Yeah, this is a known issue --- the various GRANTs for a specific object
are stored in a single "TOC entry" in the archive, which pg_restore will
send to the server in a single PQexec call, causing them to be effectively
one transaction.  The easiest way to deal with it is to not send
pg_restore's output directly to the target server, but feed it through
psql, something like

    pg_restore  ... | psql [connection parameters]

There's been some discussion of a real fix, but it seems messy.
pg_restore doesn't have a parser that would be adequate to separate
out multiple SQL commands in a TOC entry, and we'd rather not try
to give it one (mainly because of fear of cross-version compatibility
issues).

            regards, tom lane


Re: pg_restore restores privileges differently from psql

From
Adrian Klaver
Date:
On 1/14/19 12:57 PM, Sherrylyn Branchaw wrote:
> I don't see how that can work:
> 
> test=# \c prod_db
> You are now connected to database "prod_db" as user "postgres".
> prod_db=# CREATE SCHEMA test;
> CREATE SCHEMA
> prod_db=# GRANT ALL ON SCHEMA test TO prod_user;
> 
> 
> GRANT
> 
> 
> prod_db=# GRANT USAGE ON SCHEMA test TO prod_and_dev_user;
> 
> 
> GRANT
> 
> prod_db=# \c dev_db_psql
> You are now connected to database "dev_db_psql" as user "postgres".
> 
> 
> dev_db_psql=# DROP ROLE prod_user;
> ERROR:  role "prod_user" cannot be dropped because some objects depend
> on it
> 
> DETAIL:  1 object in database prod_db
> Yes, if you're going to put all your databases on the same cluster, you 
> first have to remove dependent objects before dropping the role. There 
> are multiple ways of going about that: dropping the database, revoking 
> the privileges on the objects in question, etc. If you put the databases 
> on different clusters and make sure you create only the 
> prod_and_dev_user on the second cluster, you won't run into this issue.
> 
> The goal is to make sure the prod_user role exists when the dump is 
> taken and doesn't exist when the restore is done. You can do this by 
> putting the databases on separate clusters and creating the appropriate 
> roles, or by dropping the user on the single cluster.
> 
> Sorry, I considered spelling all this out in the original post, because 
> there are two different ways of going about making sure the user isn't 

True. The thing is that there are a lot of moving parts to this and the 
more detail you give the better the chance that someone trying to 
replicate actually does replicate your setup. Removes some of the 
uncertainty when the results diverge.

> present for the restore, but it seemed unnecessarily complicated, and I 
> thought I would let people decide what makes sense in their own 
> environment for testing. Sorry if that led to more confusion in the end. 
> Just make sure the user exists when you need it to exist and doesn't 
> exist when you need it not to exist, and test both the pg_restore and 
> psql methods, and I expect you'll see the same behavior I did (and if 
> not, I'll be very curious).'

Well if I use two clusters and:

DROP ROLE prod_user;

on the one that has the dev_* databases then I can replicate.

I am interested in this as I have noted another divergence between a 
text dump and a binary dump, which as yet remains unanswered:

https://www.postgresql.org/message-id/b7a24043-1c9c-1876-f06a-8f916293c142%40aklaver.com


> 
> Best,
> Sherrylyn


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: pg_restore restores privileges differently from psql

From
Sherrylyn Branchaw
Date:
Yeah, this is a known issue --- the various GRANTs for a specific object
are stored in a single "TOC entry" in the archive, which pg_restore will
send to the server in a single PQexec call, causing them to be effectively
one transaction.  The easiest way to deal with it is to not send
pg_restore's output directly to the target server, but feed it through
psql, something like

        pg_restore  ... | psql [connection parameters]

There's been some discussion of a real fix, but it seems messy.
pg_restore doesn't have a parser that would be adequate to separate
out multiple SQL commands in a TOC entry, and we'd rather not try
to give it one (mainly because of fear of cross-version compatibility
issues).

Okay, thank you. I thought that might be the case: undesirable behavior where the costs outweigh the benefits of fixing. Given that, would it be worth making it more obvious in the pg_restore documentation that pg_restore and its psql output don't always give the same results in the target database?

Best,
Sherrylyn