Re: pg_upgrade fails with non-standard ACL - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: pg_upgrade fails with non-standard ACL
Date
Msg-id 20190820150457.GR16436@tamriel.snowman.net
Whole thread Raw
In response to Re: pg_upgrade fails with non-standard ACL  (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses Re: pg_upgrade fails with non-standard ACL
List pgsql-hackers
Greetings,

* Anastasia Lubennikova (a.lubennikova@postgrespro.ru) wrote:
> 14.08.2019 3:28, Stephen Frost wrote:
> >* Bruce Momjian (bruce@momjian.us) wrote:
> >>As much as it would be nice if the release notes covered all that, and
> >>we updated pg_upgrade to somehow handle them, it just isn't realistic.
> >>As we can see here, the problems often take years to show up, and even
> >>then there were probably many other people who had the problem who never
> >>reported it to us.
> >Yeah, the possible changes when you think about column-level privileges
> >as well really gets to be quite large..
> >
> >This is why my thinking is that we should come up with additional
> >default roles, which aren't tied to specific catalog structures but
> >instead are for a more general set of capabilities which we manage and
> >users can either decide to use, or not.  If they decide to work with the
> >individual functions then they have to manage the upgrade process if and
> >when we make changes to those functions.
>
> Idea of having special roles looks good to me, though, I don't see
> how to define what grants are needed for each role. Let's say, we
> define role "backupuser" that obviously must have grants on
> pg_start_backup()
> and pg_stop_backup(). Should it also have access to pg_is_in_recovery()
> or for example version()?

pg_is_in_recovery() and version() are already allowed to be executed by
public, and I don't see any particular reason to change that, so I don't
believe those would need to be explicitly GRANT'd to this new role.

I would think the specific set would be those listed under:

https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-BACKUP

which currently require superuser access.

This isn't a new idea, btw, there was a great deal of discussion three
years ago around all of this.  Particularly relevant is this:

https://www.postgresql.org/message-id/20160104175516.GC3685%40tamriel.snowman.net

> >It'd be pretty neat if pg_upgrade could connect to the old and new
> >clusters concurrently and then perform a generic catalog comparison
> >between them and identify all objects which have been changed and
> >determine if there's any non-default ACLs or dependencies on the catalog
> >objects which are different between the clusters.  That seems like an
> >awful lot of work though, and I'm not sure there's really any need,
> >given that we don't change the catalog for a given major version- we
> >could just generate the list using some queries whenever we release a
> >new major version and update pg_upgrade with it.
> >
> >>The only positive is that when pg_upgrade does fail, at least we have a
> >>system that clearly points to the cause, but unfortunately usually at
> >>run-time, not at --check time.
> >Getting it to be at check time would certainly be a great improvement.
> >
> >Solving this in pg_upgrade does seem like it's probably the better
> >approach rather than trying to do it in pg_dump.  Unfortunately, that
> >likely means that all we can do is have pg_upgrade point out to the user
> >when something will fail, with recommendations on how to address it, but
> >that's also something users are likely used to and willing to accept,
> >and puts the onus on them to consider their ACL decisions when we're
> >making catalog changes, and it keeps these issues out of pg_dump.
>
> I wrote a prototype to check API and ACL compatibility (see attachment).
> In the current implementation it fetches the list of system procedures for
> both old and new clusters
> and reports deleted functions or ACL changes during pg_upgrade check:
>
>
> Performing Consistency Checks
> -----------------------------
> ...
> Checking for system functions API compatibility
> dbname postgres : check procsig is equal pg_stop_backup(), procacl not equal
> {anastasia=X/anastasia,backup=X/anastasia} vs {anastasia=X/anastasia}
> dbname postgres : procedure pg_stop_backup(exclusive boolean, OUT lsn
> pg_lsn, OUT labelfile text, OUT spcmapfile text) doesn't exist in
> new_cluster
> dbname postgres : procedure pg_switch_xlog() doesn't exist in new_cluster
> dbname postgres : procedure pg_xlog_replay_pause() doesn't exist in
> new_cluster
> dbname postgres : procedure pg_xlog_replay_resume() doesn't exist in
> new_cluster
> ...
>
> I think it's a good first step in the right direction.
> Now I have questions about implementation details:
>
> 1) How exactly should we report this incompatibility to a user?
> I think it's fine to leave the warnings and also write some hint for the
> user by analogy with other checks.
> "Reset ACL on the problem functions to default in the old cluster to
> continue"
>
> Is it enough?

Not really sure what else we could do there..?  Did you have something
else in mind?  We could possibly provide the specific commands to run,
that seems like about the only other thing we could possibly do.

> 2) This approach can be extended to other catalog modifications, you
> mentioned above.
> I don't see what else can break pg_upgrade in the same way. However, I don't
> mind
> implementing more checks, while I work on this issue, if you can point on
> them.

I was thinking of, for example, column-level privileges on system
relations (tables or views) where that column was later removed, for
example.  I do appreciate that such changes are relatively rare but they
do happen...

Will try to take a look at the actual patch later today.

Thanks,

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Alex
Date:
Subject: Re: understand the pg locks in in an simple case
Next
From: Sergei Kornilov
Date:
Subject: Re: Change ereport level for QueuePartitionConstraintValidation