Thread: improve predefined roles documentation

improve predefined roles documentation

From
Nathan Bossart
Date:
IMHO there are a couple of opportunities for improving the predefined roles
documentation [0]:

* Several of the roles in the table do not have corresponding descriptions
  in the paragraphs below the table (e.g., pg_read_all_data,
  pg_write_all_data, pg_checkpoint, pg_maintain,
  pg_use_reserved_connections, and pg_create_subscription).  Furthermore,
  IMHO it is weird to have some of the information in the table and some
  more in a paragraph down the page.

* The table has grown quite a bit over the years, but the entries are
  basically unordered, requiring readers to perform a linear search (O(n))
  to find information about a specific role.

* Documentation that refers to these roles cannot link to a specific one.
  Currently, we just link to the page or the table.
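
For reference, a minimal sketch of how the roles under discussion can be listed on a running server. It assumes only that predefined roles are the ones carrying the reserved "pg_" name prefix; the output varies by server version.

```sql
-- List the predefined roles known to this server.
-- Role names starting with "pg_" are reserved for the system.
SELECT rolname
FROM pg_roles
WHERE rolname ~ '^pg_'
ORDER BY rolname;
```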

I think we could improve matters by abandoning the table and instead
documenting these roles more like we document GUCs, i.e., each one has a
section below it where we can document it in as much detail as we want.
Some of these roles should probably be documented together (e.g.,
pg_read_all_data and pg_write_all_data), so the ordering is unlikely to be
perfect, but I'm hoping it would still be a net improvement.

Thoughts?

[0] https://www.postgresql.org/docs/devel/predefined-roles.html

-- 
nathan



Re: improve predefined roles documentation

From
"David G. Johnston"
Date:
On Thu, Jun 13, 2024 at 12:48 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> I think we could improve matters by abandoning the table and instead
> documenting these roles more like we document GUCs, i.e., each one has a
> section below it where we can document it in as much detail as we want.


One of the main attributes for the GUCs is their category.  If we want to improve organization we'd need to assign categories first.  We already implicitly do so in the description section where we do group them together and explain why - but it is all informal.  But getting rid of those groupings and descriptions and isolating each role so it can be linked to more easily seems like a net loss in usability.

I'm against getting rid of the table.  If we do add authoritative subsection anchors we should just do like we do in System Catalogs and make the existing table name values hyperlinks to those newly added anchors.  Breaking the one table up into multiple tables along category lines is something to consider.

David J.

Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Thu, Jun 13, 2024 at 01:05:33PM -0700, David G. Johnston wrote:
> One of the main attributes for the GUCs is their category.  If we want to
> improve organization we'd need to assign categories first.  We already
> implicitly do so in the description section where we do group them together
> and explain why - but it is all informal.  But getting rid of those
> groupings and descriptions and isolating each role so it can be linked to
> more easily seems like a net loss in usability.

What I had in mind is that we would retain these groupings.  I agree that
isolating roles like pg_read_all_data and pg_write_all_data would be no
good.

-- 
nathan



Re: improve predefined roles documentation

From
Robert Haas
Date:
On Thu, Jun 13, 2024 at 3:48 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> I think we could improve matters by abandoning the table and instead
> documenting these roles more like we document GUCs, i.e., each one has a
> section below it where we can document it in as much detail as we want.
> Some of these roles should probably be documented together (e.g.,
> pg_read_all_data and pg_write_all_data), so the ordering is unlikely to be
> perfect, but I'm hoping it would still be a net improvement.

+1. I'm not sure about all of the details, but I like the general idea.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Mon, Jun 17, 2024 at 02:10:22PM -0400, Robert Haas wrote:
> On Thu, Jun 13, 2024 at 3:48 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> I think we could improve matters by abandoning the table and instead
>> documenting these roles more like we document GUCs, i.e., each one has a
>> section below it where we can document it in as much detail as we want.
>> Some of these roles should probably be documented together (e.g.,
>> pg_read_all_data and pg_write_all_data), so the ordering is unlikely to be
>> perfect, but I'm hoping it would still be a net improvement.
> 
> +1. I'm not sure about all of the details, but I like the general idea.

Here is a first try.  I did pretty much exactly what I proposed in the
quoted text, so I don't have much else to say about it.  I didn't see an
easy way to specify multiple ids and xreflabels for a given entry, so the
entries that describe multiple roles just use the name of the first role
listed.  In practice, I think this just means you need to do a little extra
work when linking to one of the other roles from elsewhere in the docs,
which doesn't seem too terrible.

-- 
nathan

Attachment

Re: improve predefined roles documentation

From
"David G. Johnston"
Date:
On Tue, Jun 18, 2024 at 9:52 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Mon, Jun 17, 2024 at 02:10:22PM -0400, Robert Haas wrote:
>> On Thu, Jun 13, 2024 at 3:48 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>>> I think we could improve matters by abandoning the table and instead
>>> documenting these roles more like we document GUCs, i.e., each one has a
>>> section below it where we can document it in as much detail as we want.
>>> Some of these roles should probably be documented together (e.g.,
>>> pg_read_all_data and pg_write_all_data), so the ordering is unlikely to be
>>> perfect, but I'm hoping it would still be a net improvement.
>>
>> +1. I'm not sure about all of the details, but I like the general idea.
>
> Here is a first try.  I did pretty much exactly what I proposed in the
> quoted text, so I don't have much else to say about it.  I didn't see an
> easy way to specify multiple ids and xreflabels for a given entry, so the
> entries that describe multiple roles just use the name of the first role
> listed.  In practice, I think this just means you need to do a little extra
> work when linking to one of the other roles from elsewhere in the docs,
> which doesn't seem too terrible.


I like this.  Losing the table turned out to be ok.  Thank you.

I would probably put pg_monitor first in the list.

+ A user granted this role cannot however send signals to a backend owned by a superuser.

Remove "however", or put commas around it.  I prefer the first option.

Do we really need to repeat "even without having it explicitly" everywhere?

+ This role does not have the role attribute BYPASSRLS set.

Even if it did, that attribute isn't inherited anyway...

"This role is still governed by any row level security policies that may be in force.  Consider setting the BYPASSRLS attribute on member roles."

(Assuming they intend it to cover ALL data, setting BYPASSRLS doesn't hurt even if they are not using RLS today.)

pg_stat_scan_tables - This explanation leaves me wanting more.  Maybe give an example of such a function?  I think the bar is set a bit too high just talking about a specific lock level.

"As these roles are able to access any file on the server file system,"

We forbid running under root so this isn't really true.  They do have operating system level access logged in as the database process owner.  They are able to access all PostgreSQL files on the server file system and usually can run a wide variety of commands on the server.

"access, therefore great care should be taken"

I would go with:

"access.  Great care should be taken"

Seems more impactful as its own sentence than at the end of a long multi-part sentence.

"server with COPY any other file-access functions." - s/with/using/

David J.


Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Thu, Jun 20, 2024 at 07:57:16PM -0700, David G. Johnston wrote:
> I like this.  Losing the table turned out to be ok.  Thank you.

Awesome.

> I would probably put pg_monitor first in the list.

Done.

> + A user granted this role cannot however send signals to a backend owned
> by a superuser.
> 
> Remove "however", or put commas around it.  I prefer the first option.

This sentence caught my eye earlier, too, because it seems to imply that a
superuser granted this role cannot signal superuser-owned backends.  I
changed it to the following:

    Note that this role does not permit signaling backends owned by a
    superuser.

How does that sound?
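
A minimal sketch of the behavior that wording describes, assuming two hypothetical roles, ops_user (granted pg_signal_backend) and app_user (an ordinary, non-superuser role):

```sql
-- "ops_user" and "app_user" are hypothetical role names.
GRANT pg_signal_backend TO ops_user;

-- Run as ops_user: cancel the running queries of app_user's sessions.
SELECT pid, pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE usename = 'app_user'
  AND pid <> pg_backend_pid();

-- The same call aimed at a backend owned by a superuser is expected
-- to be rejected, since the role does not permit signaling those.
```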

> Do we really need to repeat "even without having it explicitly" everywhere?

Removed.

> + This role does not have the role attribute BYPASSRLS set.
> 
> Even if it did, that attribute isn't inherited anyway...
> 
> "This role is still governed by any row level security policies that may be
> in force.  Consider setting the BYPASSRLS attribute on member roles."
> 
> (Assuming they intend it to cover ALL data, setting BYPASSRLS doesn't hurt
> even if they are not using RLS today.)

How does something like the following sound?

    This role does not bypass row-level security (RLS) policies.  If RLS is
    being used, an administrator may wish to set BYPASSRLS on roles which
    this role is granted to.
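
As a sketch of what that advice might look like in practice, assuming a hypothetical role named reporting_user and a session with sufficient privileges (typically a superuser):

```sql
-- "reporting_user" is a hypothetical role name.
GRANT pg_read_all_data TO reporting_user;

-- BYPASSRLS is a role attribute, and attributes are not inherited
-- through membership, so it is set on the member role itself.
ALTER ROLE reporting_user BYPASSRLS;
```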

> pg_stat_scan_tables - This explanation leaves me wanting more.  Maybe give
> an example of such a function?  I think the bar is set a bit too high just
> talking about a specific lock level.

I was surprised to learn that this role only provides privileges for
functions in contrib/ modules.  Anyway, added an example.

> "As these roles are able to access any file on the server file system,"
> 
> We forbid running under root so this isn't really true.  They do have
> operating system level access logged in as the database process owner.
> They are able to access all PostgreSQL files on the server file system and
> usually can run a wide variety of commands on the server.

I just deleted this clause.

> "access, therefore great care should be taken"
> 
> I would go with:
> 
> "access.  Great care should be taken"
> 
> Seems more impactful as its own sentence than at the end of a long
> multi-part sentence.

Done.

> "server with COPY any other file-access functions." - s/with/using/

Done.

-- 
nathan

Attachment

Re: improve predefined roles documentation

From
Robert Haas
Date:
On Fri, Jun 21, 2024 at 11:40 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
> Done.

If you look at how the varlistentries begin, there are three separate patterns:

* Some document a single role and start with "Allow doing blah blah blah".

* Some document a couple of roles so there are several paragraphs,
each beginning with "<literal>name_of_role</literal> allows doing blah
blah blah". This is sometimes preceded by an introductory paragraph
explaining why this group of roles exists and what it's intended to
do.

* pg_database_owner is completely different from the rest, focusing on
explaining who is in the role rather than what the role gets to do.

I think the first two cases could be made more like each other by
changing the varlistentries that are just about one setting to use the
second format instead of the first, e.g. pg_checkpoint allows
executing the CHECKPOINT command.

I don't know what to do about pg_database_owner. I almost wonder if
that should be moved out of the table and documented as a special
case. Or maybe some more wordsmithing would add clarity. Or maybe it's
fine as-is.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Mon, Jun 24, 2024 at 02:44:33PM -0400, Robert Haas wrote:
> I think the first two cases could be made more like each other by
> changing the varlistentries that are just about one setting to use the
> second format instead of the first, e.g. pg_checkpoint allows
> executing the CHECKPOINT command.

Done.

> I don't know what to do about pg_database_owner. I almost wonder if
> that should be moved out of the table and documented as a special
> case. Or maybe some more wordsmithing would add clarity. Or maybe it's
> fine as-is.

I've left it alone for now.  I thought about adding something like
"pg_database_owner does not provide any special capabilities or access
out-of-the-box" to the beginning of the entry, but I don't have time at the
moment to properly wordsmith the rest.  If anyone else wants to give it a
try before I get to it (probably tomorrow), please be my guest.  TBH I
think the existing content is pretty good, so I'm not opposed to leaving it
alone, even if the style is different than the other entries.

-- 
nathan

Attachment

Re: improve predefined roles documentation

From
"David G. Johnston"
Date:
On Mon, Jun 24, 2024 at 2:53 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Mon, Jun 24, 2024 at 02:44:33PM -0400, Robert Haas wrote:
>> I don't know what to do about pg_database_owner. I almost wonder if
>> that should be moved out of the table and documented as a special
>> case. Or maybe some more wordsmithing would add clarity. Or maybe it's
>> fine as-is.
>
> I've left it alone for now.  I thought about adding something like
> "pg_database_owner does not provide any special capabilities or access
> out-of-the-box" to the beginning of the entry, but I don't have time at the
> moment to properly wordsmith the rest.  If anyone else wants to give it a
> try before I get to it (probably tomorrow), please be my guest.

This feels like a case where why is more important than what, so here's my first draft suggestion.

pg_database_owner owns the initially created public schema and has an implicit membership list of one - the role owning the connected-to database.  It exists to encourage and facilitate best practices regarding database administration.  The primary rule being to avoid using superuser to own or do things.  The bootstrap superuser thus should connect to the postgres database and create a login role, with the createdb attribute, and then use that role to create and administer additional databases.  In that context, this feature allows the creator of the new database to log into it and immediately begin working in the public schema.

As a result, in version 14, PostgreSQL no longer initially grants create and usage privileges, on the public schema, to the public pseudo-role.

For technical reasons, pg_database_owner may not participate in explicitly granted role memberships.  This is an easily mitigated limitation since the role that owns the database may be a group and any inheriting members of that group will be considered owners as well.
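
The workflow described in the first paragraph could be sketched roughly as follows. The names dbadmin and appdb are hypothetical, and the last step assumes the public schema is owned by pg_database_owner, as the draft states.

```sql
-- As the bootstrap superuser, connected to the postgres database.
-- "dbadmin" and "appdb" are hypothetical names.
CREATE ROLE dbadmin LOGIN CREATEDB;

-- Then, connected as dbadmin:
CREATE DATABASE appdb;

-- Finally, connected to appdb as dbadmin. As the database owner,
-- dbadmin is the implicit member of pg_database_owner and can work
-- in the public schema right away.
CREATE TABLE public.example (id integer);
```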

David J.

Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Mon, Jun 24, 2024 at 03:53:46PM -0700, David G. Johnston wrote:
> pg_database_owner owns the initially created public schema and has an
> implicit membership list of one - the role owning the connected-to database.
> It exists to encourage and facilitate best practices regarding database
> administration.  The primary rule being to avoid using superuser to own or
> do things.

This part restates much of the existing text in a slightly different order,
but I'm not sure it's an improvement.  I like that it emphasizes the intent
of the role, but the basic description of the role is kind of buried in the
first sentence.  IMO the way this role works is confusing enough that we
ought to keep the basic facts at the very top.  I might even add a bit of
fluff in an attempt to make things clearer:

    The pg_database_owner role always has exactly one implicit,
    situation-dependent member, namely the owner of the current database.

One other thing I like about your proposal is that it moves the bit about
the role initially owning the public schema much earlier.  That seems like
possibly the most important practical piece of information to convey to
administrators.  Perhaps that could be the very next thing after the basic
description of the role.

> The bootstrap superuser thus should connect to the postgres
> database and create a login role, with the createdb attribute, and then use
> that role to create and administer additional databases.  In that context,
> this feature allows the creator of the new database to log into it and
> immediately begin working in the public schema.

IMHO the majority of this is too prescriptive, even if it's generally good
advice.

> As a result, in version 14, PostgreSQL no longer initially grants create
> and usage privileges, on the public schema, to the public pseudo-role.

IME we tend to shy away from adding too many historical details in the
documentation, and I'm not sure this information is directly related enough
to the role to include here.

> For technical reasons, pg_database_owner may not participate in explicitly
> granted role memberships.  This is an easily mitigated limitation since the
> role that owns the database may be a group and any inheriting members of
> that group will be considered owners as well.

IIUC the intent of this is to expand on the following sentence in the
existing docs:

    pg_database_owner cannot be a member of any role, and it cannot have
    non-implicit members.

My instinct would be to do something like this:

    pg_database_owner cannot be granted membership in any role, and no role
    may be granted non-implicit membership in pg_database_owner.

IMHO the part about mitigating this limitation via groups is again too
prescriptive.

-- 
nathan



Re: improve predefined roles documentation

From
Robert Haas
Date:
On Tue, Jun 25, 2024 at 11:35 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
> IIUC the intent of this is to expand on the following sentence in the
> existing docs:
>
>         pg_database_owner cannot be a member of any role, and it cannot have
>         non-implicit members.
>
> My instinct would be to do something like this:
>
>         pg_database_owner cannot be granted membership in any role, and no role
>         may be granted non-implicit membership in pg_database_owner.

But you couldn't grant someone implicit membership either, because
then it wouldn't be implicit. So maybe something like this:

pg_database_owner is a predefined role for which membership consists,
implicitly, of the current database owner. It cannot be granted
membership in any role, and no role can be granted membership in
pg_database_owner. However, like any role, it can own objects or
receive grants of access privileges. Consequently, once
pg_database_owner has rights within a template database, each owner of
a database instantiated from that template will exercise those rights.
Initially, this role owns the public schema, so each database owner
governs local use of the schema.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Tue, Jun 25, 2024 at 12:16:30PM -0400, Robert Haas wrote:
> pg_database_owner is a predefined role for which membership consists,
> implicitly, of the current database owner. It cannot be granted
> membership in any role, and no role can be granted membership in
> pg_database_owner. However, like any role, it can own objects or
> receive grants of access privileges. Consequently, once
> pg_database_owner has rights within a template database, each owner of
> a database instantiated from that template will exercise those rights.
> Initially, this role owns the public schema, so each database owner
> governs local use of the schema.

The main difference between this and the existing documentation is that the
sentence on membership has been rephrased and moved to earlier in the
paragraph.  I think this helps the logical flow a bit.  We first talk about
implicit membership, then explicit membership, then we talk about
privileges and the consequences of those privileges, and finally we talk
about the default privileges.  So, WFM.

-- 
nathan



Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Tue, Jun 25, 2024 at 11:28:18AM -0500, Nathan Bossart wrote:
> On Tue, Jun 25, 2024 at 12:16:30PM -0400, Robert Haas wrote:
>> pg_database_owner is a predefined role for which membership consists,
>> implicitly, of the current database owner. It cannot be granted
>> membership in any role, and no role can be granted membership in
>> pg_database_owner. However, like any role, it can own objects or
>> receive grants of access privileges. Consequently, once
>> pg_database_owner has rights within a template database, each owner of
>> a database instantiated from that template will exercise those rights.
>> Initially, this role owns the public schema, so each database owner
>> governs local use of the schema.
> 
> The main difference between this and the existing documentation is that the
> sentence on membership has been rephrased and moved to earlier in the
> paragraph.  I think this helps the logical flow a bit.  We first talk about
> implicit membership, then explicit membership, then we talk about
> privileges and the consequences of those privileges, and finally we talk
> about the default privileges.  So, WFM.

I used this in v4 (with some minor changes).  I've copied it here to ease
review.

    pg_database_owner always has exactly one implicit member: the current
    database owner. It cannot be granted membership in any role, and no
    role can be granted membership in pg_database_owner. However, like any
    other role, it can own objects and receive grants of access privileges.
    Consequently, once pg_database_owner has rights within a template
    database, each owner of a database instantiated from that template will
    possess those rights. Initially, this role owns the public schema, so
    each database owner governs local use of that schema.
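
A few statements that could be used to check this description against a running server. This is a sketch only: some_role is a hypothetical, pre-existing role, and no particular output or error text is assumed.

```sql
-- Which role owns the public schema in the current database?
SELECT nspname, nspowner::regrole AS owner
FROM pg_namespace
WHERE nspname = 'public';

-- Does the current user hold (implicit) membership in pg_database_owner?
-- This should be true for the owner of the current database.
SELECT pg_has_role(current_user, 'pg_database_owner', 'MEMBER');

-- Explicit membership in either direction is expected to be rejected.
-- "some_role" is a hypothetical role name.
GRANT pg_database_owner TO some_role;
GRANT some_role TO pg_database_owner;
```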

-- 
nathan

Attachment

Re: improve predefined roles documentation

From
Robert Haas
Date:
On Tue, Jun 25, 2024 at 3:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> I used this in v4 (with some minor changes).

Looking at this again, how happy are you with the way you've got
several roles per <varlistentry> instead of one for each? I realize
that was probably part of the intent of the change, to move the data
from below the table into the table, and I see the merit of that. But
one of your other complaints was the entries in the table were
unordered, and it's hard for them to really be ordered if you have
groups like this, since you can't alphabetize, for example, unless you
have just a single entry per <varlistentry>.

I don't have a problem with doing it the way you have here if you
think that's good. I'm just asking.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Tue, Jun 25, 2024 at 04:04:03PM -0400, Robert Haas wrote:
> Looking at this again, how happy are you with the way you've got
> several roles per <varlistentry> instead of one for each? I realize
> that was probably part of the intent of the change, to move the data
> from below the table into the table, and I see the merit of that. But
> one of your other complaints was the entries in the table were
> unordered, and it's hard for them to really be ordered if you have
> groups like this, since you can't alphabetize, for example, unless you
> have just a single entry per <varlistentry>.

Yeah, my options were to either separate the roles or to weaken the
ordering, and I guess I felt like the weaker ordering was slightly less
bad.  The extra context in some of the groups seemed worth keeping, and
this probably isn't the only page of our docs that might require ctrl+f.
But I'll yield to the majority opinion here.

-- 
nathan



Re: improve predefined roles documentation

From
"David G. Johnston"
Date:
On Tue, Jun 25, 2024 at 1:19 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> On Tue, Jun 25, 2024 at 04:04:03PM -0400, Robert Haas wrote:
>> Looking at this again, how happy are you with the way you've got
>> several roles per <varlistentry> instead of one for each? I realize
>> that was probably part of the intent of the change, to move the data
>> from below the table into the table, and I see the merit of that. But
>> one of your other complaints was the entries in the table were
>> unordered, and it's hard for them to really be ordered if you have
>> groups like this, since you can't alphabetize, for example, unless you
>> have just a single entry per <varlistentry>.
>
> Yeah, my options were to either separate the roles or to weaken the
> ordering, and I guess I felt like the weaker ordering was slightly less
> bad.  The extra context in some of the groups seemed worth keeping, and
> this probably isn't the only page of our docs that might require ctrl+f.
> But I'll yield to the majority opinion here.


There are few enough that logical grouping instead of strict alphabetical makes sense.

v4 WFM

David J.

Re: improve predefined roles documentation

From
Robert Haas
Date:
On Tue, Jun 25, 2024 at 4:19 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Yeah, my options were to either separate the roles or to weaken the
> ordering, and I guess I felt like the weaker ordering was slightly less
> bad.  The extra context in some of the groups seemed worth keeping, and
> this probably isn't the only page of our docs that might require ctrl+f.
> But I'll yield to the majority opinion here.

I'm not objecting. I'm just asking.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: improve predefined roles documentation

From
Nathan Bossart
Date:
On Wed, Jun 26, 2024 at 10:40:10AM -0400, Robert Haas wrote:
> On Tue, Jun 25, 2024 at 4:19 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> Yeah, my options were to either separate the roles or to weaken the
>> ordering, and I guess I felt like the weaker ordering was slightly less
>> bad.  The extra context in some of the groups seemed worth keeping, and
>> this probably isn't the only page of our docs that might require ctrl+f.
>> But I'll yield to the majority opinion here.
> 
> I'm not objecting. I'm just asking.

Cool.  I'll plan on committing this latest version once v18devel hacking
begins.

-- 
nathan



Re: improve predefined roles documentation

From
Nathan Bossart
Date:
rebased (due to commit ccd3802, which introduced
pg_signal_autovacuum_worker)

-- 
nathan

Attachment

Re: improve predefined roles documentation

From
Nathan Bossart
Date:
Committed.  Thank you for reviewing!

-- 
nathan