Thread: Proposal to sync SET ROLE and pg_stat_activity

Proposal to sync SET ROLE and pg_stat_activity

From
Grant Finnemore
Date:
Hi,

In the manual for SET ROLE, it's noted that an invocation of SET ROLE
will leave the session_user unchanged, but will change the current_user.

Invoking pg_stat_activity after the SET ROLE is changed will however
leave the usename unchanged. (Also from the manual we note that a
snapshot is taken at the first call, although in the case of
current_query and others, the field is updated at regular intervals)

SET SESSION AUTHORIZATION behaves similarly, although in that case,
it's documented that both session_user and current_user are changed
to reflect the new user.

An example:-

test=# select current_user, session_user;
  current_user | session_user
--------------+--------------
  grant        | grant
(1 row)

test=# select usename from pg_stat_activity;
  usename
---------
  grant
(1 row)


test=# set session role bob;
SET
test=> select current_user, session_user;
  current_user | session_user
--------------+--------------
  bob          | grant
(1 row)

test=> select usename from pg_stat_activity;
  usename
---------
  grant
(1 row)


I have on occasion used a database pooling scheme that whenever a
connection is retrieved from the pool, either a SET ROLE or SET
SESSION AUTHORIZATION is issued to enable database level access
restrictions. Similarly, when the connection is returned, a RESET
instruction is issued.

IMHO, it would be advantageous to be able to display which
connections are in use by a given user through the pg_stat_activity
view. Looking through the archives, I've found one other request
for this which AFAICS wasn't answered.

   http://archives.postgresql.org/pgsql-bugs/2007-04/msg00035.php

There are two ways in which this could be done. Firstly, we could
alter the current usename field in the view. This would keep the
view definition the same, but would alter the semantics, which could
affect existing clients. Alternatively, we could introduce another
column that would reflect the role name.

I attach a patch that kinda works for the SET SESSION AUTH case, and
will undertake to complete the work should there be some general
support for this proposal.

Comments?

Regards,
Grant Finnemore
Index: src/backend/postmaster/pgstat.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/postmaster/pgstat.c,v
retrieving revision 1.179
diff -c -r1.179 pgstat.c
*** src/backend/postmaster/pgstat.c    15 Aug 2008 08:37:39 -0000    1.179
--- src/backend/postmaster/pgstat.c    25 Aug 2008 09:08:46 -0000
***************
*** 2264,2269 ****
--- 2264,2294 ----
  }

  /* ----------
+  * pgstat_report_change_authorization()
+  *
+  *      Called from *** to report the changing of session authorization.
+  * ----------
+  */
+ void
+ pgstat_report_change_authorization(void)
+ {
+         volatile PgBackendStatus *beentry = MyBEEntry;
+
+         if (!pgstat_track_activities || !beentry)
+                 return;
+
+         /*
+          * Update my status entry, following the protocol of bumping
+          * st_changecount before and after.  We use a volatile pointer
+          * here to ensure the compiler doesn't try to get cute.
+          */
+         beentry->st_changecount++;
+         beentry->st_userid = GetSessionUserId();
+         beentry->st_changecount++;
+         Assert((beentry->st_changecount & 1) == 0);
+ }
+
+ /* ----------
   * pgstat_report_waiting() -
   *
   *    Called from lock manager to report beginning or end of a lock wait.
Index: src/backend/utils/init/miscinit.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/init/miscinit.c,v
retrieving revision 1.167
diff -c -r1.167 miscinit.c
*** src/backend/utils/init/miscinit.c    27 Mar 2008 17:24:16 -0000    1.167
--- src/backend/utils/init/miscinit.c    25 Aug 2008 09:08:46 -0000
***************
*** 349,354 ****
--- 349,357 ----
      /* We force the effective user IDs to match, too */
      OuterUserId = userid;
      CurrentUserId = userid;
+
+     /* Let the stats subsystem know of the change */
+     pgstat_report_change_authorization();
  }


Index: src/include/pgstat.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/pgstat.h,v
retrieving revision 1.78
diff -c -r1.78 pgstat.h
*** src/include/pgstat.h    15 Aug 2008 08:37:40 -0000    1.78
--- src/include/pgstat.h    25 Aug 2008 09:08:46 -0000
***************
*** 626,631 ****
--- 626,632 ----

  extern void pgstat_report_activity(const char *what);
  extern void pgstat_report_xact_timestamp(TimestampTz tstamp);
+ extern void pgstat_report_change_authorization(void);
  extern void pgstat_report_waiting(bool waiting);
  extern const char *pgstat_get_backend_current_activity(int pid, bool checkUser);


Re: Proposal to sync SET ROLE and pg_stat_activity

From
Euler Taveira de Oliveira
Date:
Grant Finnemore escreveu:
> Invoking pg_stat_activity after the SET ROLE is changed will however
> leave the usename unchanged.
> 
You're right. Because, as you spotted, usename is synonym of session
usename.

> SET SESSION AUTHORIZATION behaves similarly, although in that case,
> it's documented that both session_user and current_user are changed
> to reflect the new user.
> 
Ugh? The manual [1][2] documents the behavior of both commands.

> I have on occasion used a database pooling scheme that whenever a
> connection is retrieved from the pool, either a SET ROLE or SET
> SESSION AUTHORIZATION is issued to enable database level access
> restrictions. Similarly, when the connection is returned, a RESET
> instruction is issued.
> 
I can't see in your use case the advantage of allowing to show current_user.

> IMHO, it would be advantageous to be able to display which
> connections are in use by a given user through the pg_stat_activity
> view.
> 
Isn't it embarrassing if, for example, mary queries pg_stat_activity and
sees that I'm using her role, is it? I'm not against exposing this
information but I think it could be superuser-only.

> There are two ways in which this could be done. Firstly, we could
> alter the current usename field in the view. This would keep the
> view definition the same, but would alter the semantics, which could
> affect existing clients. Alternatively, we could introduce another
> column that would reflect the role name.
> 
Why not add another column: current_usename? I would object if we've
intended to change the view semantics.

[1] http://www.postgresql.org/docs/8.3/static/sql-set-role.html
[2]
http://www.postgresql.org/docs/8.3/static/sql-set-session-authorization.html


--  Euler Taveira de Oliveira http://www.timbira.com/


Re: Proposal to sync SET ROLE and pg_stat_activity

From
Grant Finnemore
Date:
Hi Euler,

Euler Taveira de Oliveira wrote:
> Grant Finnemore escreveu:
>> Invoking pg_stat_activity after the SET ROLE is changed will however
>> leave the usename unchanged.
>>
> You're right. Because, as you spotted, usename is synonym of session
> usename.>
The one problem with this mapping is that per the manual, user is
equivalent to current_user, and so it could be argued that usename
is equivalent to both of these.

>> SET SESSION AUTHORIZATION behaves similarly, although in that case,
>> it's documented that both session_user and current_user are changed
>> to reflect the new user.
>>
> Ugh? The manual [1][2] documents the behavior of both commands.
>
Sorry if I wasn't clear here - I agree that the manual documents this
behaviour. My intent was to use these to highlight the different
between what these display, and what pg_stat_activity displays.

>> I have on occasion used a database pooling scheme that whenever a
>> connection is retrieved from the pool, either a SET ROLE or SET
>> SESSION AUTHORIZATION is issued to enable database level access
>> restrictions. Similarly, when the connection is returned, a RESET
>> instruction is issued.
>>
> I can't see in your use case the advantage of allowing to show current_user.>
Perhaps an example would clarify my use case.

I have a session pool, where all connections to the database are
obtained as a superuser. On issuing connections to the client, we
invoke either SET ROLE or SET SESSION AUTHORIZATION and switch to
a role with less permissions. This means that we don't have to
reserve a connection per user, and we can still use the database
access restrictions.

Now, if someone starts a query that takes a long time, with the
changes I'm proposing, I can see which user is running that query. As
it is now, all I see is a list of connections issued to a superuser.

>> IMHO, it would be advantageous to be able to display which
>> connections are in use by a given user through the pg_stat_activity
>> view.
>>
> Isn't it embarrassing if, for example, mary queries pg_stat_activity and
> sees that I'm using her role, is it? I'm not against exposing this
> information but I think it could be superuser-only.
> 
Well, it could be argued that if it's embarrassing, then the user
using that role is doing something illicit. Also, if we have rights
to switch to another role, then surely that's an intended use?

>> There are two ways in which this could be done. Firstly, we could
>> alter the current usename field in the view. This would keep the
>> view definition the same, but would alter the semantics, which could
>> affect existing clients. Alternatively, we could introduce another
>> column that would reflect the role name.
>>
> Why not add another column: current_usename? I would object if we've
> intended to change the view semantics.
> 
Yeah, my thoughts would be to map user/current_user to usename, and
create a new column for session_user called susename (or something
similar.)

Behaviour would be something along the following lines...

1. Login as user foo
2.  user/current_user = foo, session_user = foo
2a. select usename, susename from pg_stat_activity => (foo, foo)
3. SET ROLE bar
4.  user/current_user = bar, session_user = foo
4a. select usename, susename from pg_stat_activity => (bar, foo)

.. alternatively ..

1. Login as user foo
2.  user/current_user = foo, session_user = foo
2a. select usename, susename from pg_stat_activity => (foo, foo)
3. SET SESSION AUTHORIZATION bar
4.  user/current_user = bar, session_user = bar
4a. select usename, susename from pg_stat_activity => (bar, bar)

Regards,
Grant


Re: Proposal to sync SET ROLE and pg_stat_activity

From
Bernd Helmle
Date:
--On Mittwoch, August 27, 2008 09:35:03 +0200 Grant Finnemore 
<grant@guruhut.com> wrote:

> I have a session pool, where all connections to the database are
> obtained as a superuser. On issuing connections to the client, we
> invoke either SET ROLE or SET SESSION AUTHORIZATION and switch to
> a role with less permissions. This means that we don't have to
> reserve a connection per user, and we can still use the database
> access restrictions.


But you have to ensure that your session pool is smaller than 
max_connections, since this will eat up superuser_reserved_connections and 
would make administrator intervention  impossible under certain 
circumstances.

And why do you need to hack pg_stat_activity, isn't it possible to plug 
your own view in?

--  Thanks
                   Bernd


Re: Proposal to sync SET ROLE and pg_stat_activity

From
Grant Finnemore
Date:
Hi Bernd,

Bernd Helmle wrote:
> --On Mittwoch, August 27, 2008 09:35:03 +0200 Grant Finnemore 
> <grant@guruhut.com> wrote:
> 
>> I have a session pool, where all connections to the database are
>> obtained as a superuser. On issuing connections to the client, we
>> invoke either SET ROLE or SET SESSION AUTHORIZATION and switch to
>> a role with less permissions. This means that we don't have to
>> reserve a connection per user, and we can still use the database
>> access restrictions.
> 
> 
> But you have to ensure that your session pool is smaller than 
> max_connections, since this will eat up superuser_reserved_connections 
> and would make administrator intervention  impossible under certain 
> circumstances.
> 
Yes, but that's the easy part. Any reasonable pooling software allows
you to set max connections.

> And why do you need to hack pg_stat_activity, isn't it possible to plug 
> your own view in?
> 
Well, pg_stat_activity isn't really the problem here, because as you
point out, it's just a view, and I could certainly redefine the view.
The limiting factor is that the backend doesn't push the role name
changes to the stats subsystem for either SET ROLE or SET SESSION
AUTH.

An alternative to changing the current behaviour would be to introduce
new variables in the backend structures that are sent to the stats
subsystem, and which could be read by as yet undefined functions. This
would keep existing behaviour, but allow others to obtain the
alternative behaviour through the creation of a separate view.

Regards,
Grant


Re: Proposal to sync SET ROLE and pg_stat_activity

From
Alvaro Herrera
Date:
Grant Finnemore wrote:

> Well, pg_stat_activity isn't really the problem here, because as you
> point out, it's just a view, and I could certainly redefine the view.
> The limiting factor is that the backend doesn't push the role name
> changes to the stats subsystem for either SET ROLE or SET SESSION
> AUTH.

Keep in mind that stats are updated only once every 500 ms, and messages
have a nontrivial overhead.  With your proposed changes, there would be
a significant performance overhead to running security definer
functions.

A possible solution to this would be to publish current_user in shared
memory, so that remote processes could read it from there (similar to
how current_query is published).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Proposal to sync SET ROLE and pg_stat_activity

From
Grant Finnemore
Date:
Hi Alvaro,

Alvaro Herrera wrote:
> Grant Finnemore wrote:
> 
>> Well, pg_stat_activity isn't really the problem here, because as you
>> point out, it's just a view, and I could certainly redefine the view.
>> The limiting factor is that the backend doesn't push the role name
>> changes to the stats subsystem for either SET ROLE or SET SESSION
>> AUTH.
> 
> Keep in mind that stats are updated only once every 500 ms, and messages
> have a nontrivial overhead.  With your proposed changes, there would be
> a significant performance overhead to running security definer
> functions.
> 
> A possible solution to this would be to publish current_user in shared
> memory, so that remote processes could read it from there (similar to
> how current_query is published).
> 
Yeah, I was concerned about security definer functions, although I
hadn't yet got round to benchmarking the effects.

If there is some consensus that from a user perspective this is a
reasonable enhancement, I'll pursue the changes using your suggestion of
the current_query approach.

Regards,
Grant