Thread: [PATCH] Expose port->authn_id to extensions and triggers

[PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
Hi all,

Stephen pointed out [1] that the authenticated identity that's stored
in MyProcPort can't be retrieved by extensions or triggers. Attached is
a patch that provides both a C API and a SQL function for retrieving
it.

GetAuthenticatedIdentityString() is a mouthful but I wanted to
differentiate it from the existing GetAuthenticatedUserId(); better
names welcome. It only exists as an accessor because I wasn't sure if
extensions outside of contrib were encouraged to rely on the internal
layout of struct Port. (If they can, then that call can go away
entirely.)

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/CAOuzzgpFpuroNRabEvB9kST_TSyS2jFicBNoXvW7G2pZFixyBw%40mail.gmail.com

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Thu, Feb 24, 2022 at 12:15:40AM +0000, Jacob Champion wrote:
> Stephen pointed out [1] that the authenticated identity that's stored
> in MyProcPort can't be retrieved by extensions or triggers. Attached is
> a patch that provides both a C API and a SQL function for retrieving
> it.
>
> GetAuthenticatedIdentityString() is a mouthful but I wanted to
> differentiate it from the existing GetAuthenticatedUserId(); better
> names welcome. It only exists as an accessor because I wasn't sure if
> extensions outside of contrib were encouraged to rely on the internal
> layout of struct Port. (If they can, then that call can go away
> entirely.)

+char *
+GetAuthenticatedIdentityString(void)
+{
+   if (!MyProcPort || !MyProcPort->authn_id)
+       return NULL;
+
+   return pstrdup(MyProcPort->authn_id);

I don't quite see the additional value that this API brings as
MyProcPort is directly accessible, and contrib modules like
postgres_fdw and sslinfo just use that to find the data of the current
backend.  Cannot a module like pgaudit, through the elog hook, do its
work with what we have already in place?

What's the use case for a given session to be able to report back
only its own authn through SQL?

I could still see a use case for that at a more global level with
beentrys, but it looked like there was not much interest the last time
I dropped this idea.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Thu, 2022-02-24 at 20:39 +0900, Michael Paquier wrote:
> I don't quite see the additional value that this API brings as
> MyProcPort is directly accessible, and contrib modules like
> postgres_fdw and sslinfo just use that to find the data of the current
> backend.

Right -- I just didn't know if third-party modules were actually able
to rely on the internal layout of struct Port. Is that guaranteed to
remain constant for a major release line? If so, this new API is
superfluous.

> Cannot a module like pgaudit, through the elog hook, do its
> work with what we have already in place?

Given the above, I would hope so. Stephen mentioned that pgaudit only
had access to the logged-in role, and when I proposed a miscadmin.h API
he said that would help. CC'ing him to see what he meant; I don't know
if pgaudit has additional constraints.

> What's the use case for a given session to be able to report back
> only its own authn through SQL?

That's for triggers to be able to grab the current ID for
logging/auditing. Is there a better way to expose this for that use
case?

> I could still see a use case for that at a more global level with
> beentrys, but it looked like there was not much interest the last time
> I dropped this idea.

I agree that this would be useful to have in the stats. From my outside
perspective, it seems like it's difficult to get strings of arbitrary
length in there; is that accurate?

Thanks,
--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Julien Rouhaud
Date:
Hi,

On Thu, Feb 24, 2022 at 04:50:59PM +0000, Jacob Champion wrote:
> On Thu, 2022-02-24 at 20:39 +0900, Michael Paquier wrote:
> > I don't quite see the additional value that this API brings as
> > MyProcPort is directly accessible, and contrib modules like
> > postgres_fdw and sslinfo just use that to find the data of the current
> > backend.
> 
> Right -- I just didn't know if third-party modules were actually able
> to rely on the internal layout of struct Port. Is that guaranteed to
> remain constant for a major release line? If so, this new API is
> superfluous.

Yes, third-party can rely on Port layout.  We don't break ABI between minor
release.  In some occasions we can add additional fields at the end of a
struct, but nothing more.

> > I could still see a use case for that at a more global level with
> > beentrys, but it looked like there was not much interest the last time
> > I dropped this idea.
> 
> I agree that this would be useful to have in the stats. From my outside
> perspective, it seems like it's difficult to get strings of arbitrary
> length in there; is that accurate?

Yes, as it's all in shared memory.  The only workaround is using something like
track_activity_query_size, but it's not great.



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Fri, 2022-02-25 at 01:15 +0800, Julien Rouhaud wrote:
> On Thu, Feb 24, 2022 at 04:50:59PM +0000, Jacob Champion wrote:
> > On Thu, 2022-02-24 at 20:39 +0900, Michael Paquier wrote:
> > > I don't quite see the additional value that this API brings as
> > > MyProcPort is directly accessible, and contrib modules like
> > > postgres_fdw and sslinfo just use that to find the data of the current
> > > backend.
> > 
> > Right -- I just didn't know if third-party modules were actually able
> > to rely on the internal layout of struct Port. Is that guaranteed to
> > remain constant for a major release line? If so, this new API is
> > superfluous.
> 
> Yes, third-party can rely on Port layout.  We don't break ABI between minor
> release.  In some occasions we can add additional fields at the end of a
> struct, but nothing more.

That simplifies things. PFA a smaller v2; if pgaudit can't make use of
libpq-be.h for some reason, then I guess we can tailor a fix to that
use case.

> > > I could still see a use case for that at a more global level with
> > > beentrys, but it looked like there was not much interest the last time
> > > I dropped this idea.
> > 
> > I agree that this would be useful to have in the stats. From my outside
> > perspective, it seems like it's difficult to get strings of arbitrary
> > length in there; is that accurate?
> 
> Yes, as it's all in shared memory.  The only workaround is using something like
> track_activity_query_size, but it's not great.

Yeah... I was following a similar track with the initial work last
year, but I dropped it when the cost of implementation started to grow
considerably. At the time, though, it looked like some overhauls to the
stats framework were being pursued? I should read up on that thread.

--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Julien Rouhaud
Date:
On Thu, Feb 24, 2022 at 08:44:08PM +0000, Jacob Champion wrote:
> 
> Yeah... I was following a similar track with the initial work last
> year, but I dropped it when the cost of implementation started to grow
> considerably. At the time, though, it looked like some overhauls to the
> stats framework were being pursued? I should read up on that thread.

Do you mean the shared memory stats patchset?  IIUC this is unrelated, as the
beentry stuff Michael was talking about is a different infrastructure
(backend_status.[ch]), and I don't think there are any plans to move that to
dynamic shared memory.



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-02-25 13:01:26 +0800, Julien Rouhaud wrote:
> On Thu, Feb 24, 2022 at 08:44:08PM +0000, Jacob Champion wrote:
> > 
> > Yeah... I was following a similar track with the initial work last
> > year, but I dropped it when the cost of implementation started to grow
> > considerably. At the time, though, it looked like some overhauls to the
> > stats framework were being pursued? I should read up on that thread.
> 
> Do you mean the shared memory stats patchset?  IIUC this is unrelated, as the
> beentry stuff Michael was talking about is a different infrastructure
> (backend_status.[ch]), and I don't think there are any plans to move that to
> dynamic shared memory.

Until a year ago the backend_status.c stuff was in in pgstat.c - I just split
them out because of the shared memory status work - so it'd not be surprising
for them to mentally be thrown in one bucket.

Basically the type of stats we're trying to move to dynamic shared memory is
about counters that should persist for a while and are accumulated across
connections etc. Whereas backend_status.c is more about tracking the current
state (what query is a backend running, what user, etc).

They're not unrelated though: backend_status.c feeds pgstat.c with some
information occasionally.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Julien Rouhaud
Date:
Hi,

On Thu, Feb 24, 2022 at 09:18:26PM -0800, Andres Freund wrote:
> 
> On 2022-02-25 13:01:26 +0800, Julien Rouhaud wrote:
> > On Thu, Feb 24, 2022 at 08:44:08PM +0000, Jacob Champion wrote:
> > > 
> > > Yeah... I was following a similar track with the initial work last
> > > year, but I dropped it when the cost of implementation started to grow
> > > considerably. At the time, though, it looked like some overhauls to the
> > > stats framework were being pursued? I should read up on that thread.
> > 
> > Do you mean the shared memory stats patchset?  IIUC this is unrelated, as the
> > beentry stuff Michael was talking about is a different infrastructure
> > (backend_status.[ch]), and I don't think there are any plans to move that to
> > dynamic shared memory.
> 
> Until a year ago the backend_status.c stuff was in in pgstat.c - I just split
> them out because of the shared memory status work - so it'd not be surprising
> for them to mentally be thrown in one bucket.

Right.

> Basically the type of stats we're trying to move to dynamic shared memory is
> about counters that should persist for a while and are accumulated across
> connections etc. Whereas backend_status.c is more about tracking the current
> state (what query is a backend running, what user, etc).

But would it be acceptable to use dynamic shared memory in backend_status and
e.g. have a dsa_pointer rather than a fixed length array?  That seems like a
bad idea for query text in general, but for authn_id for instance it seems less
likely to hold gigantic strings, and also have more or less consistent size
when provided.



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Thu, 2022-02-24 at 20:44 +0000, Jacob Champion wrote:
> That simplifies things. PFA a smaller v2; if pgaudit can't make use of
> libpq-be.h for some reason, then I guess we can tailor a fix to that
> use case.

Ha, opr_sanity caught my use of cstring. I'll work on a fix later
today.

--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Fri, 2022-02-25 at 16:28 +0000, Jacob Champion wrote:
> Ha, opr_sanity caught my use of cstring. I'll work on a fix later
> today.

Fixed in v3.

--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
On 2022-02-25 20:19:24 +0000, Jacob Champion wrote:
> From 2fde60a6bc3739f1894c8c264120e4fa0f04df64 Mon Sep 17 00:00:00 2001
> From: Jacob Champion <pchampion@vmware.com>
> Date: Mon, 14 Feb 2022 08:10:53 -0800
> Subject: [PATCH v3] Add API to retrieve authn_id from SQL

> The authn_id field in MyProcPort is currently only accessible to the
> backend itself.  Add a SQL function, session_authn_id(), to expose the
> field to triggers that may want to make use of it.

Looks to me like authn_id isn't synchronized to parallel workers right now. So
the function will return the wrong thing when executed as part of a parallel
query.

I don't think we should add further functions not prefixed with pg_.


Perhaps a few tests for less trivial authn_ids could be worthwhile?
E.g. certificate DNs.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-02-25 14:25:52 +0800, Julien Rouhaud wrote:
> > Basically the type of stats we're trying to move to dynamic shared memory is
> > about counters that should persist for a while and are accumulated across
> > connections etc. Whereas backend_status.c is more about tracking the current
> > state (what query is a backend running, what user, etc).
> 
> But would it be acceptable to use dynamic shared memory in backend_status and
> e.g. have a dsa_pointer rather than a fixed length array?

Might be OK, but it does add a fair bit of complexity. Suddenly there's a
bunch more initialization order dependencies that you don't have right
now. I'd not go there for just this.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
> Looks to me like authn_id isn't synchronized to parallel workers right now. So
> the function will return the wrong thing when executed as part of a parallel
> query.

FWIW, I am not completely sure what's the use case for being able to
see the authn of the current session through a trigger.  We expose
that when log_connections is enabled, for audit purposes.  I can also
get behind something more central so as one can get a full picture of
the authn used by a bunch of session, particularly with complex HBA
policies, but this looks rather limited to me in usability.  Perhaps
that's not enough to stand as an objection, though, and the patch is
dead simple.

> I don't think we should add further functions not prefixed with pg_.

Yep.

> Perhaps a few tests for less trivial authn_ids could be worthwhile?
> E.g. certificate DNs.

Yes, src/test/ssl would handle that just fine.  Now, this stuff
already looks after authn results with log_connections=on, so that
feels like a duplicate.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
> Looks to me like authn_id isn't synchronized to parallel workers right now. So
> the function will return the wrong thing when executed as part of a parallel
> query.

Thanks for the catch. It looks like MyProcPort is left empty, and other
functions that rely on like inet_server_addr() are marked parallel-
restricted, so I've done the same in v4.

On Sat, 2022-02-26 at 14:39 +0900, Michael Paquier wrote:
> FWIW, I am not completely sure what's the use case for being able to
> see the authn of the current session through a trigger.  We expose
> that when log_connections is enabled, for audit purposes.  I can also
> get behind something more central so as one can get a full picture of
> the authn used by a bunch of session, particularly with complex HBA
> policies, but this looks rather limited to me in usability.  Perhaps
> that's not enough to stand as an objection, though, and the patch is
> dead simple.

I'm primarily motivated by the linked thread -- if the gap between
builtin roles and authn_id are going to be used as ammo against other
security features, then let's close that gap. But I think it's fair to
say that if someone is already using triggers to exhaustively audit a
table, it'd be nice to have this info in the same place too.

> > I don't think we should add further functions not prefixed with pg_.
> 
> Yep.

Fixed.

> > Perhaps a few tests for less trivial authn_ids could be worthwhile?
> > E.g. certificate DNs.
> 
> Yes, src/test/ssl would handle that just fine.  Now, this stuff
> already looks after authn results with log_connections=on, so that
> feels like a duplicate.

It was easy enough to add, so I added it. I suppose it does protect
against any reimplementations of pg_session_authn_id() that can't
handle longer ID strings, though I admit that's a stretch.

Thanks,
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Stephen Frost
Date:
Greetings,

* Jacob Champion (pchampion@vmware.com) wrote:
> On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
> > Looks to me like authn_id isn't synchronized to parallel workers right now. So
> > the function will return the wrong thing when executed as part of a parallel
> > query.
>
> Thanks for the catch. It looks like MyProcPort is left empty, and other
> functions that rely on like inet_server_addr() are marked parallel-
> restricted, so I've done the same in v4.

That's probably alright.

> On Sat, 2022-02-26 at 14:39 +0900, Michael Paquier wrote:
> > FWIW, I am not completely sure what's the use case for being able to
> > see the authn of the current session through a trigger.  We expose
> > that when log_connections is enabled, for audit purposes.  I can also
> > get behind something more central so as one can get a full picture of
> > the authn used by a bunch of session, particularly with complex HBA
> > policies, but this looks rather limited to me in usability.  Perhaps
> > that's not enough to stand as an objection, though, and the patch is
> > dead simple.
>
> I'm primarily motivated by the linked thread -- if the gap between
> builtin roles and authn_id are going to be used as ammo against other
> security features, then let's close that gap. But I think it's fair to
> say that if someone is already using triggers to exhaustively audit a
> table, it'd be nice to have this info in the same place too.

Yeah, we really should make this available to trigger-based auditing
systems too and not just through log_connections which involves a great
deal more log parsing and work external to the database to figure out
who did what.

> > > I don't think we should add further functions not prefixed with pg_.
> >
> > Yep.
>
> Fixed.

That's fine.

> > > Perhaps a few tests for less trivial authn_ids could be worthwhile?
> > > E.g. certificate DNs.
> >
> > Yes, src/test/ssl would handle that just fine.  Now, this stuff
> > already looks after authn results with log_connections=on, so that
> > feels like a duplicate.
>
> It was easy enough to add, so I added it. I suppose it does protect
> against any reimplementations of pg_session_authn_id() that can't
> handle longer ID strings, though I admit that's a stretch.
>
> Thanks,
> --Jacob

> commit efec9f040843d1de2fc52f5ce0d020478a5bc75d
> Author: Jacob Champion <pchampion@vmware.com>
> Date:   Mon Feb 28 10:28:51 2022 -0800
>
>     squash! Add API to retrieve authn_id from SQL

Bleh. :)  Squash indeed.

> Subject: [PATCH v4] Add API to retrieve authn_id from SQL
>
> The authn_id field in MyProcPort is currently only accessible to the
> backend itself.  Add a SQL function, pg_session_authn_id(), to expose
> the field to triggers that may want to make use of it.

Only did a quick look but generally looks reasonable to me.

Thanks,

Stephen

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Mon, 2022-02-28 at 16:00 -0500, Stephen Frost wrote:
> > commit efec9f040843d1de2fc52f5ce0d020478a5bc75d
> > Author: Jacob Champion <pchampion@vmware.com>
> > Date:   Mon Feb 28 10:28:51 2022 -0800
> > 
> >     squash! Add API to retrieve authn_id from SQL
> 
> Bleh. :)  Squash indeed.

Ha, I wasn't sure if anyone read the since-diffs :) I'll start
wordsmithing them more in the future.

> > Subject: [PATCH v4] Add API to retrieve authn_id from SQL
> > 
> > The authn_id field in MyProcPort is currently only accessible to the
> > backend itself.  Add a SQL function, pg_session_authn_id(), to expose
> > the field to triggers that may want to make use of it.
> 
> Only did a quick look but generally looks reasonable to me.

Thanks!

--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Mon, Feb 28, 2022 at 04:00:36PM -0500, Stephen Frost wrote:
> * Jacob Champion (pchampion@vmware.com) wrote:
>> On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
>>> Looks to me like authn_id isn't synchronized to parallel workers right now. So
>>> the function will return the wrong thing when executed as part of a parallel
>>> query.
>>
>> Thanks for the catch. It looks like MyProcPort is left empty, and other
>> functions that rely on like inet_server_addr() are marked parallel-
>> restricted, so I've done the same in v4.
>
> That's probably alright.

I'd say as well that this is right as-is.  If it happens that there is
a use-case for making this parallel aware in the future, it could be
done.  Now, it may be a bit weird to make parallel workers inherit the
authn ID of the parent as these did not go through authentication, no?
Letting this function being run only by the leader feels intuitive.

> Yeah, we really should make this available to trigger-based auditing
> systems too and not just through log_connections which involves a great
> deal more log parsing and work external to the database to figure out
> who did what.

Okay, I won't fight hard on that if all of you think that this is
useful for a given session.

>> Subject: [PATCH v4] Add API to retrieve authn_id from SQL
>>
>> The authn_id field in MyProcPort is currently only accessible to the
>> backend itself.  Add a SQL function, pg_session_authn_id(), to expose
>> the field to triggers that may want to make use of it.
>
> Only did a quick look but generally looks reasonable to me.

The function and the test are fine, pgperltidy complains a bit about
the format of the tests.

Ayway, this function needs to be documented.  I think that you should
just add that in "Session Information Functions" in func.sgml, same
area as current_user().  The last time we talked about the authn ID,
one thing we discussed about was how to describe that in a good way to
the user, which is why the section of log_connections was reworked a
bit.  And we don't have yet any references to what an authenticated
identity is in the docs.

There is no need to update catversion.h in the patch, committers
usually take care of that and that's an area of the code that
conflicts a lot.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Stephen Frost
Date:
Greetings,

* Michael Paquier (michael@paquier.xyz) wrote:
> On Mon, Feb 28, 2022 at 04:00:36PM -0500, Stephen Frost wrote:
> > * Jacob Champion (pchampion@vmware.com) wrote:
> >> On Fri, Feb 25, 2022 at 01:23:49PM -0800, Andres Freund wrote:
> >>> Looks to me like authn_id isn't synchronized to parallel workers right now. So
> >>> the function will return the wrong thing when executed as part of a parallel
> >>> query.
> >>
> >> Thanks for the catch. It looks like MyProcPort is left empty, and other
> >> functions that rely on like inet_server_addr() are marked parallel-
> >> restricted, so I've done the same in v4.
> >
> > That's probably alright.
>
> I'd say as well that this is right as-is.  If it happens that there is
> a use-case for making this parallel aware in the future, it could be
> done.  Now, it may be a bit weird to make parallel workers inherit the
> authn ID of the parent as these did not go through authentication, no?
> Letting this function being run only by the leader feels intuitive.

I'm not really sure why we're arguing about this, but clearly the authn
ID of the leader process is what should be used because that's the
authentication under which the parallel worker is running, just as much
as the effective role is the authorization.  Having this be available in
worker processes would certainly be good as it would allow more query
plans to be considered when these functions are used.  At this time, I
don't think that outweighs the complications around having it and I'm
not suggesting that Jacob needs to go do that, but surely it would be
better.

> >> Subject: [PATCH v4] Add API to retrieve authn_id from SQL
> >>
> >> The authn_id field in MyProcPort is currently only accessible to the
> >> backend itself.  Add a SQL function, pg_session_authn_id(), to expose
> >> the field to triggers that may want to make use of it.
> >
> > Only did a quick look but generally looks reasonable to me.
>
> The function and the test are fine, pgperltidy complains a bit about
> the format of the tests.
>
> Ayway, this function needs to be documented.  I think that you should
> just add that in "Session Information Functions" in func.sgml, same
> area as current_user().  The last time we talked about the authn ID,
> one thing we discussed about was how to describe that in a good way to
> the user, which is why the section of log_connections was reworked a
> bit.  And we don't have yet any references to what an authenticated
> identity is in the docs.

Agreed that it should be documented and that location seems reasonable
to me.

> There is no need to update catversion.h in the patch, committers
> usually take care of that and that's an area of the code that
> conflicts a lot.

Yeah, best to let committers handle catversion bumps.

Thanks,

Stephen

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Peter Eisentraut
Date:
On 25.02.22 21:19, Jacob Champion wrote:
> On Fri, 2022-02-25 at 16:28 +0000, Jacob Champion wrote:
>> Ha, opr_sanity caught my use of cstring. I'll work on a fix later
>> today.
> 
> Fixed in v3.

This patch contains no documentation.  I'm having a hard time 
understanding what the name "session_authn_id" is supposed to convey. 
The comment for the Port.authn_id field says this is the "system 
username", which sounds like a clearer terminology.



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, 2022-03-01 at 08:35 -0500, Stephen Frost wrote:
> * Michael Paquier (michael@paquier.xyz) wrote:
> > 
> > Ayway, this function needs to be documented.  I think that you should
> > just add that in "Session Information Functions" in func.sgml, same
> > area as current_user().  The last time we talked about the authn ID,
> > one thing we discussed about was how to describe that in a good way to
> > the user, which is why the section of log_connections was reworked a
> > bit.  And we don't have yet any references to what an authenticated
> > identity is in the docs.
> 
> Agreed that it should be documented and that location seems reasonable
> to me.

Added a first draft in v5, alongside the perltidy fixups mentioned by
Michael.

> > There is no need to update catversion.h in the patch, committers
> > usually take care of that and that's an area of the code that
> > conflicts a lot.
> 
> Yeah, best to let committers handle catversion bumps.

Heh, that was added for my benefit -- I was tired of forgetting to
initdb after switching dev branches -- but I've dropped it from the
patch and will just carry that diff locally.

Thanks,
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, 2022-03-01 at 19:56 +0100, Peter Eisentraut wrote:
> This patch contains no documentation.  I'm having a hard time 
> understanding what the name "session_authn_id" is supposed to convey. 
> The comment for the Port.authn_id field says this is the "system 
> username", which sounds like a clearer terminology.

"System username" may help from an internal development perspective,
especially as it relates to pg_ident.conf, but I don't think that's
likely to be a useful descriptor to an end user. (I don't think of a
client certificate's Subject Distinguished Name as a "system
username".) Does my attempt in v5 help?

--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Tue, Mar 01, 2022 at 10:03:20PM +0000, Jacob Champion wrote:
> Added a first draft in v5, alongside the perltidy fixups mentioned by
> Michael.

+    The authenticated identity is an immutable identifier for the user
+    presented during the connection handshake; the exact format depends on
+    the authentication method in use. (For example, when using the
+    <literal>scram-sha-256</literal> auth method, the authenticated identity
+    is simply the username. When using the <literal>cert</literal> auth
+    method, the authenticated identity is the Distinguished Name of the
+    client certificate.) Even for auth methods which use the username as
+    the authenticated identity, this function differs from
+    <literal>session_user</literal> in that its return value cannot be
+    changed after login.

That looks enough seen from here.  Thanks!

Nit: "auth method" would be a first in the documentation, so this had
better be "authentication method".  (No need to send an updated patch
just for that).

So, any comments and/or opinions from others?
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Peter Eisentraut
Date:
On 01.03.22 23:05, Jacob Champion wrote:
> On Tue, 2022-03-01 at 19:56 +0100, Peter Eisentraut wrote:
>> This patch contains no documentation.  I'm having a hard time
>> understanding what the name "session_authn_id" is supposed to convey.
>> The comment for the Port.authn_id field says this is the "system
>> username", which sounds like a clearer terminology.
> 
> "System username" may help from an internal development perspective,
> especially as it relates to pg_ident.conf, but I don't think that's
> likely to be a useful descriptor to an end user. (I don't think of a
> client certificate's Subject Distinguished Name as a "system
> username".) Does my attempt in v5 help?

Yeah, maybe there are better names.  But I have no idea what the letter 
combination "authn_id" is supposed to stand for.  Is it an 
"authentication identifier"?  What does it identify?  Maybe I'm missing 
something here, but I don't find it clear.



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-03-01 08:35:27 -0500, Stephen Frost wrote:
> I'm not really sure why we're arguing about this, but clearly the authn
> ID of the leader process is what should be used because that's the
> authentication under which the parallel worker is running, just as much
> as the effective role is the authorization.  Having this be available in
> worker processes would certainly be good as it would allow more query
> plans to be considered when these functions are used.  At this time, I
> don't think that outweighs the complications around having it and I'm
> not suggesting that Jacob needs to go do that, but surely it would be
> better.

I don't think we should commit this without synchronizing the authn between
worker / leader (in a separate commit). Too likely that some function that's
marked parallel ok queries the authn_id, opening up a security/monitoring hole
or such because of a bogus return value.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Wed, Mar 02, 2022 at 01:27:40PM -0800, Andres Freund wrote:
> I don't think we should commit this without synchronizing the authn between
> worker / leader (in a separate commit). Too likely that some function that's
> marked parallel ok queries the authn_id, opening up a security/monitoring hole
> or such because of a bogus return value.

Hmm, OK.  Using the same authn ID for the leader and the workers still
looks a bit strange to me as the worker is not the one that does the
authentication, only the leader does that.  Anyway, FixedParallelState
includes some authentication data passed down by the leader when
spawning a worker.  So, if we were to pass down the authn, we are
going to need a new PARALLEL_KEY_* to serialize and restore the data
passed down via a DSM like any other states as per the business in
parallel.c.  Jacob, what do you think?
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Wed, 2022-03-02 at 09:18 +0100, Peter Eisentraut wrote:
> On 01.03.22 23:05, Jacob Champion wrote:
> > On Tue, 2022-03-01 at 19:56 +0100, Peter Eisentraut wrote:
> > > This patch contains no documentation.  I'm having a hard time
> > > understanding what the name "session_authn_id" is supposed to convey.
> > > The comment for the Port.authn_id field says this is the "system
> > > username", which sounds like a clearer terminology.
> > 
> > "System username" may help from an internal development perspective,
> > especially as it relates to pg_ident.conf, but I don't think that's
> > likely to be a useful descriptor to an end user. (I don't think of a
> > client certificate's Subject Distinguished Name as a "system
> > username".) Does my attempt in v5 help?
> 
> Yeah, maybe there are better names.  But I have no idea what the letter 
> combination "authn_id" is supposed to stand for.  Is it an 
> "authentication identifier"? What does it identify?

Authenticated identity, but yeah, that's the gist. ("AuthN" being a
standard-ish way to differentiate authentication from "AuthZ"
authorization.)

It's meant to uniquely identify the end user in the case of usermaps,
where multiple separate entities might log in using the same role. It
is distinct from the authorized role name, though they might be exactly
the same in many common setups. And it's not set at all if no
authentication was done.

> Maybe I'm missing something here, but I don't find it clear.

I just used the internal name, but if we want to make it more clear
then now would be a good time. Do you have any suggestions? Does
expanding the name (pg_session_authenticated_id, or even
pg_session_authenticated_identity) help?

--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Thu, 2022-03-03 at 16:45 +0900, Michael Paquier wrote:
> Anyway, FixedParallelState
> includes some authentication data passed down by the leader when
> spawning a worker.  So, if we were to pass down the authn, we are
> going to need a new PARALLEL_KEY_* to serialize and restore the data
> passed down via a DSM like any other states as per the business in
> parallel.c.  Jacob, what do you think?

I guess it depends on what we want MyProcPort to look like in a
parallel worker. Are we comfortable having most of it be blank/useless?
Does it need to be filled in?

--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Thu, Mar 03, 2022 at 07:16:17PM +0000, Jacob Champion wrote:
> I guess it depends on what we want MyProcPort to look like in a
> parallel worker. Are we comfortable having most of it be blank/useless?
> Does it need to be filled in?

Good question.  It depends on the definition of how much
authentication information makes sense for the parallel workers to
inherit from the parent.  And as I mentioned upthread, this definition
is not completely clear to me because the parallel workers do *not* go
through the authentication paths of the parent, they are just spawned
in their own dedicated paths that the leader invokes.  Inheriting all
this information from the leader has also an impact on the
PgBackendStatus entries of the workers as these are reported in
pg_stat_activity as far as I recall, and it could be confusing to see,
for example, some SSL or some GSS information for automatically
spawned processes because these don't use SSL or GSS when they pass
back data to the leader.

I have been looking at the commit history, and found about 6b7d11f
that switched all the functions of sslinfo to be parallel-restricted
especially because of this.  So if we inherit all this information the
restriction on the sslinfo functions could be lifted, though the
interest is honestly limited in this case.

postgres_fdw has introduced recently the concept of cached
connections, as of v14 with 411ae64 and 708d165, with a set of
parallel-restricted functions.  Some of the code paths related to
appname look at MyProcPort, so there could be a risk of having some
inconsistent information if this is accessed in a parallel worker.
Looking at the code, I don't think that it would happen now but
copying some of the data of MyProcPort to the worker could avoid any
future issues if this code gets extended.

At the end of the day, Port is an interface used for the communication
between the postmaster with the frontends, so I'd like to say that it
is correct to not apply this concept to parallel workers because they
are not designed to contact any frontend-side things.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Fri, 2022-03-04 at 10:45 +0900, Michael Paquier wrote:
> At the end of the day, Port is an interface used for the communication
> between the postmaster with the frontends, so I'd like to say that it
> is correct to not apply this concept to parallel workers because they
> are not designed to contact any frontend-side things.

Coming back to this late, sorry. I'm not quite sure where to move with
this. I'm considering copying pieces of Port over just so we can see
what it looks like in practice?

Personally I think it makes sense for the parallel workers to have the
authn information for the client -- in fact there's a lot of
information that it seems shouldn't be hidden from them -- but there
are other pieces, like the socket handle, that are clearly not useful.

Thanks,
--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Tom Lane
Date:
Jacob Champion <pchampion@vmware.com> writes:
> On Fri, 2022-03-04 at 10:45 +0900, Michael Paquier wrote:
>> At the end of the day, Port is an interface used for the communication
>> between the postmaster with the frontends, so I'd like to say that it
>> is correct to not apply this concept to parallel workers because they
>> are not designed to contact any frontend-side things.

> Coming back to this late, sorry. I'm not quite sure where to move with
> this. I'm considering copying pieces of Port over just so we can see
> what it looks like in practice?

> Personally I think it makes sense for the parallel workers to have the
> authn information for the client -- in fact there's a lot of
> information that it seems shouldn't be hidden from them -- but there
> are other pieces, like the socket handle, that are clearly not useful.

Yeah.  It seems to me that putting the auth info into struct Port was
a fairly random thing to do in the first place, and we are now dealing
with the fallout of that.

I think what we ought to do here is separate out the data that we think
parallel workers need access to.  It does not seem wise to say "workers
can access fields A,B,C of MyPort but not fields X,Y,Z".  I do not have
a concrete proposal for fixing it though.

            regards, tom lane



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Thu, 2022-03-17 at 18:33 -0400, Tom Lane wrote:
> Yeah.  It seems to me that putting the auth info into struct Port was
> a fairly random thing to do in the first place, and we are now dealing
> with the fallout of that.
> 
> I think what we ought to do here is separate out the data that we think
> parallel workers need access to.  It does not seem wise to say "workers
> can access fields A,B,C of MyPort but not fields X,Y,Z".  I do not have
> a concrete proposal for fixing it though.

v6-0002 has my first attempt at this. I moved authn_id into its own
substruct inside Port, which gets serialized with the parallel key
machinery. (My name selection of "SharedPort" is pretty bland.)

Over time, we could move more fields into the shared struct and fill
out the serialization logic as needed, and then maybe eventually
SharedPort can be broken off into its own thing with its own
allocation. But I don't know if we should do it all at once, yet.

WDYT?

--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Tom Lane
Date:
Jacob Champion <pchampion@vmware.com> writes:
> On Thu, 2022-03-17 at 18:33 -0400, Tom Lane wrote:
>> I think what we ought to do here is separate out the data that we think
>> parallel workers need access to.  It does not seem wise to say "workers
>> can access fields A,B,C of MyPort but not fields X,Y,Z".  I do not have
>> a concrete proposal for fixing it though.

> v6-0002 has my first attempt at this. I moved authn_id into its own
> substruct inside Port, which gets serialized with the parallel key
> machinery. (My name selection of "SharedPort" is pretty bland.)

Hm.  I was more envisioning getting the "sharable" info out of Port
entirely, although I'm not quite sure where it should go instead.

            regards, tom lane



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Wed, 2022-03-23 at 19:00 -0400, Tom Lane wrote:
> Hm.  I was more envisioning getting the "sharable" info out of Port
> entirely, although I'm not quite sure where it should go instead.

If it helps, I can move the substruct out and up to a new global struct
(MyProcShared?). At this point I think it's mostly search-and-replace.

--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-03-23 23:06:14 +0000, Jacob Champion wrote:
> On Wed, 2022-03-23 at 19:00 -0400, Tom Lane wrote:
> > Hm.  I was more envisioning getting the "sharable" info out of Port
> > entirely, although I'm not quite sure where it should go instead.
> 
> If it helps, I can move the substruct out and up to a new global struct
> (MyProcShared?). At this point I think it's mostly search-and-replace.

Perhaps alongside CurrentUserId etc in miscinit.c?  It would be nicer if all
those were together in a struct, but oh well.

Another option would be to make it a GUC. With a bit of care it could be
automatically synced by the existing parallelism infrastructure...

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
> On 2022-03-23 23:06:14 +0000, Jacob Champion wrote:
> > On Wed, 2022-03-23 at 19:00 -0400, Tom Lane wrote:
> > > Hm.  I was more envisioning getting the "sharable" info out of Port
> > > entirely, although I'm not quite sure where it should go instead.
> > 
> > If it helps, I can move the substruct out and up to a new global struct
> > (MyProcShared?). At this point I think it's mostly search-and-replace.
> 
> Perhaps alongside CurrentUserId etc in miscinit.c?  It would be nicer if all
> those were together in a struct, but oh well.

Next draft in v7. My naming choices probably make even less sense now. Any ideas for names for "a bag of stuff that we
wantparallel workers to have too"?
 

> Another option would be to make it a GUC. With a bit of care it could be
> automatically synced by the existing parallelism infrastructure...

Like a write-once, PGC_INTERNAL setting? I guess I don't have any
intuition on how that would compare to the separate-global-and-accessor 
approach. Is the primary advantage that you don't have to maintain the
serialization logic, or is there more to it?

Thanks,
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Robert Haas
Date:
On Wed, Mar 2, 2022 at 4:27 PM Andres Freund <andres@anarazel.de> wrote:
> I don't think we should commit this without synchronizing the authn between
> worker / leader (in a separate commit). Too likely that some function that's
> marked parallel ok queries the authn_id, opening up a security/monitoring hole
> or such because of a bogus return value.

It is not free to copy data from the leader to the worker. I don't
think we should just adopt a policy of copying everything anyone
thinks of, because then most of the time we'll be copying a bunch of
stuff that really isn't needed.

My gut reaction is to think that this is way too marginal to be worth
making parallel-safe, but it is also possible that I just don't know
enough to understand its true value.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-03-24 13:55:29 -0400, Robert Haas wrote:
> On Wed, Mar 2, 2022 at 4:27 PM Andres Freund <andres@anarazel.de> wrote:
> > I don't think we should commit this without synchronizing the authn between
> > worker / leader (in a separate commit). Too likely that some function that's
> > marked parallel ok queries the authn_id, opening up a security/monitoring hole
> > or such because of a bogus return value.
>
> It is not free to copy data from the leader to the worker. I don't
> think we should just adopt a policy of copying everything anyone
> thinks of, because then most of the time we'll be copying a bunch of
> stuff that really isn't needed.

I agree.


> My gut reaction is to think that this is way too marginal to be worth
> making parallel-safe, but it is also possible that I just don't know
> enough to understand its true value.

My problem with that is that as far as I can tell the only real use of the
field / function is for stuff like audit logging, RLS etc. Which seems
problematic for two reasons:

1) It's likely that the call to the function is nested into other functions,
   "hiding" the parallel safety. Then it'd return bogus data silently. At
   the very least we need to make it error out if called in a parallel worker.
2) If used for the purposes above, there's basically no parallelism possible
   anymore.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Thu, Mar 24, 2022 at 05:44:06PM +0000, Jacob Champion wrote:
> On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
>> Another option would be to make it a GUC. With a bit of care it could be
>> automatically synced by the existing parallelism infrastructure...
>
> Like a write-once, PGC_INTERNAL setting? I guess I don't have any
> intuition on how that would compare to the separate-global-and-accessor
> approach. Is the primary advantage that you don't have to maintain the
> serialization logic, or is there more to it?

Hmm.  That would be a first for a GUC, no?  It is not seem natural
compared to the other information pieces passed down from the leader
to the workers.

+extern SharedPort MyProcShared;

This naming is interesting, and seems to be in line with a couple of
executor structures that share information across workers.  Still
that's a bit inconsistent as Shared is used once at the beginning and
once at the end?  I don't have a better idea on top of my mind.

Anyway, wouldn't it be better to reverse the patch order, introducing
the shared Proc information first and then build the parallel-safe
function on top of it?
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-03-26 15:18:59 +0900, Michael Paquier wrote:
> On Thu, Mar 24, 2022 at 05:44:06PM +0000, Jacob Champion wrote:
> > On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
> >> Another option would be to make it a GUC. With a bit of care it could be
> >> automatically synced by the existing parallelism infrastructure...
> >
> > Like a write-once, PGC_INTERNAL setting?

Perhaps PGC_INTERNAL, perhaps PGC_SU_BACKEND, set with PGC_S_OVERRIDE?


> > I guess I don't have any
> > intuition on how that would compare to the separate-global-and-accessor
> > approach. Is the primary advantage that you don't have to maintain the
> > serialization logic, or is there more to it?
>
> Hmm.  That would be a first for a GUC, no?  It is not seem natural
> compared to the other information pieces passed down from the leader
> to the workers.

What would be the first for a GUC? We have plenty GUCs that are set on a
per-connection basis to reflect some fact? And there's several authenitcation
related bits of state known to guc.c , think role, session_authorization,
is_superuser.

Sharing per-connection state via GUCs for paralellism? I don't think that is
true either. E.g. application_name, client_encoding.


> +extern SharedPort MyProcShared;

I strongly dislike MyProcShared. It's way too easily confused with MyProc
which point to shared memory.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2022-03-26 15:18:59 +0900, Michael Paquier wrote:
>> On Thu, Mar 24, 2022 at 05:44:06PM +0000, Jacob Champion wrote:
>>> On Wed, 2022-03-23 at 16:54 -0700, Andres Freund wrote:
>>>> Another option would be to make it a GUC. With a bit of care it could be
>>>> automatically synced by the existing parallelism infrastructure...

>>> Like a write-once, PGC_INTERNAL setting?

> Perhaps PGC_INTERNAL, perhaps PGC_SU_BACKEND, set with PGC_S_OVERRIDE?

Seems like making it a GUC does nothing to fix the complaint you had about
passing data to workers whether it's needed or not.  Sure, we don't then
need to write any new code for it, but it's still new cycles.  And it's
new cycles all throughout guc.c, too, not just in parallel worker start.

I also note that exposing it as a GUC means we have zero control over
who/what can read it.  Maybe that's not a problem, but it needs to be
thought about before we go down that path.

            regards, tom lane



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-03-26 13:57:39 -0400, Tom Lane wrote:
> Seems like making it a GUC does nothing to fix the complaint you had about
> passing data to workers whether it's needed or not.

I don't think that was my complaint. Maybe Robert's?


> Sure, we don't then need to write any new code for it, but it's still new
> cycles.

I think it'd quite possibly less cycles than separately syncing it.

Because I wanted to know what the overhead be in relation to other things, I
made serialize_variable() log whenever it decides to serialize, and it's a bit
depressing :/.

serialized DateStyle = ISO, MDY
serialized default_text_search_config = pg_catalog.english
serialized force_parallel_mode = on
serialized lc_messages = en_US.UTF-8
serialized lc_monetary = en_US.UTF-8
serialized lc_numeric = en_US.UTF-8
serialized lc_time = en_US.UTF-8
serialized log_checkpoints = true
serialized log_line_prefix = %m [%p][%b][%v:%x][%a]
serialized log_timezone = America/Los_Angeles
serialized max_stack_depth = 2048
serialized max_wal_size = 153600
serialized min_wal_size = 48
serialized restart_after_crash = true
serialized session_authorization = andres
serialized ssl_cert_file = /home/andres/tmp/pgdev/ssl-cert-snakeoil.pem
serialized ssl_key_file = /home/andres/tmp/pgdev/ssl-cert-snakeoil.key
serialized TimeZone = America/Los_Angeles
serialized timezone_abbreviations = Default
serialized track_io_timing = true
serialized transaction_deferrable = false
serialized transaction_isolation = read committed
serialized transaction_read_only = false
total serialized guc state is 1324

Of course, compared to the total size of 94784 bytes that's not too
much... FWIW, 65536 of that is for the tuple queues...


> I also note that exposing it as a GUC means we have zero control over
> who/what can read it.  Maybe that's not a problem, but it needs to be
> thought about before we go down that path.

Yes, I think that's a fair concern.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Sat, 2022-03-26 at 11:36 -0700, Andres Freund wrote:
> > I also note that exposing it as a GUC means we have zero control over
> > who/what can read it.  Maybe that's not a problem, but it needs to be
> > thought about before we go down that path.
> 
> Yes, I think that's a fair concern.

I like that there's no builtin way, today, for a superuser to modify
the internal value; it strengthens the use as an auditing tool. Moving
this to a PGC_SU_BACKEND GUC seems to weaken that. And it looks like
PGC_INTERNAL is skipped during the serialization, so if we chose that
option, we'd need to write new code anyway?

We'd also need to guess whether the GUC system's serialization of NULL
as an empty string is likely to cause problems for any future auth
methods. My guess is "no", to be honest, but I do like maintaining the
distinction -- it feels safer.

v8 rebases over the recent SSL changes to get the cfbot green again.

Thanks,
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andres Freund
Date:
Hi,

On 2022-03-29 23:38:29 +0000, Jacob Champion wrote:
> On Sat, 2022-03-26 at 11:36 -0700, Andres Freund wrote:
> > > I also note that exposing it as a GUC means we have zero control over
> > > who/what can read it.  Maybe that's not a problem, but it needs to be
> > > thought about before we go down that path.
> > 
> > Yes, I think that's a fair concern.
> 
> I like that there's no builtin way, today, for a superuser to modify
> the internal value; it strengthens the use as an auditing tool. Moving
> this to a PGC_SU_BACKEND GUC seems to weaken that. And it looks like
> PGC_INTERNAL is skipped during the serialization, so if we chose that
> option, we'd need to write new code anyway?

It'd be pretty simple to change can_skip_gucvar()'s selection of what to
sync. E.g. an additional flag like GUC_PARALLEL_SYNCHRONIZE.

I'm not convinced that a GUC is the answer, to be clear.


> We'd also need to guess whether the GUC system's serialization of NULL
> as an empty string is likely to cause problems for any future auth
> methods.

You can't represent a NULL in a postgres 'text' datum, independent of
parallelism. So the current definition of pg_session_authn_id() already
precludes that (and set_authn_id() and ...).  Honestly, I can't see a reason
why we should ever allow authn_id to contain a NULL byte.

Greetings,

Andres Freund



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, 2022-03-29 at 16:53 -0700, Andres Freund wrote:
> > We'd also need to guess whether the GUC system's serialization of NULL
> > as an empty string is likely to cause problems for any future auth
> > methods.
> 
> You can't represent a NULL in a postgres 'text' datum, independent of
> parallelism. So the current definition of pg_session_authn_id() already
> precludes that (and set_authn_id() and ...).  Honestly, I can't see a reason
> why we should ever allow authn_id to contain a NULL byte.

I don't mean a NULL byte, just a NULL pointer. This part of the
implementation doesn't distinguish between it and an empty string:

> /* NULL becomes empty string, see estimate_variable_size() */
> do_serialize(destptr, maxbytes, "%s",
>              *conf->variable ? *conf->variable : "");

Whether that's a problem in the future entirely depends on whether
there's some authentication method that considers the empty string a
sane and meaningful identity. We might reasonably decide that the
answer is "no", but I like being able to make that decision as opposed
to delegating it to an existing generic framework.

(That last point may be my core concern about making it a GUC: I'd like
us to have full control of how and where this particular piece of
information gets modified.)

Thanks,
--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, 2022-03-29 at 23:38 +0000, Jacob Champion wrote:
> v8 rebases over the recent SSL changes to get the cfbot green again.

I think the Windows failure [1] is unrelated to this patch, but for
posterity:

> [03:01:58.925] c:\cirrus>call "C:/Program Files/Git/usr/bin/timeout.exe" -v -k60s 15m perl
src/tools/msvc/vcregress.plrecoverycheck 
 
> [03:03:16.106] [03:03:16] t/001_stream_rep.pl .................. ok    76551 ms ( 0.00 usr  0.00 sys +  0.00 cusr
0.00csys =  0.00 CPU)
 
> [03:03:16.120] [03:03:16] t/002_archiving.pl ................... ok        0 ms ( 0.02 usr  0.00 sys +  0.00 cusr
0.00csys =  0.02 CPU)
 
> [03:03:16.128] [03:03:16] t/003_recovery_targets.pl ............ ok        0 ms ( 0.00 usr  0.00 sys +  0.00 cusr
0.00csys =  0.00 CPU)
 
> [03:03:16.138] [03:03:16] t/004_timeline_switch.pl ............. ok        0 ms ( 0.00 usr  0.00 sys +  0.00 cusr
0.00csys =  0.00 CPU)
 
> [03:03:16.141] [03:03:16] t/005_replay_delay.pl ................ ok        0 ms ( 0.00 usr  0.00 sys +  0.00 cusr
0.00csys =  0.00 CPU)
 
> [03:03:24.561] [03:03:24] t/006_logical_decoding.pl ............ ok     8416 ms ( 0.00 usr  0.00 sys +  0.00 cusr
0.00csys =  0.00 CPU)
 
> [03:03:32.496] [03:03:32] t/007_sync_rep.pl .................... ok     7895 ms ( 0.00 usr  0.00 sys +  0.00 cusr
0.00csys =  0.00 CPU)
 
> [03:03:32.496] [03:03:32] t/008_fsm_truncation.pl .............. ok        0 ms ( 0.00 usr  0.00 sys +  0.00 cusr
0.00csys =  0.00 CPU)
 
> [03:16:58.985] /usr/bin/timeout: sending signal TERM to command ‘perl’

The server and client logs don't quite match up; it looks like we get
partway through t/018_wal_optimize.pl, maybe to the end of the
`wal_level = minimal, SET TABLESPACE commit subtransaction` test,
before hanging.

I see that there's an active thread about a hang later in the recovery
suite [2]. That's suspicious since 019 is just the next test, but I
don't see any evidence in the logs that we actually started test 019 in
this run.

--Jacob

[1] https://cirrus-ci.com/task/5434752374145024
[2] https://www.postgresql.org/message-id/flat/83b46e5f-2a52-86aa-fa6c-8174908174b8%40iki.fi

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Wed, Mar 30, 2022 at 04:02:09PM +0000, Jacob Champion wrote:
> Whether that's a problem in the future entirely depends on whether
> there's some authentication method that considers the empty string a
> sane and meaningful identity. We might reasonably decide that the
> answer is "no", but I like being able to make that decision as opposed
> to delegating it to an existing generic framework.

My guess on the matter is that an empty authn holds the same meaning
as NULL because it has no data, but I can see your point as well to
make this distinction.  In order to do that, couldn't you just use
shm_toc_lookup(noError=true)?  PARALLEL_KEY_SHAREDPORT could be an
optional entry in the TOC data.

The name choice is still an issue, as per Andres' point that
MyProcShared is confusing as it can refer to shared memory.  What we
want is a structure name for something that's related to MyProc and
shared across all parallel workers including the leader.  I would
give up on the "Shared" part, using "Parallel" and "Info" instead.
Here are some ideas:
- ProcParallelInfo
- ProcInfoParallel
- ParallelProcInfo
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, 2022-04-05 at 15:13 +0900, Michael Paquier wrote:
> On Wed, Mar 30, 2022 at 04:02:09PM +0000, Jacob Champion wrote:
> > Whether that's a problem in the future entirely depends on whether
> > there's some authentication method that considers the empty string a
> > sane and meaningful identity. We might reasonably decide that the
> > answer is "no", but I like being able to make that decision as opposed
> > to delegating it to an existing generic framework.
> 
> My guess on the matter is that an empty authn holds the same meaning
> as NULL because it has no data,

Whether it holds meaning or not depends entirely on the auth method, I
think. Hypothetical example -- a system could accept client
certificates with an empty Subject. What identity that Subject
represents would depend on the organization, but it's distinct from
NULL/unauthenticated because the certificate is still signed by a CA.

(Postgres rejects empty Subjects when using clientname=DN and I'm not
proposing that we change that; I'm haven't actually checked that
they're RFC-legal. But it's possible that a future auth method could
have a reasonable standard definition for an empty identifier.)

> but I can see your point as well to
> make this distinction.  In order to do that, couldn't you just use
> shm_toc_lookup(noError=true)?  PARALLEL_KEY_SHAREDPORT could be an
> optional entry in the TOC data.

The current patch already handles NULL with a byte of overhead; is
there any advantage to using noError? (It might make things messier
once a second member gets added to the struct.) My concern was directed
at the GUC proposal.

> The name choice is still an issue, as per Andres' point that
> MyProcShared is confusing as it can refer to shared memory.  What we
> want is a structure name for something that's related to MyProc and
> shared across all parallel workers including the leader.  I would
> give up on the "Shared" part, using "Parallel" and "Info" instead.
> Here are some ideas:
> - ProcParallelInfo
> - ProcInfoParallel
> - ParallelProcInfo

I like ParallelProcInfo; it reads nicely. I've used that in v9.

Thanks!
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Tue, Apr 05, 2022 at 06:23:06PM +0000, Jacob Champion wrote:
> Whether it holds meaning or not depends entirely on the auth method, I
> think. Hypothetical example -- a system could accept client
> certificates with an empty Subject. What identity that Subject
> represents would depend on the organization, but it's distinct from
> NULL/unauthenticated because the certificate is still signed by a CA.

Interesting point.

> The current patch already handles NULL with a byte of overhead; is
> there any advantage to using noError? (It might make things messier
> once a second member gets added to the struct.) My concern was directed
> at the GUC proposal.

FWIW, I am a bit concerned by this approach because it feels
inconsistent with any other conditional fields passed down from the
parallel leader to its workers.  And if we need to add more fields to
ParallelProcInfo in the future, it will be cleaner to use different
TOC keys to pass down different fields anyway, no?
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Wed, 2022-04-06 at 20:09 +0900, Michael Paquier wrote:
> > The current patch already handles NULL with a byte of overhead; is
> > there any advantage to using noError? (It might make things messier
> > once a second member gets added to the struct.) My concern was directed
> > at the GUC proposal.
> 
> FWIW, I am a bit concerned by this approach because it feels
> inconsistent with any other conditional fields passed down from the
> parallel leader to its workers.  And if we need to add more fields to 
> ParallelProcInfo in the future, it will be cleaner to use different
> TOC keys to pass down different fields anyway, no?

I assumed that we would follow the existing model of "(de)serialize a
whole struct", rather than pulling it apart into many separate keys. If
it got too complicated then we could consider introducing a
SerializedParallelProcInfo struct like some of the other functions do.
Maybe that wouldn't work so well if many of the fields are strings?

Is there a significant cost to changing this later, if one approach
turns out to be wrong?

--Jacob

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Wed, Apr 06, 2022 at 07:16:43PM +0000, Jacob Champion wrote:
> I assumed that we would follow the existing model of "(de)serialize a
> whole struct", rather than pulling it apart into many separate keys. If
> it got too complicated then we could consider introducing a
> SerializedParallelProcInfo struct like some of the other functions do.
> Maybe that wouldn't work so well if many of the fields are strings?
>
> Is there a significant cost to changing this later, if one approach
> turns out to be wrong?

I don't think this is going to be an issue as long as we don't change
the definitions of MyParallelProcInfo, Port or PARALLEL_KEY_* in the
stable branch.  My guess is that we are fine to switch to one approach
or the other as this is just some internal communication logic between
the parallel leader and its workers.

What is the feeling of others about this patch and the introduction of
ParallelProcInfo (or ParallelPortInfo?) to store the authn coming from
Port?  The feature freeze is very close.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
v10 is rebased over latest; I've also added a PGDLLIMPORT to the new global.

--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Robert Haas
Date:
On Tue, May 31, 2022 at 6:21 PM Jacob Champion <jchampion@timescale.com> wrote:
> v10 is rebased over latest; I've also added a PGDLLIMPORT to the new global.

I took a quick look at this and it doesn't seem crazy to me, except
that I think ParallelProcInfo is a bad name for it. It's kind of
generic, because neither "proc" nor "info" means a whole lot. It's
also kind of wrong, because I think "parallel" should be things that
have to do with parallelism, not just things that happen to be
synchronized across processes when parallelism is in use. It doesn't
make sense to me to have something called a ParallelProcInfo that is
used for every single connection in the universe even if parallelism
is completely disabled on the system.

I'm not sure what it SHOULD be called, exactly: that's one of the hard
problems in computer science.[1]

-- 
Robert Haas
EDB: http://www.enterprisedb.com

[1] https://martinfowler.com/bliki/TwoHardThings.html



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Thu, Jun 2, 2022 at 6:52 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I'm not sure what it SHOULD be called, exactly: that's one of the hard
> problems in computer science.[1]

Yeah...

All right, here's the full list of previous suggestions, I think:

- SharedPort
- MyProcShared
- ProcParallelInfo
- ProcInfoParallel
- ParallelProcInfo
- ParallelPortInfo

I have a few new proposals:

- GlobalPortInfo
- GlobalConnInfo
- Synced[Port/Conn]Info
- Worker[Port/Conn]Info (but I think this suffers from the exact same
problem as Parallel)
- ThingsThatHappenToBeSynchronizedAcrossProcessesWhenParallelismIsInUse
- OrderImporterConsumerTemplateBeanFactory

I am struggling to come up with a single adjective that captures this
concept of sometimes-synchronized, that isn't conflicting with
existing uses (like Shared). Other suggestions are very welcome.

Thanks,
--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Thu, Jun 02, 2022 at 03:56:28PM -0700, Jacob Champion wrote:
> All right, here's the full list of previous suggestions, I think:
>
> - SharedPort
> - MyProcShared
> - ProcParallelInfo
> - ProcInfoParallel
> - ParallelProcInfo
> - ParallelPortInfo
>
> I have a few new proposals:
>
> - GlobalPortInfo
> - GlobalConnInfo
> - Synced[Port/Conn]Info
> - Worker[Port/Conn]Info (but I think this suffers from the exact same
> problem as Parallel)
> - ThingsThatHappenToBeSynchronizedAcrossProcessesWhenParallelismIsInUse
> - OrderImporterConsumerTemplateBeanFactory

ParallelPortInfo sounds kind of right for the job to me in this set of
proposals, as the data is from the Port, and that's some information
shared between all the parallel workers and the leader.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> ParallelPortInfo sounds kind of right for the job to me in this set of
> proposals, as the data is from the Port, and that's some information
> shared between all the parallel workers and the leader.

I agree with Robert's complaint that Parallel is far too generic
a term here.  Also, the fact that this data is currently in struct
Port seems like an artifact.

Don't we have a term for the set of processes comprising a leader
plus parallel workers?  If we called that set FooGroup, then
something like FooGroupSharedInfo would be on-point.

            regards, tom lane



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Fri, Jun 03, 2022 at 10:04:12AM -0400, Tom Lane wrote:
> I agree with Robert's complaint that Parallel is far too generic
> a term here.  Also, the fact that this data is currently in struct
> Port seems like an artifact.
>
> Don't we have a term for the set of processes comprising a leader
> plus parallel workers?  If we called that set FooGroup, then
> something like FooGroupSharedInfo would be on-point.

As far as I know, proc.h includes the term "group members", which
includes the leader and its workers (see CLOG and lock part)?
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Fri, Jun 3, 2022 at 7:36 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Fri, Jun 03, 2022 at 10:04:12AM -0400, Tom Lane wrote:
> > I agree with Robert's complaint that Parallel is far too generic
> > a term here.  Also, the fact that this data is currently in struct
> > Port seems like an artifact.
> >
> > Don't we have a term for the set of processes comprising a leader
> > plus parallel workers?  If we called that set FooGroup, then
> > something like FooGroupSharedInfo would be on-point.
>
> As far as I know, proc.h includes the term "group members", which
> includes the leader and its workers (see CLOG and lock part)?

lmgr/README also refers to "gangs of related processes" and "parallel
groups". So

- GroupSharedInfo
- ParallelGroupSharedInfo
- GangSharedInfo
- SharedLeaderInfo

?

--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Robert Haas
Date:
On Fri, Jun 3, 2022 at 10:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I agree with Robert's complaint that Parallel is far too generic
> a term here.  Also, the fact that this data is currently in struct
> Port seems like an artifact.

Why do we call this thing a Port, anyway?

I think I'd feel more comfortable here if we were defining what went
into which struct on some semantic basis rather than being like, OK,
so all the stuff we want to serialize goes into struct #1, and the
stuff we don't want to serialize goes into struct #2. I suppose if
it's just based on whether or not we want to serialize it, then the
placement of future additions will just be based on how people happen
to feel about the thing they're adding right at that moment, and there
won't be any consistency.

One could imagine dividing the Port struct into a couple of different
structs, e.g.

AuthenticationState: stuff that is needed only during authentication
and can be discarded thereafter (e.g. the HBA line, at least if the
comment is to be believed)
ClientCommunicationState: stuff that is used to communicate with the
client but doesn't need to be or can't be shared (e.g. the SSL object
itself)
ClientConnectionInfo: stuff that someone might want to look at for
information purposes at any time (e.g. authn_id, apparently)

Then we could serialize the third of these, keep the second around but
not serialize it, and free the first once connection setup is
complete.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Mon, Jun 6, 2022 at 11:44 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I think I'd feel more comfortable here if we were defining what went
> into which struct on some semantic basis rather than being like, OK,
> so all the stuff we want to serialize goes into struct #1, and the
> stuff we don't want to serialize goes into struct #2. I suppose if
> it's just based on whether or not we want to serialize it, then the
> placement of future additions will just be based on how people happen
> to feel about the thing they're adding right at that moment, and there
> won't be any consistency.

"This struct contains connection fields that are explicitly safe for
workers to access" _is_ a useful semantic, in my opinion. And it seems
like it'd make it easier to determine what needs to be included in the
struct; I'm not sure I follow why it would result in less consistency.

But to your suggestion, if we just called the new struct
"ClientConnectionInfo", would it be a useful step towards your
proposed three-bucket state? I guess I'm having trouble understanding
why a struct that is defined as "this stuff *doesn't* get serialized"
is materially different from having one that's the opposite.

Thanks,
--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Robert Haas
Date:
On Tue, Jun 7, 2022 at 6:54 PM Jacob Champion <jchampion@timescale.com> wrote:
> "This struct contains connection fields that are explicitly safe for
> workers to access" _is_ a useful semantic, in my opinion. And it seems
> like it'd make it easier to determine what needs to be included in the
> struct; I'm not sure I follow why it would result in less consistency.
>
> But to your suggestion, if we just called the new struct
> "ClientConnectionInfo", would it be a useful step towards your
> proposed three-bucket state? I guess I'm having trouble understanding
> why a struct that is defined as "this stuff *doesn't* get serialized"
> is materially different from having one that's the opposite.

Well, it isn't, and if my proposal boils down to that, which perhaps
it does, then my proposal isn't that great, honestly. Let me try again
to explain, though, and maybe it will seem less arbitrary with a
second explanation -- or maybe it won't.

If we say "this struct contains authentication-related information
that we got from the client and which functions may want to look at
later," then I feel like the chances are good that when someone adds a
new thing to the system in the future, they will know whether or not
that new thing falls into that category or not. If the definition of a
struct is "everything that should be serialized," then I feel like the
chances are less good that everyone will know whether a new thing they
are adding falls into that category or not. Perhaps that is
ill-founded, but I don't think "should be serialized" is necessarily
something that everybody is going to have the same view on, or even
know what it means.

Also, I don't think we want to end up with a situation where we have a
struct that contains wildly unrelated things that all need to be
serialized. If the definition of the struct is "stuff we should
serialize and send to the worker," well then maybe the transaction
snapshot ought to go in there! Well, no. I mean, we already have a
separate place for that, but suppose somehow we didn't. It doesn't
belong here, because yes the things in this struct get serialized, but
it's not only any old thing that needs serializing, it's more specific
than that.

I guess what this boils down to is that I really want this thing to
have a meaningful name by means of which a future developer can make a
guess as to whether some field they're adding ought to go in there. I
theorize that SharedPort is not too great because (a) Port is already
a bad name and (b) how am I supposed to know whether my stuff ought to
be shared or not? I like something like ClientConnectionInfo better
because it seems to describe what the stuff in the struct *is* rather
than what we *do* with it.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, Jun 7, 2022 at 5:45 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Perhaps that is
> ill-founded, but I don't think "should be serialized" is necessarily
> something that everybody is going to have the same view on, or even
> know what it means.

Using this thread as an example, once it was decided that the parallel
workers needed the additional info, the need for serialization
followed directly from that. I don't get the feeling that developers
are going to jump through the hoops of writing serialization logic for
a new field in the struct just by accident; they should know why
they're writing that code, and hopefully it would be easy for
reviewers to catch a patch that showed up with pointless
serialization.

> Also, I don't think we want to end up with a situation where we have a
> struct that contains wildly unrelated things that all need to be
> serialized. If the definition of the struct is "stuff we should
> serialize and send to the worker," well then maybe the transaction
> snapshot ought to go in there! Well, no. I mean, we already have a
> separate place for that, but suppose somehow we didn't. It doesn't
> belong here, because yes the things in this struct get serialized, but
> it's not only any old thing that needs serializing, it's more specific
> than that.

I completely agree with you here -- the name should not be so generic
that it's just a catch-all for any serialized fields that exist.

> I guess what this boils down to is that I really want this thing to
> have a meaningful name by means of which a future developer can make a
> guess as to whether some field they're adding ought to go in there. I
> theorize that SharedPort is not too great because (a) Port is already
> a bad name and (b) how am I supposed to know whether my stuff ought to
> be shared or not? I like something like ClientConnectionInfo better
> because it seems to describe what the stuff in the struct *is* rather
> than what we *do* with it.

I think having both would be useful in this case -- what the stuff is,
so that it's clear what doesn't belong in it, and what we do with it,
so it's clear that you have to write serialization code if you add new
things. The nature of the struct is such that I think you _have_ to
figure out whether or not your stuff ought to be shared before you
have any business adding it.

But I don't have any better ideas for how to achieve both. I'm fine
with your suggestion of ClientConnectionInfo, if that sounds good to
others; the doc comment can clarify why it differs from Port? Or add
one of the Shared-/Gang-/Group- prefixes to it, maybe?

Thanks,
--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Robert Haas
Date:
On Wed, Jun 8, 2022 at 7:53 PM Jacob Champion <jchampion@timescale.com> wrote:
> But I don't have any better ideas for how to achieve both. I'm fine
> with your suggestion of ClientConnectionInfo, if that sounds good to
> others; the doc comment can clarify why it differs from Port? Or add
> one of the Shared-/Gang-/Group- prefixes to it, maybe?

I don't like the prefixes, so I'd prefer explaining it in the struct comment.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Thu, Jun 9, 2022 at 6:23 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Jun 8, 2022 at 7:53 PM Jacob Champion <jchampion@timescale.com> wrote:
> > But I don't have any better ideas for how to achieve both. I'm fine
> > with your suggestion of ClientConnectionInfo, if that sounds good to
> > others; the doc comment can clarify why it differs from Port? Or add
> > one of the Shared-/Gang-/Group- prefixes to it, maybe?
>
> I don't like the prefixes, so I'd prefer explaining it in the struct comment.

Done that way in v11.

Thanks!
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 6/10/22 7:58 PM, Jacob Champion wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you
canconfirm the sender and know the content is safe.
 
>
>
>
> On Thu, Jun 9, 2022 at 6:23 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Jun 8, 2022 at 7:53 PM Jacob Champion <jchampion@timescale.com> wrote:
>>> But I don't have any better ideas for how to achieve both. I'm fine
>>> with your suggestion of ClientConnectionInfo, if that sounds good to
>>> others; the doc comment can clarify why it differs from Port? Or add
>>> one of the Shared-/Gang-/Group- prefixes to it, maybe?
>> I don't like the prefixes, so I'd prefer explaining it in the struct comment.
> Done that way in v11.
>
> Thanks!
> --Jacob

FWIW, I just created a new thread to expose the port->authn_id through 
the SYSTEM_USER sql reserved word.

Regards,

Bertrand





Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On 6/22/22 06:31, Drouvot, Bertrand wrote:
> FWIW, I just created a new thread to expose the port->authn_id through 
> the SYSTEM_USER sql reserved word.

Review for both seems to have dried up a bit. I'm not particularly
invested in my code, but I do want to see *a* solution go in. So if it
helps the review momentum for me to withdraw this patch and put my
effort into SYSTEM_USER, I can do that no problem.

Thoughts from prior reviewers? Is SYSTEM_USER the way to go?

--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/2/22 11:57 PM, Jacob Champion wrote:
> On 6/22/22 06:31, Drouvot, Bertrand wrote:
>> FWIW, I just created a new thread to expose the port->authn_id through
>> the SYSTEM_USER sql reserved word.
> Review for both seems to have dried up a bit. I'm not particularly
> invested in my code, but I do want to see *a* solution go in. So if it
> helps the review momentum for me to withdraw this patch and put my
> effort into SYSTEM_USER, I can do that no problem.
>
> Thoughts from prior reviewers? Is SYSTEM_USER the way to go?

I did not look in detail to this thread, but if the goal is "only" to 
expose authn_id (as the subject describes) then it seems to me that 
SYSTEM_USER [1] is the way to go.

[1]: https://commitfest.postgresql.org/39/3703/

-- 
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Fri, Aug 05, 2022 at 12:48:33PM +0200, Drouvot, Bertrand wrote:
> On 8/2/22 11:57 PM, Jacob Champion wrote:
>> Thoughts from prior reviewers? Is SYSTEM_USER the way to go?

Reading through the other thread, there is a clear parallel between
both in concept to provide this information at SQL level, indeed.

> I did not look in detail to this thread, but if the goal is "only" to expose
> authn_id (as the subject describes) then it seems to me that SYSTEM_USER [1]
> is the way to go.
>
> [1]: https://commitfest.postgresql.org/39/3703/

However, I am not sure if the suggestion of auth_method:authn as
output generated by SYSTEM_USER would be correct according to the SQL
specification, either.  The spec being not really talkative about the
details of what an external module should be opens up for a lot of
interpretation, something that both thread are dealing with.

Anyway, we are talking about two different problems on this thread:
1) Provide the authn/SYSTEM_USER through some SQL interface, which is
what 0001 is an attempt of, SYSTEM_USER is a different attempt.
2) Move authn out of Port into its own structure, named
ClientConnectionInfo, and pass it down to the parallel workers.

SYSTEM_USER overlaps with 0001, but I see no reason to not do 0002 in
all cases.  Even if we are not sure yet of how to expose that at SQL
level, we still want to pass down this information to the parallel
workers and we still want to not have that in Port.  An extension
could also do easily the job once 0002 is done, so I see a good
argument about doing it anyway.  The name ClientConnectionInfo from
Robert looks like the compromise we have for the new structure holding
the information about the client information passed down to the
workers.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Joe Conway
Date:
On 8/6/22 02:26, Michael Paquier wrote:
> On Fri, Aug 05, 2022 at 12:48:33PM +0200, Drouvot, Bertrand wrote:
>> On 8/2/22 11:57 PM, Jacob Champion wrote:
>>> Thoughts from prior reviewers? Is SYSTEM_USER the way to go?
> 
> Reading through the other thread, there is a clear parallel between
> both in concept to provide this information at SQL level, indeed.
> 
>> I did not look in detail to this thread, but if the goal is "only" to expose
>> authn_id (as the subject describes) then it seems to me that SYSTEM_USER [1]
>> is the way to go.
>> 
>> [1]: https://commitfest.postgresql.org/39/3703/
> 
> However, I am not sure if the suggestion of auth_method:authn as
> output generated by SYSTEM_USER would be correct according to the SQL
> specification, either.  The spec being not really talkative about the
> details of what an external module should be opens up for a lot of
> interpretation, something that both thread are dealing with.

As I pointed out here [ 
https://www.postgresql.org/message-id/28b4a9ef-5103-f117-99e1-99ae5a86a6e8%40joeconway.com 
] both the SQL Server and Oracle interpretations are similar to the one 
provided by Bertrand's patch:

SQL Server:
   "If the current user is logged in to SQL Server by
    using Windows Authentication, SYSTEM_USER returns the
    Windows login identification name in the form:
    DOMAIN\user_login_name. However, if the current user
    is logged in to SQL Server by using SQL Server
    Authentication, SYSTEM_USER returns the SQL Server
    login identification name"

Oracle:
   "SYSTEM_USER
    Returns the name of the current data store user as
    identified by the operating system."

I am not sure how else we should interpret SYSTEM_USER -- if it isn't 
port->authn_id what else would you propose it should be?

-- 
Joe Conway
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Sat, Aug 06, 2022 at 10:59:26AM -0400, Joe Conway wrote:
> I am not sure how else we should interpret SYSTEM_USER -- if it isn't
> port->authn_id what else would you propose it should be?

What you say sounds rather right, but I was wondering mainly what
Oracle and SQL server report when it comes to other authentication
methods like SSPI or a cert, where we don't use a user name but some
data dependent on the auth method.  And I have no experience with
these.

Anyway, I was looking at Bertrand's patch, and I can see that it is
doing nothing to move away the connection information that we have in
Port away to a different structure passed down to the parallel
workers, which is what I understand is a cleanup worth on its own
based on the discussion of this thread.  Hence, I still see a good
argument for the introduction of ClientConnectionInfo that gets passed
down to the workers.  Based on that, I think that we'd better finish
v11-0002 (only ClientConnectionInfo, no SQL interface) as a first step
to build for the next ones, with authn being the first piece of
information given to the workers.  With a separate structure, the
auth_method can also be a second member in ClientConnectionInfo,
completing what would be needed to build SYSTEM_USER as the workers
would have access to it.

Am I getting that right?
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/7/22 9:41 AM, Michael Paquier wrote:
> Anyway, I was looking at Bertrand's patch, and I can see that it is
> doing nothing to move away the connection information that we have in
> Port away to a different structure passed down to the parallel
> workers,

Thanks for looking at it!

That's right. The main reason is that in the v2-0003 SYSTEM_USER patch 
what is passed down to the parallel workers is not Port->authn_id but a 
new "SystemUser" (defined in miscinit.c with CurrentUserId and friends).

>   which is what I understand is a cleanup worth on its own
> based on the discussion of this thread.  Hence, I still see a good
> argument for the introduction of ClientConnectionInfo that gets passed
> down to the workers.

I agree that it could it be useful too.

> Based on that, I think that we'd better finish
> v11-0002 (only ClientConnectionInfo, no SQL interface)

I agree.

>   as a first step
> to build for the next ones, with authn being the first piece of
> information given to the workers.  With a separate structure, the
> auth_method can also be a second member in ClientConnectionInfo,
> completing what would be needed to build SYSTEM_USER as the workers
> would have access to it.

but I'm not sure we should do it as a first step (given the fact that 
this is not Port->authn_id that is passed down to the parallel workers 
in the SYSTEM_USER patch).

What do you think about working on both (aka a) v11-002 only 
ClientConnectionInfo and b) SYSTEM_USER) in parallel?

Thanks

-- 
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Mon, Aug 08, 2022 at 12:43:14PM +0200, Drouvot, Bertrand wrote:
> but I'm not sure we should do it as a first step (given the fact that this
> is not Port->authn_id that is passed down to the parallel workers in the
> SYSTEM_USER patch).
>
> What do you think about working on both (aka a) v11-002 only
> ClientConnectionInfo and b) SYSTEM_USER) in parallel?

It seems to me that completing ClientConnectionInfo first has the
advantage of not having to tweak twice the interface we are going to
use when passing down the full structure to the workers, so I would
choose for doing it first (with one field for the authn, and a second
field for the auth method so as the the workers can build SYSTEM_USER
by themselves when required).
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/9/22 11:17 AM, Michael Paquier wrote:
> On Mon, Aug 08, 2022 at 12:43:14PM +0200, Drouvot, Bertrand wrote:
>> but I'm not sure we should do it as a first step (given the fact that this
>> is not Port->authn_id that is passed down to the parallel workers in the
>> SYSTEM_USER patch).
>>
>> What do you think about working on both (aka a) v11-002 only
>> ClientConnectionInfo and b) SYSTEM_USER) in parallel?
> It seems to me that completing ClientConnectionInfo first has the
> advantage of not having to tweak twice the interface we are going to
> use when passing down the full structure to the workers, so I would
> choose for doing it first (with one field for the authn, and a second
> field for the auth method so as the the workers can build SYSTEM_USER
> by themselves when required).

Yeah fair point.

Agree that it makes sense to work on those patches in this particular 
order then.

Thanks,

-- 

Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, Aug 9, 2022 at 3:39 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
> Agree that it makes sense to work on those patches in this particular
> order then.

Sounds good. The ClientConnectionInfo patch (previously 0002) is
attached, with the SQL function removed.

Thanks,
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/10/22 5:09 PM, Jacob Champion wrote:
> On Tue, Aug 9, 2022 at 3:39 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>> Agree that it makes sense to work on those patches in this particular
>> order then.
> Sounds good. The ClientConnectionInfo patch (previously 0002) is
> attached, with the SQL function removed.

Thanks for the patch!

Looking at:

+typedef struct
+{
+    /*
+     * Authenticated identity.  The meaning of this identifier is 
dependent on
+     * hba->auth_method; it is the identity (if any) that the user 
presented
+     * during the authentication cycle, before they were assigned a 
database
+     * role.  (It is effectively the "SYSTEM-USERNAME" of a pg_ident 
usermap
+     * -- though the exact string in use may be different, depending on 
pg_hba
+     * options.)
+     *
+     * authn_id is NULL if the user has not actually been 
authenticated, for
+     * example if the "trust" auth method is in use.
+     */
+    const char *authn_id;
+} ClientConnectionInfo;

What do you think about adding a second field in ClientConnectionInfo 
for the auth method (as suggested by Michael upthread)?

That will be needed by the SYSTEM_USER patch (that its current version 
implements as "auth_method:identity").

Thanks,

-- 
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Wed, Aug 10, 2022 at 10:48 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
> What do you think about adding a second field in ClientConnectionInfo
> for the auth method (as suggested by Michael upthread)?

Sure -- without a followup patch, it's not really tested, though.

v2 adjusts set_authn_id() to copy the auth_method over as well. It
"passes tests" but is otherwise unexercised.

Thanks,
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/12/22 12:28 AM, Jacob Champion wrote:
> On Wed, Aug 10, 2022 at 10:48 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
>> What do you think about adding a second field in ClientConnectionInfo
>> for the auth method (as suggested by Michael upthread)?
> Sure -- without a followup patch, it's not really tested, though.
>
> v2 adjusts set_authn_id() to copy the auth_method over as well. It
> "passes tests" but is otherwise unexercised.

Thank you!

To help with the testing I've just provided a new version (aka 
v2-0004-system_user-implementation.patch) of the SYSTEM_USER patch in 
[1] that can be applied on top of 
"v2-0001-Allow-parallel-workers-to-read-authn_id.patch".

But for this to work, the first comment below on your patch needs to be 
addressed.

Once the first comment is addressed and the new SYSTEM_USER patch 
applied (that adds new tap tests) then we can test the propagation to 
the parallel workers with:

make -C src/test/kerberos check PROVE_TESTS=t/001_auth.pl PROVE_FLAGS=-v

and

make -C src/test/authentication check PROVE_TESTS=t/001_password.pl 
PROVE_FLAGS=-v

Both are currently successful.

Regarding the comments on 
v2-0001-Allow-parallel-workers-to-read-authn_id.patch:

1)

This is the one to be applied before adding the new SYSTEM_USER one on top:

+typedef struct
+{
+       /*
+        * Authenticated identity.  The meaning of this identifier is 
dependent on

has to be replaced by:

+typedef struct ClientConnectionInfo
+{
+       /*
+        * Authenticated identity.  The meaning of this identifier is 
dependent on

2)

+        * Authenticated identity.  The meaning of this identifier is 
dependent on

There is one extra space before "The"

3)

+SerializeClientConnectionInfo(Size maxsize, char *start_address)
+{
+       /*
+        * First byte is an indication of whether or not authn_id has 
been set to
+        * non-NULL, to differentiate that case from the empty string.
+        */

is authn_id being an empty string possible?

4)

+ */
+
+ClientConnectionInfo MyClientConnectionInfo;
+
+/*
+ * Calculate the space needed to serialize MyClientConnectionInfo.
+ */
+Size
+EstimateClientConnectionInfoSpace(void)

 From a coding style point of view, shouldn't "ClientConnectionInfo 
MyClientConnectionInfo;" be moved to the top of the file?


[1]: https://commitfest.postgresql.org/39/3703/

Regards,

-- 
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Fri, Aug 12, 2022 at 03:34:04PM +0200, Drouvot, Bertrand wrote:
> 3)
>
> +SerializeClientConnectionInfo(Size maxsize, char *start_address)
> +{
> +       /*
> +        * First byte is an indication of whether or not authn_id has been
> set to
> +        * non-NULL, to differentiate that case from the empty string.
> +        */
>
> is authn_id being an empty string possible?

I don't recall that this can be the case yet, but we cannot discard
it.  One thing was itching me about the serialization and
deserialization logic though: could it be more readable if we used an
intermediate structure to store the length of the serialized strings?
We use this approach in other areas, like for the snapshot data in
snapmgr.c.  This would handle the case of an empty and NULL string, by
storing -1 as length for NULL and >= 0 for the string length if there
is something set, while making the addition of more fields a
no-brainer.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/14/22 11:57 AM, Michael Paquier wrote:
> On Fri, Aug 12, 2022 at 03:34:04PM +0200, Drouvot, Bertrand wrote:
>> 3)
>>
>> +SerializeClientConnectionInfo(Size maxsize, char *start_address)
>> +{
>> +       /*
>> +        * First byte is an indication of whether or not authn_id has been
>> set to
>> +        * non-NULL, to differentiate that case from the empty string.
>> +        */
>>
>> is authn_id being an empty string possible?
> I don't recall that this can be the case yet, but we cannot discard
> it.
Fair point.
>    One thing was itching me about the serialization and
> deserialization logic though: could it be more readable if we used an
> intermediate structure to store the length of the serialized strings?
> We use this approach in other areas, like for the snapshot data in
> snapmgr.c.  This would handle the case of an empty and NULL string, by
> storing -1 as length for NULL and >= 0 for the string length if there
> is something set, while making the addition of more fields a
> no-brainer.

I think that's a good idea and I think that would be more readable (as 
compare to storing a "hint" in the first byte).

Regards,

-- 
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
Hello,

On Fri, Aug 12, 2022 at 6:34 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
> +typedef struct
> +{
> +       /*
> +        * Authenticated identity.  The meaning of this identifier is
> dependent on
>
> has to be replaced by:
>
> +typedef struct ClientConnectionInfo
> +{
> +       /*
> +        * Authenticated identity.  The meaning of this identifier is
> dependent on

Okay, will do in the next patch (coming soon).

> +        * Authenticated identity.  The meaning of this identifier is
> dependent on
>
> There is one extra space before "The"

This comment block was just moved verbatim; the double-spaced
sentences were there before.

>  From a coding style point of view, shouldn't "ClientConnectionInfo
> MyClientConnectionInfo;" be moved to the top of the file?

The style in this file seems to be to declare the variables at the top
of the section in which they're used. See the sections for "User ID
state" and "Library preload support".

Thanks,
--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Tue, Aug 16, 2022 at 2:02 AM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
> On 8/14/22 11:57 AM, Michael Paquier wrote:
> >    One thing was itching me about the serialization and
> > deserialization logic though: could it be more readable if we used an
> > intermediate structure to store the length of the serialized strings?
> > We use this approach in other areas, like for the snapshot data in
> > snapmgr.c.  This would handle the case of an empty and NULL string, by
> > storing -1 as length for NULL and >= 0 for the string length if there
> > is something set, while making the addition of more fields a
> > no-brainer.
>
> I think that's a good idea and I think that would be more readable (as
> compare to storing a "hint" in the first byte).

Sounds good. v3, attached, should make the requested changes:
- declare `struct ClientConnectionInfo`
- use an intermediate serialization struct
- switch to length-"prefixing" for the string

I do like the way this reads compared to before.

Thanks,
--Jacob

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/16/22 6:58 PM, Jacob Champion wrote:
> Sounds good. v3, attached, should make the requested changes:
> - declare `struct ClientConnectionInfo`
> - use an intermediate serialization struct
> - switch to length-"prefixing" for the string
>
> I do like the way this reads compared to before.

Thanks for the new version!

+       /* Copy authn_id into the space after the struct. */
+       if (serialized.authn_id_len >= 0)

Maybe remove the "." at the end of the comment? (to be consistent with 
the other comment just above)

+/*
+ * Restore MyClientConnectionInfo from its serialized representation.
+ */
+void
+RestoreClientConnectionInfo(char *conninfo)
+{
+       SerializedClientConnectionInfo serialized;
+       char       *authn_id;

Move "char       *authn_id;" in the "if (serialized.authn_id_len >= 0)" 
below?

+
+       memcpy(&serialized, conninfo, sizeof(serialized));
+       authn_id = conninfo + sizeof(serialized);

Move "authn_id = conninfo + sizeof(serialized)" in the "if 
(serialized.authn_id_len >= 0)" below?

+
+       /* Copy the fields back into place. */

Remove the "." at the end of the comment?

+       MyClientConnectionInfo.authn_id = NULL;
+       MyClientConnectionInfo.auth_method = serialized.auth_method;
+
+       if (serialized.authn_id_len >= 0)
+               MyClientConnectionInfo.authn_id = 
MemoryContextStrdup(TopMemoryContext,
+ authn_id);

This instead?

     if (serialized.authn_id_len >= 0)
     {
         char       *authn_id;
         authn_id = conninfo + sizeof(serialized);
         MyClientConnectionInfo.authn_id = 
MemoryContextStrdup(TopMemoryContext,
authn_id);
     }

+ src/backend/utils/init/miscinit.c:RestoreClientConnectionInfo(char 
*conninfo)
+ src/include/miscadmin.h:extern void RestoreClientConnectionInfo(char 
*procinfo);

conninfo in both to be consistent?

Apart from the comments above, that looks good to me.

Regards,

-- 
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Wed, Aug 17, 2022 at 09:53:45AM +0200, Drouvot, Bertrand wrote:
> Thanks for the new version!
>
> +       /* Copy authn_id into the space after the struct. */
> +       if (serialized.authn_id_len >= 0)
>
> Maybe remove the "." at the end of the comment? (to be consistent with the
> other comment just above)

When it comes to such things, I usually apply the rule of consistency
with the surroundings, which sounds right here.

> +       memcpy(&serialized, conninfo, sizeof(serialized));
> +       authn_id = conninfo + sizeof(serialized);
>
> Move "authn_id = conninfo + sizeof(serialized)" in the "if
> (serialized.authn_id_len >= 0)" below?

Makes sense, so as never have something pointing to an area should
should not look at.  This should just be used when we know that there
is going to be a string.

> + src/backend/utils/init/miscinit.c:RestoreClientConnectionInfo(char
> *conninfo)
> + src/include/miscadmin.h:extern void RestoreClientConnectionInfo(char
> *procinfo);
>
> conninfo in both to be consistent?

Yep.  Looks like a copy-pasto, seen from here.

By the way, I have looked at the patch, tweaked a couple of things
with comments and the style, but overval that's fine.  First, I have
intended to apply this stuff today but I have lacked the time to do
so.  I should be able to get this wrapped tomorrow, though.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On Mon, Aug 22, 2022 at 4:32 AM Michael Paquier <michael@paquier.xyz> wrote:
> By the way, I have looked at the patch, tweaked a couple of things
> with comments and the style, but overval that's fine.  First, I have
> intended to apply this stuff today but I have lacked the time to do
> so.  I should be able to get this wrapped tomorrow, though.

Thank you both for the reviews! Let me know if it would help for me to
issue a new patchset, otherwise I'll sit tight.

--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Mon, Aug 22, 2022 at 08:10:10AM -0700, Jacob Champion wrote:
> otherwise I'll sit tight.

So am I.  I have done an extra round of checks around the
serialization/deserialization logic where I put some elog()'s to look
at the output passed down with some workers and a couple of auth
methods, and after an indentation and some comment polishing I finish
with the attached.

There was one thing that annoyed me with the patch, though, as of the
lack of initialization of MyClientConnectionInfo at backend startup,
as we may finish by not calling set_authn() to fill in some of its
data, so I have placed an extra memset(0) in InitProcessGlobals()
(note that Port does a calloc() much earlier than that, but I think
that we don't really want to do more in such code paths, especially
for the parallelized client information).

I have written a commit message, while on it.  Does that look fine to
you?
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
"Drouvot, Bertrand"
Date:
Hi,

On 8/23/22 4:25 AM, Michael Paquier wrote:
> On Mon, Aug 22, 2022 at 08:10:10AM -0700, Jacob Champion wrote:
>> otherwise I'll sit tight.
> So am I.  I have done an extra round of checks around the
> serialization/deserialization logic where I put some elog()'s to look
> at the output passed down with some workers and a couple of auth
> methods, and after an indentation and some comment polishing I finish
> with the attached.
>
> There was one thing that annoyed me with the patch, though, as of the
> lack of initialization of MyClientConnectionInfo at backend startup,
> as we may finish by not calling set_authn() to fill in some of its
> data, so I have placed an extra memset(0) in InitProcessGlobals()
Fair point.
> (note that Port does a calloc() much earlier than that, but I think
> that we don't really want to do more in such code paths, especially
> for the parallelized client information).
>
> I have written a commit message, while on it.  Does that look fine to
> you?
Thanks!

That sounds all good to me, except a typo for the author in the commit 
message: s/Jocob/Jacob/

Regards,

-- 
Bertrand Drouvot
Amazon Web Services: https://aws.amazon.com




Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Jacob Champion
Date:
On 8/23/22 01:53, Drouvot, Bertrand wrote:
> That sounds all good to me, except a typo for the author in the commit 
> message: s/Jocob/Jacob/

Thanks, I missed that on my readthrough! :D

Patch looks good to me, too, with one question:

> @@ -2688,6 +2689,7 @@ InitProcessGlobals(void)                                   
>     MyProcPid = getpid();                                                        
>     MyStartTimestamp = GetCurrentTimestamp();                                    
>     MyStartTime = timestamptz_to_time_t(MyStartTimestamp);                       
> +   memset(&MyClientConnectionInfo, 0, sizeof(MyClientConnectionInfo));          
>                                                                                  
>     /*                                                                           
>      * Set a different global seed in every process.  We want something 

When can we rely on static initialization, and when can't we? Is there a
concern that the memory could have been polluted from before the
postmaster's fork?

Thanks,
--Jacob



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Michael Paquier
Date:
On Tue, Aug 23, 2022 at 10:04:30AM -0700, Jacob Champion wrote:
> On 8/23/22 01:53, Drouvot, Bertrand wrote:
>> @@ -2688,6 +2689,7 @@ InitProcessGlobals(void)
>>     MyProcPid = getpid();
>>     MyStartTimestamp = GetCurrentTimestamp();
>>     MyStartTime = timestamptz_to_time_t(MyStartTimestamp);
>> +   memset(&MyClientConnectionInfo, 0, sizeof(MyClientConnectionInfo));
>>
>>     /*
>>      * Set a different global seed in every process.  We want something
>
> When can we rely on static initialization, and when can't we? Is there a
> concern that the memory could have been polluted from before the
> postmaster's fork?

My main worry here is EXEC_BACKEND, where we would just use our own
implementation of fork(), and it is a bad idea at the end to leave
that untouched while we could have code paths that attempt to access
it.  At the end, I have moved the initialization at the same place as
where we set MyProcPort for a backend in BackendInitialize(), mainly
as a matter of consistency because ClientConnectionInfo is aimed at
being a subset of that.  And applied.
--
Michael

Attachment

Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Tue, Aug 23, 2022 at 10:04:30AM -0700, Jacob Champion wrote:
>> When can we rely on static initialization, and when can't we? Is there a
>> concern that the memory could have been polluted from before the
>> postmaster's fork?

> My main worry here is EXEC_BACKEND, where we would just use our own
> implementation of fork(), and it is a bad idea at the end to leave
> that untouched while we could have code paths that attempt to access
> it.

Uh ... what?  EXEC_BACKEND is even more certain to correctly initialize
static/global variables in a child process.  I agree with Jacob that
this memset is probably useless, and therefore confusing.

            regards, tom lane



Re: [PATCH] Expose port->authn_id to extensions and triggers

From
Andy Fan
Date:
Hi,

Michael Paquier <michael@paquier.xyz> writes:

> [[PGP Signed Part:Undecided]]
> On Tue, Aug 23, 2022 at 10:04:30AM -0700, Jacob Champion wrote:
>
> My main worry here is EXEC_BACKEND, where we would just use our own
> implementation of fork(), and it is a bad idea at the end to leave
> that untouched while we could have code paths that attempt to access
> it.  At the end, I have moved the initialization at the same place as
> where we set MyProcPort for a backend in BackendInitialize(), mainly
> as a matter of consistency because ClientConnectionInfo is aimed at
> being a subset of that.  And applied.

I found a compiler complaint of this patch. The attached fix that.

-- 
Best Regards
Andy Fan


Attachment