Thread: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion

Hi all,

We've been working on ways to expand the list of third-party auth
methods that Postgres provides. Some example use cases might be "I want
to let anyone with a Google account read this table" or "let anyone who
belongs to this GitHub organization connect as a superuser".

Attached is a proof of concept that implements pieces of OAuth 2.0
federated authorization, via the OAUTHBEARER SASL mechanism from RFC
7628 [1]. Currently, only Linux is supported due to some ugly hacks in
the backend.

The architecture can support the following use cases, as long as your
OAuth issuer of choice implements the necessary specs, and you know how
to write a validator for your issuer's bearer tokens:

- Authentication only, where an external validator uses the bearer
token to determine the end user's identity, and Postgres decides
whether that user ID is authorized to connect via the standard pg_ident
user mapping.

- Authorization only, where the validator uses the bearer token to
determine the allowed roles for the end user, and then checks to make
sure that the connection's role is one of those. This bypasses pg_ident
and allows pseudonymous connections, where Postgres doesn't care who
you are as long as the token proves you're allowed to assume the role
you want.

- A combination, where the validator provides both an authn_id (for
later audits of database access) and an authorization decision based on
the bearer token and role provided.

It looks kinda like this during use:

    $ psql 'host=example.org oauth_client_id=f02c6361-0635-...'
    Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG

= Quickstart =

For anyone who likes building and seeing green tests ASAP.

Prerequisite software:
- iddawc v0.9.9 [2], library and dev headers, for client support
- Python 3, for the test suite only

(Some newer distributions have dev packages for iddawc, but mine did
not.)

Configure using --with-oauth. (If you've installed iddawc into a
non-standard location, be sure to use --with-includes and
--with-libraries, and make sure either rpath or LD_LIBRARY_PATH will
get you what you need.) Install as usual.

To run the test suite, make sure the contrib/authn_id extension is
installed, then init and start your dev cluster. No other configuration
is required; the test will do it for you. Switch to the src/test/python
directory, point your PG* envvars to a superuser connection on the
cluster (so that a "bare" psql will connect automatically), and run
`make installcheck`.

= Production Setup =

(but don't use this in production, please)

Actually setting up a "real" system requires knowing the specifics of
your third-party issuer of choice. Your issuer MUST implement OpenID
Discovery and the OAuth Device Authorization flow! Seriously, check
this before spending a lot of time writing a validator against an
issuer that can't actually talk to libpq.

The broad strokes are as follows:

1. Register a new public client with your issuer to get an OAuth client
ID for libpq. You'll use this as the oauth_client_id in the connection
string. (If your issuer doesn't support public clients and gives you a
client secret, you can use the oauth_client_secret connection parameter
to provide that too.)

The client you register must be able to use a device authorization
flow; some issuers require additional setup for that.

2. Set up your HBA with the 'oauth' auth method, and set the 'issuer'
and 'scope' options. 'issuer' is the base URL identifying your
third-party issuer (for example, https://accounts.google.com), and 'scope' is
the set of OAuth scopes that the client and server will need to
authenticate and/or authorize the user (e.g. "openid email").

So a sample HBA line might look like

    host all all samehost oauth issuer="https://accounts.google.com" scope="openid email"

3. In postgresql.conf, set up an oauth_validator_command that's capable
of verifying bearer tokens and implements the validator protocol. This
is the hardest part. See below.

= Design =

On the client side, I've implemented the Device Authorization flow (RFC
8628, [3]). What this means in practice is that libpq reaches out to a
third-party issuer (e.g. Google, Azure, etc.), identifies itself with a
client ID, and requests permission to act on behalf of the end user.
The issuer responds with a login URL and a one-time code, which libpq
presents to the user using the notice hook. The end user then navigates
to that URL, presents their code, authenticates to the issuer, and
grants permission for libpq to retrieve a bearer token. libpq grabs a
token and sends it to the server for verification.
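In rough terms, the client-side flow described above can be sketched as follows. This is a minimal sketch of RFC 8628, not the patch's actual iddawc-based implementation; the endpoint URLs, field names beyond what the RFC defines, and the injectable post() helper are assumptions for illustration:

```python
import time

def run_device_flow(post, client_id, device_endpoint, token_endpoint):
    """Drive the Device Authorization flow; post(url, params) returns a dict."""
    # Step 1: ask the issuer for a device code plus a one-time user code.
    auth = post(device_endpoint, {"client_id": client_id})
    print("Visit %s and enter the code: %s"
          % (auth["verification_uri"], auth["user_code"]))

    # Step 2: poll the token endpoint while the user logs in elsewhere.
    interval = auth.get("interval", 5)
    while True:
        resp = post(token_endpoint, {
            "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
            "device_code": auth["device_code"],
            "client_id": client_id,
        })
        if "access_token" in resp:
            return resp["access_token"]   # the bearer token sent to the server
        error = resp.get("error")
        if error == "slow_down":
            interval += 5                 # RFC 8628: back off by five seconds
        elif error != "authorization_pending":
            raise RuntimeError("OAuth token retrieval failed: %r" % error)
        time.sleep(interval)
```

Passing in the post() function keeps the polling logic separable from any particular HTTP library, which is also roughly how the test suite exercises broken and malicious issuer behavior.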

(The bearer token, in this setup, is essentially a plaintext password,
and you must secure it like you would a plaintext password. The token
has an expiration date and can be explicitly revoked, which makes it
slightly better than a password, but this is still a step backwards
from something like SCRAM with channel binding. There are ways to bind
a bearer token to a client certificate [4], which would mitigate the
risk of token theft -- but your issuer has to support that, and I
haven't found much support in the wild.)

The server side is where things get more difficult for the DBA. The
OAUTHBEARER spec has this to say about the server-side implementation:

   The server validates the response according to the specification for
   the OAuth Access Token Types used.

And here's what the Bearer Token specification [5] says:

   This document does not specify the encoding or the contents of the
   token; hence, detailed recommendations about the means of
   guaranteeing token integrity protection are outside the scope of
   this document.

It's the Wild West. Every issuer does their own thing in their own
special way. Some don't really give you a way to introspect information
about a bearer token at all, because they assume that the issuer of the
token and the consumer of the token are essentially the same service.
Some major players provide their own custom libraries, implemented in
your-language-of-choice, to deal with their particular brand of magic.

So I punted and added the oauth_validator_command GUC. A token
validator command reads the bearer token from a file descriptor that's
passed to it, then does whatever magic is necessary to validate that
token and find out who owns it. Optionally, it can look at the role
that's being connected and make sure that the token authorizes the user
to actually use that role. Then it says yea or nay to Postgres, and
optionally tells the server who the user is so that their ID can be
logged and mapped through pg_ident.

(See the commit message in 0005 for a full description of the protocol.
The test suite also has two toy implementations that illustrate the
protocol, but they provide zero security.)
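In the same toy spirit, a validator might look something like the sketch below. The actual protocol lives in patch 0005 and isn't reproduced here, so the wiring (token on stdin, role as the first argument, exit status as the yea/nay answer, identity on stdout) is an assumption for illustration only, and like the test suite's toy implementations it provides zero security:

```python
import sys

def validate(token, role):
    """Pretend to introspect a bearer token; accept one hard-coded token.

    A real validator would verify the token with the issuer (introspection
    endpoint, signature verification, or a vendor library) and derive the
    user's identity and allowed roles from the result.
    """
    if token == "sekrit-token" and role in ("alice", "pgadmin"):
        return "alice@example.com"   # authn_id, for logging and pg_ident
    return None                      # reject everything else

def main():
    role = sys.argv[1] if len(sys.argv) > 1 else ""
    token = sys.stdin.read().strip()
    authn_id = validate(token, role)
    if authn_id is None:
        return 1        # nay: Postgres refuses the connection
    print(authn_id)     # yea: report who the user is
    return 0
```

A script like this would be wired up via `if __name__ == "__main__": sys.exit(main())` and pointed at by oauth_validator_command.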

This is easily the worst part of the patch, not only because my
implementation is a bad hack on OpenPipeStream(), but because it
balances the security of the entire system on the shoulders of a DBA
who does not have time to read umpteen OAuth specifications cover to
cover. More thought and coding effort is needed here, but I didn't want
to gold-plate a bad design. I'm not sure what alternatives there are
within the rules laid out by OAUTHBEARER. And the system is _extremely_
flexible, in the way that only code that's maintained by somebody else
can be.

= Patchset Roadmap =

The seven patches can be grouped into three:

1. Prep

  0001 decouples the SASL code from the SCRAM implementation.
  0002 makes it possible to use common/jsonapi from the frontend.
  0003 lets the json_errdetail() result be freed, to avoid leaks.

2. OAUTHBEARER Implementation

  0004 implements the client with libiddawc.
  0005 implements server HBA support and oauth_validator_command.

3. Testing

  0006 adds a simple test extension to retrieve the authn_id.
  0007 adds the Python test suite I've been developing against.

The first three patches are, hopefully, generally useful outside of
this implementation, and I'll plan to register them in the next
commitfest. The middle two patches are the "interesting" pieces, and
I've split them into client and server for ease of understanding,
though neither is particularly useful without the other.

The last two patches grew out of a test suite that I originally built
to be able to exercise NSS corner cases at the protocol/byte level. It
was incredibly helpful during implementation of this new SASL
mechanism, since I could write the client and server independently of
each other and get high coverage of broken/malicious implementations.
It's based on pytest and Construct, and the Python 3 requirement might
turn some away, but I wanted to include it in case anyone else wanted
to hack on the code. src/test/python/README explains more.

= Thoughts/Reflections =

...in no particular order.

I picked OAuth 2.0 as my first experiment in federated auth mostly
because I was already familiar with pieces of it. I think SAML (via the
SAML20 mechanism, RFC 6595) would be a good companion to this proof of
concept, if there is general interest in federated deployments.

I don't really like the OAUTHBEARER spec, but I'm not sure there's a
better alternative. Everything is left as an exercise for the reader.
It's not particularly extensible. Standard OAuth is built for
authorization, not authentication, and from reading the RFC's history,
it feels like it was a hack to just get something working. New
standards like OpenID Connect have begun to fill in the gaps, but the
SASL mechanisms have not kept up. (The OPENID20 mechanism is, to my
understanding, unrelated/obsolete.) And support for helpful OIDC
features seems to be spotty in the real world.

The iddawc dependency for client-side OAuth was extremely helpful to
develop this proof of concept quickly, but I don't think it would be an
appropriate component to build a real feature on. It's extremely
heavyweight -- it incorporates a huge stack of dependencies, including
a logging framework and a web server, to implement features we would
probably never use -- and it's fairly difficult to debug in practice.
If a device authorization flow were the only thing that libpq needed to
support natively, I think we should just depend on a widely used HTTP
client, like libcurl or neon, and implement the minimum spec directly
against the existing test suite.

There are a huge number of other authorization flows besides Device
Authorization; most would involve libpq automatically opening a web
browser for you. I felt like that wasn't an appropriate thing for a
library to do by default, especially when one of the most important
clients is a command-line application. Perhaps there could be a hook
for applications to be able to override the builtin flow and substitute
their own.

Since bearer tokens are essentially plaintext passwords, the relevant
specs require the use of transport-level protection, and I think it'd
be wise for the client to require TLS to be in place before performing
the initial handshake or sending a token.

Not every OAuth issuer is also an OpenID Discovery provider, so it's
frustrating that OAUTHBEARER (which is purportedly an OAuth 2.0
feature) requires OIDD for real-world implementations. Perhaps we could
hack around this with a data: URI or something.

The client currently performs the OAuth login dance every single time a
connection is made, but a proper OAuth client would cache its tokens to
reuse later, and keep an eye on their expiration times. This would make
daily use a little more like that of Kerberos, but we would have to
design a way to create and secure a token cache on disk.

If you've read this far, thank you for your interest, and I hope you
enjoy playing with it!

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc7628
[2] https://github.com/babelouest/iddawc
[3] https://datatracker.ietf.org/doc/html/rfc8628
[4] https://datatracker.ietf.org/doc/html/rfc8705
[5] https://datatracker.ietf.org/doc/html/rfc6750#section-5.2

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Michael Paquier

On Tue, Jun 08, 2021 at 04:37:46PM +0000, Jacob Champion wrote:
> 1. Prep
>
>   0001 decouples the SASL code from the SCRAM implementation.
>   0002 makes it possible to use common/jsonapi from the frontend.
>   0003 lets the json_errdetail() result be freed, to avoid leaks.
>
> The first three patches are, hopefully, generally useful outside of
> this implementation, and I'll plan to register them in the next
> commitfest. The middle two patches are the "interesting" pieces, and
> I've split them into client and server for ease of understanding,
> though neither is particularly useful without the other.

Beginning with the beginning, could you spawn two threads for the
jsonapi rework and the SASL/SCRAM business?  I agree that these look
independently useful.  Glad to see someone improving the code with
SASL and SCRAM which are too inter-dependent now.  I saw in the RFCs
dedicated to OAUTH the need for the JSON part as well.

+#  define check_stack_depth()
+#  ifdef JSONAPI_NO_LOG
+#    define json_log_and_abort(...) \
+   do { fprintf(stderr, __VA_ARGS__); exit(1); } while(0)
+#  else
In patch 0002, this is the wrong approach.  libpq will not be able to
feed on such reports, and you cannot use any of the APIs from the
palloc() family either as these just fail on OOM.  libpq should be
able to know about the error, and would fill in the error back to the
application.  This abstraction is not necessary on HEAD as
pg_verifybackup is fine with this level of reporting.  My rough guess
is that we will need to split the existing jsonapi.c into two files,
one that can be used in shared libraries and a second that handles the
errors.

+           /* TODO: SASL_EXCHANGE_FAILURE with output is forbidden in SASL */
            if (result == SASL_EXCHANGE_SUCCESS)
                sendAuthRequest(port,
                            AUTH_REQ_SASL_FIN,
                            output,
                            outputlen);
Perhaps that's an issue we need to worry about on its own?  I didn't
recall this part...
--
Michael

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Heikki Linnakangas

On 08/06/2021 19:37, Jacob Champion wrote:
> We've been working on ways to expand the list of third-party auth
> methods that Postgres provides. Some example use cases might be "I want
> to let anyone with a Google account read this table" or "let anyone who
> belongs to this GitHub organization connect as a superuser".

Cool!

> The iddawc dependency for client-side OAuth was extremely helpful to
> develop this proof of concept quickly, but I don't think it would be an
> appropriate component to build a real feature on. It's extremely
> heavyweight -- it incorporates a huge stack of dependencies, including
> a logging framework and a web server, to implement features we would
> probably never use -- and it's fairly difficult to debug in practice.
> If a device authorization flow were the only thing that libpq needed to
> support natively, I think we should just depend on a widely used HTTP
> client, like libcurl or neon, and implement the minimum spec directly
> against the existing test suite.

You could punt and let the application implement that stuff. I'm 
imagining that the application code would look something like this:

conn = PQconnectStartParams(...);
for (;;)
{
     status = PQconnectPoll(conn)
     switch (status)
     {
         case CONNECTION_SASL_TOKEN_REQUIRED:
             /* open a browser for the user, get token */
             token = open_browser()
             PQauthResponse(token);
             break;
         ...
     }
}

It would be nice to have a simple default implementation, though, for 
psql and all the other client applications that come with PostgreSQL itself.

> If you've read this far, thank you for your interest, and I hope you
> enjoy playing with it!

A few small things caught my eye in the backend oauth_exchange function:

> +       /* Handle the client's initial message. */
> +       p = strdup(input);

this strdup() should be pstrdup().

In the same function, there are a bunch of reports like this:

>                    ereport(ERROR,
> +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> +                           errmsg("malformed OAUTHBEARER message"),
> +                           errdetail("Comma expected, but found character \"%s\".",
> +                                     sanitize_char(*p))));

I don't think the double quotes are needed here, because sanitize_char 
will return quotes if it's a single character. So it would end up 
looking like this: ... found character "'x'".

- Heikki



Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion

On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> On 08/06/2021 19:37, Jacob Champion wrote:
> > We've been working on ways to expand the list of third-party auth
> > methods that Postgres provides. Some example use cases might be "I want
> > to let anyone with a Google account read this table" or "let anyone who
> > belongs to this GitHub organization connect as a superuser".
> 
> Cool!

Glad you think so! :D

> > The iddawc dependency for client-side OAuth was extremely helpful to
> > develop this proof of concept quickly, but I don't think it would be an
> > appropriate component to build a real feature on. It's extremely
> > heavyweight -- it incorporates a huge stack of dependencies, including
> > a logging framework and a web server, to implement features we would
> > probably never use -- and it's fairly difficult to debug in practice.
> > If a device authorization flow were the only thing that libpq needed to
> > support natively, I think we should just depend on a widely used HTTP
> > client, like libcurl or neon, and implement the minimum spec directly
> > against the existing test suite.
> 
> You could punt and let the application implement that stuff. I'm 
> imagining that the application code would look something like this:
> 
> conn = PQconnectStartParams(...);
> for (;;)
> {
>      status = PQconnectPoll(conn)
>      switch (status)
>      {
>          case CONNECTION_SASL_TOKEN_REQUIRED:
>              /* open a browser for the user, get token */
>              token = open_browser()
>              PQauthResponse(token);
>              break;
>          ...
>      }
> }

I was toying with the idea of having a callback for libpq clients,
where they could take full control of the OAuth flow if they wanted to.
Doing it inline with PQconnectPoll seems like it would work too. It has
a couple of drawbacks that I can see:

- If a client isn't currently using a poll loop, they'd have to switch
to one to be able to use OAuth connections. Not a difficult change, but
considering all the other hurdles to making this work, I'm hoping to
minimize the hoop-jumping.

- A client would still have to receive a bunch of OAuth parameters from
some new libpq API in order to construct the correct URL to visit, so
the overall complexity for implementers might be higher than if we just
passed those params directly in a callback.

> It would be nice to have a simple default implementation, though, for 
> psql and all the other client applications that come with PostgreSQL itself.

I agree. I think having a bare-bones implementation in libpq itself
would make initial adoption *much* easier, and then if specific
applications wanted to have richer control over an authorization flow,
then they could implement that themselves with the aforementioned
callback.

The Device Authorization flow was the most minimal working
implementation I could find, since it doesn't require a web browser on
the system, just the ability to print a prompt to the console. But if
anyone knows of a better flow for this use case, I'm all ears.

> > If you've read this far, thank you for your interest, and I hope you
> > enjoy playing with it!
> 
> A few small things caught my eye in the backend oauth_exchange function:
> 
> > +       /* Handle the client's initial message. */
> > +       p = strdup(input);
> 
> this strdup() should be pstrdup().

Thanks, I'll fix that in the next re-roll.

> In the same function, there are a bunch of reports like this:
> 
> >                    ereport(ERROR,
> > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > +                           errmsg("malformed OAUTHBEARER message"),
> > +                           errdetail("Comma expected, but found character \"%s\".",
> > +                                     sanitize_char(*p))));
> 
> I don't think the double quotes are needed here, because sanitize_char 
> will return quotes if it's a single character. So it would end up 
> looking like this: ... found character "'x'".

I'll fix this too. Thanks!

--Jacob

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion

On Fri, 2021-06-18 at 13:07 +0900, Michael Paquier wrote:
> On Tue, Jun 08, 2021 at 04:37:46PM +0000, Jacob Champion wrote:
> > 1. Prep
> > 
> >   0001 decouples the SASL code from the SCRAM implementation.
> >   0002 makes it possible to use common/jsonapi from the frontend.
> >   0003 lets the json_errdetail() result be freed, to avoid leaks.
> > 
> > The first three patches are, hopefully, generally useful outside of
> > this implementation, and I'll plan to register them in the next
> > commitfest. The middle two patches are the "interesting" pieces, and
> > I've split them into client and server for ease of understanding,
> > though neither is particularly useful without the other.
> 
> Beginning with the beginning, could you spawn two threads for the
> jsonapi rework and the SASL/SCRAM business?

Done [1, 2]. I've copied your comments into those threads with my
responses, and I'll have them registered in commitfest shortly.

Thanks!
--Jacob

[1] https://www.postgresql.org/message-id/3d2a6f5d50e741117d6baf83eb67ebf1a8a35a11.camel%40vmware.com
[2] https://www.postgresql.org/message-id/a250d475ba1c0cc0efb7dfec8e538fcc77cdcb8e.camel%40vmware.com

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Michael Paquier

On Tue, Jun 22, 2021 at 11:26:03PM +0000, Jacob Champion wrote:
> Done [1, 2]. I've copied your comments into those threads with my
> responses, and I'll have them registered in commitfest shortly.

Thanks!
--
Michael

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion

On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> > 
> > A few small things caught my eye in the backend oauth_exchange function:
> > 
> > > +       /* Handle the client's initial message. */
> > > +       p = strdup(input);
> > 
> > this strdup() should be pstrdup().
> 
> Thanks, I'll fix that in the next re-roll.
> 
> > In the same function, there are a bunch of reports like this:
> > 
> > >                    ereport(ERROR,
> > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > +                           errmsg("malformed OAUTHBEARER message"),
> > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > +                                     sanitize_char(*p))));
> > 
> > I don't think the double quotes are needed here, because sanitize_char 
> > will return quotes if it's a single character. So it would end up 
> > looking like this: ... found character "'x'".
> 
> I'll fix this too. Thanks!

v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.

The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.

--Jacob

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Zhihong Yu

On Wed, Aug 25, 2021 at 11:42 AM Jacob Champion <pchampion@vmware.com> wrote:
> On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> > On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> > >
> > > A few small things caught my eye in the backend oauth_exchange function:
> > >
> > > > +       /* Handle the client's initial message. */
> > > > +       p = strdup(input);
> > >
> > > this strdup() should be pstrdup().
> >
> > Thanks, I'll fix that in the next re-roll.
> >
> > > In the same function, there are a bunch of reports like this:
> > >
> > > >                    ereport(ERROR,
> > > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > > +                           errmsg("malformed OAUTHBEARER message"),
> > > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > > +                                     sanitize_char(*p))));
> > >
> > > I don't think the double quotes are needed here, because sanitize_char
> > > will return quotes if it's a single character. So it would end up
> > > looking like this: ... found character "'x'".
> >
> > I'll fix this too. Thanks!
>
> v2, attached, incorporates Heikki's suggested fixes and also rebases on
> top of latest HEAD, which had the SASL refactoring changes committed
> last month.
>
> The biggest change from the last patchset is 0001, an attempt at
> enabling jsonapi in the frontend without the use of palloc(), based on
> suggestions by Michael and Tom from last commitfest. I've also made
> some improvements to the pytest suite. No major changes to the OAuth
> implementation yet.
>
> --Jacob

Hi,
For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :

+   /* Clean up. */
+   termJsonLexContext(&lex); 

At the end of termJsonLexContext(), empty is copied to lex. For stack based JsonLexContext, the copy seems unnecessary.
Maybe introduce a boolean parameter for termJsonLexContext() to signal that the copy can be omitted ?

+#ifdef FRONTEND
+       /* make sure initialization succeeded */
+       if (lex->strval == NULL)
+           return JSON_OUT_OF_MEMORY;

Should PQExpBufferBroken(lex->strval) be used for the check ?

Thanks

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Zhihong Yu

On Wed, Aug 25, 2021 at 3:25 PM Zhihong Yu <zyu@yugabyte.com> wrote:


> On Wed, Aug 25, 2021 at 11:42 AM Jacob Champion <pchampion@vmware.com> wrote:
> > On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> > > On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> > > >
> > > > A few small things caught my eye in the backend oauth_exchange function:
> > > >
> > > > > +       /* Handle the client's initial message. */
> > > > > +       p = strdup(input);
> > > >
> > > > this strdup() should be pstrdup().
> > >
> > > Thanks, I'll fix that in the next re-roll.
> > >
> > > > In the same function, there are a bunch of reports like this:
> > > >
> > > > >                    ereport(ERROR,
> > > > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > > > +                           errmsg("malformed OAUTHBEARER message"),
> > > > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > > > +                                     sanitize_char(*p))));
> > > >
> > > > I don't think the double quotes are needed here, because sanitize_char
> > > > will return quotes if it's a single character. So it would end up
> > > > looking like this: ... found character "'x'".
> > >
> > > I'll fix this too. Thanks!
> >
> > v2, attached, incorporates Heikki's suggested fixes and also rebases on
> > top of latest HEAD, which had the SASL refactoring changes committed
> > last month.
> >
> > The biggest change from the last patchset is 0001, an attempt at
> > enabling jsonapi in the frontend without the use of palloc(), based on
> > suggestions by Michael and Tom from last commitfest. I've also made
> > some improvements to the pytest suite. No major changes to the OAuth
> > implementation yet.
> >
> > --Jacob
>
> Hi,
> For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :
>
> +   /* Clean up. */
> +   termJsonLexContext(&lex);
>
> At the end of termJsonLexContext(), empty is copied to lex. For stack based JsonLexContext, the copy seems unnecessary.
> Maybe introduce a boolean parameter for termJsonLexContext() to signal that the copy can be omitted ?
>
> +#ifdef FRONTEND
> +       /* make sure initialization succeeded */
> +       if (lex->strval == NULL)
> +           return JSON_OUT_OF_MEMORY;
>
> Should PQExpBufferBroken(lex->strval) be used for the check ?
>
> Thanks

Hi,
For v2-0002-libpq-add-OAUTHBEARER-SASL-mechanism.patch :

+   i_init_session(&session);
+
+   if (!conn->oauth_client_id)
+   {
+       /* We can't talk to a server without a client identifier. */
+       appendPQExpBufferStr(&conn->errorMessage,
+                            libpq_gettext("no oauth_client_id is set for the connection"));
+       goto cleanup;

Can conn->oauth_client_id check be performed ahead of i_init_session() ? That way, ```goto cleanup``` can be replaced with return.

+       if (!error_code || (strcmp(error_code, "authorization_pending")
+                           && strcmp(error_code, "slow_down")))

What if, in the future, there is error code different from the above two which doesn't represent "OAuth token retrieval failed" scenario ?

For client_initial_response(),

+   token_buf = createPQExpBuffer();
+   if (!token_buf)
+       goto cleanup;

If token_buf is NULL, there doesn't seem to be anything to free. We can return directly.

Cheers

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion

On Wed, 2021-08-25 at 15:25 -0700, Zhihong Yu wrote:
> 
> Hi,
> For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :
> 
> +   /* Clean up. */
> +   termJsonLexContext(&lex); 
> 
> At the end of termJsonLexContext(), empty is copied to lex. For stack
> based JsonLexContext, the copy seems unnecessary.
> Maybe introduce a boolean parameter for termJsonLexContext() to
> signal that the copy can be omitted ?

Do you mean heap-based? i.e. destroyJsonLexContext() does an
unnecessary copy before free? Yeah, in that case it's not super useful,
but I think I'd want some evidence that the performance hit matters
before optimizing it.

Are there any other internal APIs that take a boolean parameter like
that? If not, I think we'd probably just want to remove the copy
entirely if it's a problem.

> +#ifdef FRONTEND
> +       /* make sure initialization succeeded */
> +       if (lex->strval == NULL)
> +           return JSON_OUT_OF_MEMORY;
> 
> Should PQExpBufferBroken(lex->strval) be used for the check ?

It should be okay to continue if the strval is broken but non-NULL,
since it's about to be reset. That has the fringe benefit of allowing
the function to go as far as possible without failing, though that's
probably a pretty weak justification.

In practice, do you think that the probability of success is low enough
that we should just short-circuit and be done with it?

On Wed, 2021-08-25 at 16:24 -0700, Zhihong Yu wrote:
> 
> For v2-0002-libpq-add-OAUTHBEARER-SASL-mechanism.patch :
> 
> +   i_init_session(&session);
> +
> +   if (!conn->oauth_client_id)
> +   {
> +       /* We can't talk to a server without a client identifier. */
> +       appendPQExpBufferStr(&conn->errorMessage,
> +                            libpq_gettext("no oauth_client_id is set for the connection"));
> +       goto cleanup;
> 
> Can conn->oauth_client_id check be performed ahead
> of i_init_session() ? That way, ```goto cleanup``` can be replaced
> with return.

Yeah, I think that makes sense. FYI, this is probably one of the
functions that will be rewritten completely once iddawc is removed.

> +       if (!error_code || (strcmp(error_code, "authorization_pending")
> +                           && strcmp(error_code, "slow_down")))
> 
> What if, in the future, there is error code different from the above
> two which doesn't represent "OAuth token retrieval failed" scenario ?

We'd have to update our code; that would be a breaking change to the
Device Authorization spec. Here's what it says today [1]:

   The "authorization_pending" and "slow_down" error codes define
   particularly unique behavior, as they indicate that the OAuth client
   should continue to poll the token endpoint by repeating the token
   request (implementing the precise behavior defined above).  If the
   client receives an error response with any other error code, it MUST
   stop polling and SHOULD react accordingly, for example, by displaying
   an error to the user.
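(The RFC's rule boils down to a small predicate. This is an illustrative helper, not the patch's actual code: keep polling only on those two error codes, and treat a missing or unrecognized code as a hard stop.)

```c
#include <stdbool.h>
#include <string.h>

/*
 * Encodes the RFC 8628 §3.5 rule quoted above: during the device flow,
 * the client keeps polling the token endpoint only on
 * "authorization_pending" or "slow_down"; any other error code (or a
 * malformed response with no error code) must stop the flow.
 */
static bool
device_flow_should_keep_polling(const char *error_code)
{
	if (error_code == NULL)
		return false;			/* malformed error response: stop */

	return strcmp(error_code, "authorization_pending") == 0 ||
		strcmp(error_code, "slow_down") == 0;
}
```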

> For client_initial_response(),
> 
> +   token_buf = createPQExpBuffer();
> +   if (!token_buf)
> +       goto cleanup;
> 
> If token_buf is NULL, there doesn't seem to be anything to free. We
> can return directly.

That's true today, but implementations have a habit of changing. I
personally prefer not to introduce too many exit points from a function
that's already using goto. In my experience, that makes future
maintenance harder.
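(To make the single-exit style concrete, here's a hypothetical sketch, with invented names, not libpq code: every resource pointer is initialized up front, so one cleanup label handles all exits, including early failures that currently have nothing to release.)

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration of the goto-cleanup pattern discussed above. */
static char *
build_auth_header(const char *token)
{
	char	   *scratch = NULL;
	char	   *result = NULL;
	size_t		len;

	if (token == NULL)
		goto cleanup;			/* nothing allocated yet, but uniform exit */

	len = strlen(token) + sizeof("Bearer ");
	scratch = malloc(len);
	if (scratch == NULL)
		goto cleanup;

	snprintf(scratch, len, "Bearer %s", token);

	result = malloc(len);
	if (result != NULL)
		memcpy(result, scratch, len);	/* fall through either way */

cleanup:
	free(scratch);				/* free(NULL) is a no-op */
	return result;
}
```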

Thanks for the reviews! Have you been able to give the patchset a try
with an OAuth deployment?

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc8628#section-3.5

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Zhihong Yu
Date:


On Thu, Aug 26, 2021 at 9:13 AM Jacob Champion <pchampion@vmware.com> wrote:
On Wed, 2021-08-25 at 15:25 -0700, Zhihong Yu wrote:
>
> Hi,
> For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :
>
> +   /* Clean up. */
> +   termJsonLexContext(&lex);
>
> At the end of termJsonLexContext(), empty is copied to lex. For stack
> based JsonLexContext, the copy seems unnecessary.
> Maybe introduce a boolean parameter for termJsonLexContext() to
> signal that the copy can be omitted ?

Hi,
bq. destroyJsonLexContext() does an unnecessary copy before free? Yeah, in that case it's not super useful,
but I think I'd want some evidence that the performance hit matters before optimizing it. 

Yes I agree.

bq. In practice, do you think that the probability of success is low enough that we should just short-circuit and be done with it?

Haven't had a chance to try your patches out yet.
I will leave this to people who are more familiar with OAuth implementation(s).

bq.  I personally prefer not to introduce too many exit points from a function that's already using goto.

Fair enough.

Cheers

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Michael Paquier
Date:
On Thu, Aug 26, 2021 at 04:13:08PM +0000, Jacob Champion wrote:
> Do you mean heap-based? i.e. destroyJsonLexContext() does an
> unnecessary copy before free? Yeah, in that case it's not super useful,
> but I think I'd want some evidence that the performance hit matters
> before optimizing it.

As an authentication code path, the impact is minimal and my take on
that would be to keep the code simple.  Now if you'd really wish to
stress that without relying on the backend, one simple way is to use
pgbench -C -n with a mostly-empty script (one meta-command) coupled
with some profiling.
--
Michael

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, 2021-08-27 at 11:32 +0900, Michael Paquier wrote:
> Now if you'd really wish to
> stress that without relying on the backend, one simple way is to use
> pgbench -C -n with a mostly-empty script (one meta-command) coupled
> with some profiling.

Ah, thanks! I'll add that to the toolbox.

--Jacob

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
Hi all,

v3 rebases this patchset over the top of Samay's pluggable auth
provider API [1], included here as patches 0001-3. The final patch in
the set ports the server implementation from a core feature to a
contrib module; to switch between the two approaches, simply leave out
that final patch.

There are still some backend changes that must be made to get this
working, as pointed out in 0009, and obviously libpq support still
requires code changes.

--Jacob

[1] https://www.postgresql.org/message-id/flat/CAJxrbyxTRn5P8J-p%2BwHLwFahK5y56PhK28VOb55jqMO05Y-DJw%40mail.gmail.com

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
samay sharma
Date:
Hi Jacob,

Thank you for porting this on top of the pluggable auth methods API. I've addressed the feedback around other backend changes in my latest patch, but the client side changes still remain. I had a few questions to understand them better.

(a) What specifically do the client side changes in the patch implement?
(b) Are the changes you made on the client side specific to OAUTH or are they about making SASL more generic? As an additional question, if someone wanted to implement something similar on top of your patch, would they still have to make client side changes?

Regards,
Samay

On Fri, Mar 4, 2022 at 11:13 AM Jacob Champion <pchampion@vmware.com> wrote:
Hi all,

v3 rebases this patchset over the top of Samay's pluggable auth
provider API [1], included here as patches 0001-3. The final patch in
the set ports the server implementation from a core feature to a
contrib module; to switch between the two approaches, simply leave out
that final patch.

There are still some backend changes that must be made to get this
working, as pointed out in 0009, and obviously libpq support still
requires code changes.

--Jacob

[1] https://www.postgresql.org/message-id/flat/CAJxrbyxTRn5P8J-p%2BwHLwFahK5y56PhK28VOb55jqMO05Y-DJw%40mail.gmail.com

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, 2022-03-22 at 14:48 -0700, samay sharma wrote:
> Thank you for porting this on top of the pluggable auth methods API.
> I've addressed the feedback around other backend changes in my latest
> patch, but the client side changes still remain. I had a few
> questions to understand them better.
> 
> (a) What specifically do the client side changes in the patch implement?

Hi Samay,

The client-side changes are an implementation of the OAuth 2.0 Device
Authorization Grant [1] in libpq. The majority of the OAuth logic is
handled by the third-party iddawc library.

The server tells the client what OIDC provider to contact, and then
libpq prompts you to log into that provider on your
smartphone/browser/etc. using a one-time code. After you give libpq
permission to act on your behalf, the Bearer token gets sent to libpq
via a direct connection, and libpq forwards it to the server so that
the server can determine whether you're allowed in.
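(The sequence above can be sketched as a tiny state machine. The phases and transitions here are a simplification for illustration only; in the patchset the real flow is driven by iddawc.)

```c
#include <stdbool.h>

/* Illustrative phases of the RFC 8628 Device Authorization Grant. */
typedef enum
{
	FLOW_DISCOVER,				/* server tells libpq which issuer to use */
	FLOW_AUTHORIZE,				/* request a device_code + user_code */
	FLOW_PROMPT,				/* show the verification URI and one-time code */
	FLOW_POLL,					/* poll the token endpoint until approved */
	FLOW_SEND_TOKEN				/* forward the Bearer token over SASL */
} DeviceFlowPhase;

static DeviceFlowPhase
next_phase(DeviceFlowPhase phase, bool user_approved)
{
	switch (phase)
	{
		case FLOW_DISCOVER:
			return FLOW_AUTHORIZE;
		case FLOW_AUTHORIZE:
			return FLOW_PROMPT;
		case FLOW_PROMPT:
			return FLOW_POLL;
		case FLOW_POLL:
			/* keep polling until the user approves on their device */
			return user_approved ? FLOW_SEND_TOKEN : FLOW_POLL;
		default:
			return FLOW_SEND_TOKEN;
	}
}
```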

> (b) Are the changes you made on the client side specific to OAUTH or
> are they about making SASL more generic?

The original patchset included changes to make SASL more generic. Many
of those changes have since been merged, and the remaining code is
mostly OAuth-specific, but there are still improvements to be made.
(And there's some JSON crud to sift through in the first couple of
patches. I'm still mad that the OAUTHBEARER spec requires clients to
parse JSON in the first place.)

> As an additional question,
> if someone wanted to implement something similar on top of your
> patch, would they still have to make client side changes?

Any new SASL mechanisms require changes to libpq at this point. You
need to implement a new pg_sasl_mech, modify pg_SASL_init() to select
the mechanism correctly, and add whatever connection string options you
need, along with the associated state in pg_conn. Patch 0004 has all
the client-side magic for OAUTHBEARER.
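(Roughly speaking, a mechanism entry is a named table of callbacks plus a selection step. The struct and function below are simplified stand-ins for libpq's internal SASL machinery, not the exact API.)

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a libpq SASL mechanism's callback table. */
typedef struct sasl_mech
{
	const char *name;			/* mechanism name, e.g. "OAUTHBEARER" */
	void	   *(*init) (void);	/* allocate per-connection state */
	bool		(*exchange) (void *state, const char *in, char **out);
	void		(*free_state) (void *state);
} sasl_mech;

/*
 * Mimics the selection step: pick the mechanism whose name the server
 * advertised during SASL negotiation.
 */
static const sasl_mech *
select_mechanism(const sasl_mech *mechs, size_t n, const char *wanted)
{
	for (size_t i = 0; i < n; i++)
	{
		if (strcmp(mechs[i].name, wanted) == 0)
			return &mechs[i];
	}
	return NULL;				/* no mutually supported mechanism */
}
```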

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc8628

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, 2022-03-04 at 19:13 +0000, Jacob Champion wrote:
> v3 rebases this patchset over the top of Samay's pluggable auth
> provider API [1], included here as patches 0001-3.

v4 rebases over the latest version of the pluggable auth patchset
(included as 0001-4). Note that there's a recent conflict as
of d4781d887; use an older commit as the base (or wait for the other
thread to be updated).

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi Hackers,

We are trying to implement AAD (Azure AD) support in PostgreSQL, and it
can be achieved with support for the OAuth method. To support AAD on
top of OAuth in a generic fashion (i.e. for all other OAuth providers),
we are proposing this patch. It basically exposes two new hooks (one
for error reporting and one for OAuth provider-specific token
validation) and passes the OAuth bearer token to the backend. It also
adds support for the client credentials flow of OAuth in addition to
the device code flow which Jacob has proposed.

The changes for each component are summarized below.

1.     Provider-specific extension:
        Each OAuth provider implements its own token validator as an
extension. The extension registers an OAuth provider hook, which is
matched to a line in the HBA file.

2.     Add support to pass on the OAuth bearer token. In this case,
obtaining the bearer token is left to a 3rd-party application or the user.

        ./psql -U <username> -d 'dbname=postgres
oauth_client_id=<client_id> oauth_bearer_token=<token>'

3.     HBA: An additional param ‘provider’ is added for the oauth method.
        Defining "oauth" as method + passing provider, issuer endpoint
and expected audience

        * * * * oauth   provider=<token validation extension>
issuer=.... scope=....

4.     Engine Backend:
        Support for the generic OAUTHBEARER type, requesting the client
to provide a token and passing the token to the provider-specific
extension.

5.     Engine Frontend: Two-tiered approach.
           a)      libpq transparently passes on the token received
from the 3rd-party client as-is to the backend.
           b)      libpq is optionally compiled for clients which
explicitly need libpq to orchestrate OAuth communication with the
issuer (this depends heavily on the 3rd-party library iddawc, as Jacob
already pointed out; the library seems to support all the OAuth
flows).
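(For concreteness, the hook idea in point 1 might look roughly like this on the extension side. The hook type, registration function, and all names here are hypothetical illustrations, not the patch's actual API.)

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical validator hook: decide whether this bearer token permits
 * connecting as the given role.
 */
typedef bool (*oauth_validator_fn) (const char *token, const char *role);

static oauth_validator_fn oauth_validator = NULL;

/* In this sketch, called from a provider extension's _PG_init(). */
static void
register_oauth_validator(oauth_validator_fn fn)
{
	oauth_validator = fn;
}

/* Example provider-specific validator: accepts one fixed demo token. */
static bool
demo_validate(const char *token, const char *role)
{
	(void) role;				/* a real validator would check roles too */
	return token != NULL && strcmp(token, "demo-token") == 0;
}
```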

Please let us know your thoughts, as the proposed method supports
different OAuth flows with the use of provider-specific hooks. We
think that the proposal would be useful for various OAuth providers.

Thanks,
Mahendrakar.


On Tue, 20 Sept 2022 at 10:18, Jacob Champion <pchampion@vmware.com> wrote:
>
> On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> > On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> > >
> > > A few small things caught my eye in the backend oauth_exchange function:
> > >
> > > > +       /* Handle the client's initial message. */
> > > > +       p = strdup(input);
> > >
> > > this strdup() should be pstrdup().
> >
> > Thanks, I'll fix that in the next re-roll.
> >
> > > In the same function, there are a bunch of reports like this:
> > >
> > > >                    ereport(ERROR,
> > > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > > +                           errmsg("malformed OAUTHBEARER message"),
> > > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > > +                                     sanitize_char(*p))));
> > >
> > > I don't think the double quotes are needed here, because sanitize_char
> > > will return quotes if it's a single character. So it would end up
> > > looking like this: ... found character "'x'".
> >
> > I'll fix this too. Thanks!
>
> v2, attached, incorporates Heikki's suggested fixes and also rebases on
> top of latest HEAD, which had the SASL refactoring changes committed
> last month.
>
> The biggest change from the last patchset is 0001, an attempt at
> enabling jsonapi in the frontend without the use of palloc(), based on
> suggestions by Michael and Tom from last commitfest. I've also made
> some improvements to the pytest suite. No major changes to the OAuth
> implementation yet.
>
> --Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
Hi Mahendrakar, thanks for your interest and for the patch!

On Mon, Sep 19, 2022 at 10:03 PM mahendrakar s
<mahendrakarforpg@gmail.com> wrote:
> The changes for each component are summarized below.
>
> 1.     Provider-specific extension:
>         Each OAuth provider implements their own token validator as an
> extension. Extension registers an OAuth provider hook which is matched
> to a line in the HBA file.

How easy is it to write a Bearer validator using C? My limited
understanding was that most providers were publishing libraries in
higher-level languages.

Along those lines, sample validators will need to be provided, both to
help in review and to get the pytest suite green again. (And coverage
for the new code is important, too.)

> 2.     Add support to pass on the OAuth bearer token. In this
> obtaining the bearer token is left to 3rd party application or user.
>
>         ./psql -U <username> -d 'dbname=postgres
> oauth_client_id=<client_id> oauth_bearer_token=<token>

This hurts, but I think people are definitely going to ask for it, given
the frightening practice of copy-pasting these (incredibly sensitive
secret) tokens all over the place... Ideally I'd like to implement
sender constraints for the Bearer token, to *prevent* copy-pasting (or,
you know, outright theft). But I'm not sure that sender constraints are
well-implemented yet for the major providers.

> 3.     HBA: An additional param ‘provider’ is added for the oauth method.
>         Defining "oauth" as method + passing provider, issuer endpoint
> and expected audience
>
>         * * * * oauth   provider=<token validation extension>
> issuer=.... scope=....

Naming aside (this conflicts with Samay's previous proposal, I think), I
have concerns about the implementation. There's this code:

> +        if (oauth_provider && oauth_provider->name)
> +        {
> +            ereport(ERROR,
> +                (errmsg("OAuth provider \"%s\" is already loaded.",
> +                    oauth_provider->name)));
> +        }

which appears to prevent loading more than one global provider. But
there's also code that deals with a provider list? (Again, it'd help to
have test code covering the new stuff.)
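(If several providers are meant to coexist and be matched by the HBA line's provider name, the registration side would need a list rather than a single global slot, along these lines. This is a hedged sketch with invented names, not a proposal for the actual patch structure.)

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical registry allowing several OAuth providers to be loaded at
 * once, matched later against the HBA line's provider= name.
 */
typedef struct oauth_provider
{
	const char *name;
	struct oauth_provider *next;
} oauth_provider;

static oauth_provider *providers = NULL;

static void
register_provider(oauth_provider *p)
{
	p->next = providers;		/* prepend; duplicate checks omitted */
	providers = p;
}

static oauth_provider *
find_provider(const char *name)
{
	for (oauth_provider *p = providers; p != NULL; p = p->next)
	{
		if (strcmp(p->name, name) == 0)
			return p;
	}
	return NULL;
}
```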

>            b)      libpq optionally compiled for the clients which
> explicitly need libpq to orchestrate OAuth communication with the
> issuer (it depends heavily on 3rd party library iddawc as Jacob
> already pointed out. The library seems to be supporting all the OAuth
> flows.)

Speaking of iddawc, I don't think it's a dependency we should choose to
rely on. For all the code that it has, it doesn't seem to provide
compatibility with several real-world providers.

Google, for one, chose not to follow the IETF spec it helped author, and
iddawc doesn't support its flavor of Device Authorization. At another
point, I think iddawc tried to decode Azure's Bearer tokens, which is
incorrect...

I haven't been able to check if those problems have been fixed in a
recent version, but if we're going to tie ourselves to a huge
dependency, I'd at least like to believe that said dependency is
battle-tested and solid, and personally I don't feel like iddawc is.

> -    auth_method = I_TOKEN_AUTH_METHOD_NONE;
> -    if (conn->oauth_client_secret && *conn->oauth_client_secret)
> -        auth_method = I_TOKEN_AUTH_METHOD_SECRET_BASIC;

This code got moved, but I'm not sure why? It doesn't appear to have
made a change to the logic.

> +    if (conn->oauth_client_secret && *conn->oauth_client_secret)
> +    {
> +        session_response_type = I_RESPONSE_TYPE_CLIENT_CREDENTIALS;
> +    }

Is this an Azure-specific requirement? Ideally a public client (which
psql is) shouldn't have to provide a secret to begin with, if I
understand that bit of the protocol correctly. I think Google also
required provider-specific changes in this part of the code, and
unfortunately I don't think they looked the same as yours.

We'll have to figure all that out... Standards are great; everyone has
one of their own. :)

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Sep 20, 2022 at 4:19 PM Jacob Champion <jchampion@timescale.com> wrote:
> > 2.     Add support to pass on the OAuth bearer token. In this
> > obtaining the bearer token is left to 3rd party application or user.
> >
> >         ./psql -U <username> -d 'dbname=postgres
> > oauth_client_id=<client_id> oauth_bearer_token=<token>
>
> This hurts, but I think people are definitely going to ask for it, given
> the frightening practice of copy-pasting these (incredibly sensitive
> secret) tokens all over the place...

After some further thought -- in this case, you already have an opaque
Bearer token (and therefore you already know, out of band, which
provider needs to be used), you're willing to copy-paste it from
whatever service you got it from, and you have an extension plugged
into Postgres on the backend that verifies this Bearer blob using some
procedure that Postgres knows nothing about.

Why do you need the OAUTHBEARER mechanism logic at that point? Isn't
that identical to a custom password scheme? It seems like that could
be handled completely by Samay's pluggable auth proposal.

--Jacob



RE: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovskiy
Date:
We can support both passing the token from an upstream client and libpq
implementing the OAUTH2 protocol to obtain one.

Libpq implementing OAUTHBEARER is needed for community/3rd party tools
to have a user-friendly authentication experience:
1. For community client tools, like pg_admin, psql etc.
   Example experience: pg_admin would be able to open a popup dialog to
authenticate the customer and keep a refresh token to avoid asking the
user frequently.
2. For 3rd party connectors supporting generic OAUTH with any provider.
Useful for datawiz clients, like Tableau or ETL tools. Those can
support both user and client OAUTH flows.

Libpq passing the token directly from an upstream client is useful in
other scenarios:
1. Enterprise clients, built with .Net / Java and using
provider-specific authentication libraries, like MSAL for AAD. Those
can also support more advanced provider-specific token acquisition
flows.
2. Resource-tight (like IoT) clients. Those can be compiled without the
optional libpq flag, not including the iddawc or other dependency.

Thanks!
Andrey.




Re: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Sep 21, 2022 at 3:10 PM Andrey Chudnovskiy
<Andrey.Chudnovskiy@microsoft.com> wrote:
> We can support both passing the token from an upstream client and libpq implementing OAUTH2 protocol to obtain one.

Right, I agree that we could potentially do both.

> Libpq passing the token directly from an upstream client is useful in other scenarios:
> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for AAD. Those can also support more advanced provider-specific token acquisition flows.
> 2. Resource-tight (like IoT) clients. Those can be compiled without the optional libpq flag, not including the iddawc or other dependency.

What I don't understand is how the OAUTHBEARER mechanism helps you in
this case. You're short-circuiting the negotiation where the server
tells the client what provider to use and what scopes to request, and
instead you're saying "here's a secret string, just take it and
validate it with magic."

I realize the ability to pass an opaque token may be useful, but from
the server's perspective, I don't see what differentiates it from the
password auth method plus a custom authenticator plugin. Why pay for
the additional complexity of OAUTHBEARER if you're not going to use
it?

--Jacob



Re: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
First, My message from corp email wasn't displayed in the thread,
That is what Jacob replied to, let me post it here for context:

> We can support both passing the token from an upstream client and libpq implementing the OAUTH2 protocol to obtain one.
>
> Libpq implementing OAUTHBEARER is needed for community/3rd party tools to have a user-friendly authentication experience:
>
> 1. For community client tools, like pg_admin, psql etc.
>   Example experience: pg_admin would be able to open a popup dialog to authenticate customers and keep refresh tokens to avoid asking the user frequently.
> 2. For 3rd party connectors supporting generic OAUTH with any provider. Useful for datawiz clients, like Tableau or ETL tools. Those can support both user and client OAUTH flows.
>
> Libpq passing the token directly from an upstream client is useful in other scenarios:
> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for AAD. Those can also support more advanced provider-specific token acquisition flows.
> 2. Resource-tight (like IoT) clients. Those can be compiled without the optional libpq flag, not including the iddawc or other dependency.

-----------------------------------------------------------------------------------------------------
On this:

> What I don't understand is how the OAUTHBEARER mechanism helps you in
> this case. You're short-circuiting the negotiation where the server
> tells the client what provider to use and what scopes to request, and
> instead you're saying "here's a secret string, just take it and
> validate it with magic."
>
> I realize the ability to pass an opaque token may be useful, but from
> the server's perspective, I don't see what differentiates it from the
> password auth method plus a custom authenticator plugin. Why pay for
> the additional complexity of OAUTHBEARER if you're not going to use
> it?

Yes, passing a token as a new auth method won't make much sense in
isolation. However:
1. Since OAUTHBEARER is supported in the ecosystem, passing a token as
a way to authenticate with OAUTHBEARER is more consistent (IMO) than
passing it as a password.
2. Validation on the backend side doesn't depend on whether the token
is obtained by libpq or transparently passed by the upstream client.
3. A single OAUTH auth method on the server side for both scenarios
would allow both enterprise clients with their own token acquisition
and community clients using libpq flows to connect as the same PG
users/roles.




Re: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 9/21/22 21:55, Andrey Chudnovsky wrote:
> First, My message from corp email wasn't displayed in the thread,

I see it on the public archives [1]. Your client is choosing some pretty
confusing quoting tactics, though, which you may want to adjust. :D

I have what I'll call some "skeptical curiosity" here -- you don't need
to defend your use cases to me by any means, but I'd love to understand
more about them.

> Yes, passing a token as a new auth method won't make much sense in
> isolation. However:
> 1. Since OAUTHBEARER is supported in the ecosystem, passing a token as
> a way to authenticate with OAUTHBEARER is more consistent (IMO) than
> passing it as a password.

Agreed. It's probably not a very strong argument for the new mechanism,
though, especially if you're not using the most expensive code inside it.

> 2. Validation on the backend side doesn't depend on whether the token
> is obtained by libpq or transparently passed by the upstream client.

Sure.

> 3. Single OAUTH auth method on the server side for both scenarios,
> would allow both enterprise clients with their own Token acquisition
> and community clients using libpq flows to connect as the same PG
> users/roles.

Okay, this is a stronger argument. With that in mind, I want to revisit
your examples and maybe provide some counterproposals:

>> Libpq passing the token directly from an upstream client is useful in other scenarios:
>> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for AAD. Those can also support more advanced provider-specific token acquisition flows.

I can see that providing a token directly would help you work around
limitations in libpq's "standard" OAuth flows, whether we use iddawc or
not. And it's cheap in terms of implementation. But I have a feeling it
would fall apart rapidly with error cases, where the server is giving
libpq information via the OAUTHBEARER mechanism, but libpq can only
communicate to your wrapper through human-readable error messages on stderr.

This seems like clear motivation for client-side SASL plugins (which
were also discussed on Samay's proposal thread). That's a lot more
expensive to implement in libpq, but if it were hypothetically
available, wouldn't you rather your provider-specific code be able to
speak OAUTHBEARER directly with the server?

>> 2. Resource-tight (like IoT) clients. Those can be compiled without the optional libpq flag, not including the iddawc or other dependency.

I want to dig into this much more; resource-constrained systems are near
and dear to me. I can see two cases here:

Case 1: The device is an IoT client that wants to connect on its own
behalf. Why would you want to use OAuth in that case? And how would the
IoT device get its Bearer token to begin with? I'm much more used to
architectures that provision high-entropy secrets for this, whether
they're incredibly long passwords per device (in which case,
channel-bound SCRAM should be a fairly strong choice?) or client certs
(which can be better decentralized, but make for a lot of bookkeeping).

If the answer to that is, "we want an IoT client to be able to connect
using the same role as a person", then I think that illustrates a clear
need for SASL negotiation. That would let the IoT client choose
SCRAM-*-PLUS or EXTERNAL, and the person at the keyboard can choose
OAUTHBEARER. Then we have incredible flexibility, because you don't have
to engineer one mechanism to handle them all.

Case 2: The constrained device is being used as a jump point. So there's
an actual person at a keyboard, trying to get into a backend server
(maybe behind a firewall layer, etc.), and the middlebox is either not
web-connected or is incredibly tiny for some reason. That might be a
good use case for a copy-pasted Bearer token, but is there actual demand
for that use case? What motivation would you (or your end user) have for
choosing a fairly heavy, web-centric authentication method in such a
constrained environment?

Are there other resource-constrained use cases I've missed?

Thanks,
--Jacob

[1]
https://www.postgresql.org/message-id/MN0PR21MB31694BAC193ECE1807FD45358F4F9%40MN0PR21MB3169.namprd21.prod.outlook.com




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Mar 25, 2022 at 5:00 PM Jacob Champion <pchampion@vmware.com> wrote:
> v4 rebases over the latest version of the pluggable auth patchset
> (included as 0001-4). Note that there's a recent conflict as
> of d4781d887; use an older commit as the base (or wait for the other
> thread to be updated).

Here's a newly rebased v5. (They're all zipped now, which I probably
should have done a while back, sorry.)

- As before, 0001-4 are the pluggable auth set; they've now diverged
from the official version over on the other thread [1].
- I'm not sure that 0005 is still completely coherent after the
rebase, given the recent changes to jsonapi.c. But for now, the tests
are green, and that should be enough to keep the conversation going.
- 0008 will hopefully be obsoleted when the SYSTEM_USER proposal [2] lands.

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/CAJxrbyxgFzfqby%2BVRCkeAhJnwVZE50%2BZLPx0JT2TDg9LbZtkCg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/7e692b8c-0b11-45db-1cad-3afc5b57409f@amazon.com

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
>>> Libpq passing a token directly from an upstream client is useful in other scenarios:
>>> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for
AAD. Those can also support more advanced provider-specific token acquisition flows.

> I can see that providing a token directly would help you work around
> limitations in libpq's "standard" OAuth flows, whether we use iddawc or
> not. And it's cheap in terms of implementation. But I have a feeling it
> would fall apart rapidly with error cases, where the server is giving
> libpq information via the OAUTHBEARER mechanism, but libpq can only
> communicate to your wrapper through human-readable error messages on stderr.

For providing the token directly: that would primarily be used in
scenarios where the same party controls both the server and the
client-side wrapper.
I.e. the client knows how to get a token for a particular principal
and doesn't need any additional information beyond human-readable
messages.
Please clarify the scenarios where you see this falling apart.

I can provide an example from the cloud world. We (Azure), as well as
other providers, offer ways to obtain OAUTH tokens for
service-to-service communication at the IaaS / PaaS level.
On Azure, the "Managed Identity" feature integrated into Compute VMs
allows a client to make a local HTTP call to get a token. The VM itself
manages the certificate lifecycle and implements the corresponding
OAUTH flow.
This capability is used by both our 1st-party PaaS offerings and
3rd-party services deploying on VMs or managed K8s clusters.
Here, the client doesn't need libpq's assistance in obtaining the token.
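The local HTTP call described above can be sketched as follows. The IMDS endpoint address and `api-version` are as documented for Azure's Instance Metadata Service; the resource URI is a placeholder, and this is an illustration of the shape of the request rather than production code:

```python
# Sketch of a managed-identity token request from inside an Azure VM.
# The client only builds and sends a local HTTP GET; the platform, not
# the client, handles credentials and the underlying OAUTH flow.
from urllib.parse import urlencode

IMDS_BASE = "http://169.254.169.254/metadata/identity/oauth2/token"

def imds_token_request(resource: str, api_version: str = "2018-02-01"):
    """Return the (url, headers) pair for the token request. Performing
    the GET (e.g. with urllib.request) yields JSON containing an
    'access_token' field, usable directly as the Bearer token."""
    query = urlencode({"api-version": api_version, "resource": resource})
    return f"{IMDS_BASE}?{query}", {"Metadata": "true"}
```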

> This seems like clear motivation for client-side SASL plugins (which
> were also discussed on Samay's proposal thread). That's a lot more
> expensive to implement in libpq, but if it were hypothetically
> available, wouldn't you rather your provider-specific code be able to
> speak OAUTHBEARER directly with the server?

I generally agree that pluggable auth layers in libpq could be
beneficial. However, as you pointed out in Samay's thread, that would
require a new distribution model for libpq / clients to optionally
include provider-specific logic.

My optimistic plan here would be to implement several core OAUTH flows
in libpq core which would be generic enough to support major
enterprise OAUTH providers:
1. Client Credentials flow (Client_id + Client_secret) for backend applications.
2. Authorization Code Flow with PKCE and/or Device code flow for GUI
applications.
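Flow (1) is the simplest of these: a single POST to the provider's token endpoint, per the Client Credentials grant in RFC 6749. A minimal sketch of building that request (endpoint and credentials are placeholders):

```python
# Sketch of the Client Credentials grant (RFC 6749, section 4.4), the
# backend-application flow. The response JSON carries 'access_token',
# which libpq would then present to the server via OAUTHBEARER.
from urllib.parse import urlencode

def client_credentials_request(token_endpoint, client_id, client_secret, scope):
    """Build the token request as (url, headers, form-encoded body)."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return token_endpoint, headers, body
```

Flow (2) needs more round trips and user interaction, which is why it raises the libpq-to-client protocol questions discussed below.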

(2.) above would require a protocol between libpq and upstream clients
to exchange several messages.
Your patch includes a way for libpq to deliver a message to the client
about the next authentication steps, so we planned to build on top of
that.

A little about the scenarios we are looking at.
What we're trying to achieve here is an easy integration path for
multiple players in the ecosystem:
- Managed PaaS Postgres providers (both us and multi-cloud solutions)
- SaaS providers deploying Postgres on IaaS/PaaS providers' clouds
- Tools - pgAdmin, psql and others
- BI, ETL, federation and other scenarios where Postgres is used as
the data source

If we can offer a provider-agnostic solution for the Backend <=> libpq <=>
Upstream client path, we can have all the players above build support for
OAUTH credentials managed by the cloud provider of their choice.

For us, that would mean:
- Better administrator experience with pg_admin / psql handling of the
AAD (Azure Active Directory) authentication flows.
- Path for integration solutions using Postgres to build AAD
authentication in their management experience.
- Ability to use AAD identity provider for any Postgres deployments
other than our 1st party PaaS offering.
- Ability to offer github as the identity provider for PaaS Postgres offering.

Other players in the ecosystem above would be able to get the same benefits.

Does that make sense and possible without provider specific libpq plugin?

-------------------------
On resource-constrained scenarios:
> I want to dig into this much more; resource-constrained systems are near
> and dear to me. I can see two cases here:

I just referred to the ability to compile libpq without extra
dependencies to save some kilobytes.
I'm not sure OAUTH is widely used in those cases. It involves overhead
anyway, and requires the device to talk to an additional party (the
OAUTH provider).
Cert authentication is likely easier.
If needed, such a device can get libpq with full OAUTH support and use
custom client code. But I hadn't thought about this scenario.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Sep 26, 2022 at 6:39 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> For the providing token directly, that would be primarily used for
> scenarios where the same party controls both the server and the client
> side wrapper.
> I.e. The client knows how to get a token for a particular principal
> and doesn't need any additional information other than human readable
> messages.
> Please clarify the scenarios where you see this falling apart.

The most concrete example I can see is with the OAUTHBEARER error
response. If you want to eventually handle differing scopes per role,
or different error statuses (which the proof-of-concept currently
hardcodes as `invalid_token`), then the client can't assume it knows
what the server is going to say there. I think that's true even if you
control both sides and are hardcoding the provider.

How should we communicate those pieces to a custom client when it's
passing a token directly? The easiest way I can see is for the custom
client to speak the OAUTHBEARER protocol directly (e.g. SASL plugin).
If you had to parse the libpq error message, I don't think that'd be
particularly maintainable.

> I can provide an example in the cloud world. We (Azure) as well as
> other providers offer ways to obtain OAUTH tokens for
> Service-to-Service communication at IAAS / PAAS level.
> on Azure "Managed Identity" feature integrated in Compute VM allows a
> client to make a local http call to get a token. VM itself manages the
> certificate livecycle, as well as implements the corresponding OAUTH
> flow.
> This capability is used by both our 1st party PAAS offerings, as well
> as 3rd party services deploying on VMs or managed K8S clusters.
> Here, the client doesn't need libpq assistance in obtaining the token.

Cool. To me that's the strongest argument yet for directly providing
tokens to libpq.

> My optimistic plan here would be to implement several core OAUTH flows
> in libpq core which would be generic enough to support major
> enterprise OAUTH providers:
> 1. Client Credentials flow (Client_id + Client_secret) for backend applications.
> 2. Authorization Code Flow with PKCE and/or Device code flow for GUI
> applications.

As long as it's clear to DBAs when to use which flow (because existing
documentation for that is hit-and-miss), I think it's reasonable to
eventually support multiple flows. Personally my preference would be
to start with one or two core flows, and expand outward once we're
sure that we do those perfectly. Otherwise the explosion of knobs and
buttons might be overwhelming, both to users and devs.

Related to the question of flows is the client implementation library.
I've mentioned that I don't think iddawc is production-ready. As far
as I'm aware, there is only one certified OpenID relying party written
in C, and that's... an Apache server plugin. That leaves us either
choosing an untested library, scouring the web for a "tested" library
(and hoping we're right in our assessment), or implementing our own
(which is going to tamp down enthusiasm for supporting many flows,
though that has its own set of benefits). If you know of any reliable
implementations with a C API, please let me know.

> (2.) above would require a protocol between libpq and upstream clients
> to exchange several messages.
> Your patch includes a way for libpq to deliver to the client a message
> about the next authentication steps, so planned to build on top of
> that.

Specifically it delivers that message to an end user. If you want a
generic machine client to be able to use that, then we'll need to talk
about how.

> A little about scenarios, we look at.
> What we're trying to achieve here is an easy integration path for
> multiple players in the ecosystem:
> - Managed PaaS Postgres providers (both us and multi-cloud solutions)
> - SaaS providers deploying postgres on IaaS/PaaS providers' clouds
> - Tools - pg_admin, psql and other ones.
> - BI, ETL, Federation and other scenarios where postgres is used as
> the data source.
>
> If we can offer a provider agnostic solution for Backend <=> libpq <=>
> Upstreal client path, we can have all players above build support for
> OAUTH credentials, managed by the cloud provider of their choice.

Well... I don't quite understand why we'd go to the trouble of
providing a provider-agnostic communication solution only to have
everyone write their own provider-specific client support. Unless
you're saying Microsoft would provide an officially blessed plugin for
the *server* side only, and Google would provide one of their own, and
so on.

The server side authorization is the only place where I think it makes
sense to specialize by default. libpq should remain agnostic, with the
understanding that we'll need to make hard decisions when a major
provider decides not to follow a spec.

> For us, that would mean:
> - Better administrator experience with pg_admin / psql handling of the
> AAD (Azure Active Directory) authentication flows.
> - Path for integration solutions using Postgres to build AAD
> authentication in their management experience.
> - Ability to use AAD identity provider for any Postgres deployments
> other than our 1st party PaaS offering.
> - Ability to offer github as the identity provider for PaaS Postgres offering.

GitHub is unfortunately a bit tricky, unless they've started
supporting OpenID recently?

> Other players in the ecosystem above would be able to get the same benefits.
>
> Does that make sense and possible without provider specific libpq plugin?

If the players involved implement the flows and follow the specs, yes.
That's a big "if", unfortunately. I think GitHub and Google are two
major players who are currently doing things their own way.

> I just referred to the ability to compile libpq without extra
> dependencies to save some kilobytes.
> Not sure if OAUTH is widely used in those cases. It involves overhead
> anyway, and requires the device to talk to an additional party (OAUTH
> provider).
> Likely Cert authentication is easier.
> If needed, it can get libpq with full OAUTH support and use a client
> code. But I didn't think about this scenario.

Makes sense. Thanks!

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> The most concrete example I can see is with the OAUTHBEARER error
> response. If you want to eventually handle differing scopes per role,
> or different error statuses (which the proof-of-concept currently
> hardcodes as `invalid_token`), then the client can't assume it knows
> what the server is going to say there. I think that's true even if you
> control both sides and are hardcoding the provider.

Ok, I see the point. It's related to the topic of communication
between libpq and the upstream client.


> How should we communicate those pieces to a custom client when it's
> passing a token directly? The easiest way I can see is for the custom
> client to speak the OAUTHBEARER protocol directly (e.g. SASL plugin).
> If you had to parse the libpq error message, I don't think that'd be
> particularly maintainable.

I agree that parsing the message is not a sustainable way.
Could you provide more details on the SASL plugin approach you propose?

Specifically, is this basically a set of extension hooks for the client side?
With the need for the client to be compiled with the plugins based on
the set of providers it needs.


> Well... I don't quite understand why we'd go to the trouble of
> providing a provider-agnostic communication solution only to have
> everyone write their own provider-specific client support. Unless
> you're saying Microsoft would provide an officially blessed plugin for
> the *server* side only, and Google would provide one of their own, and
> so on.

Yes, via extensions. Identity providers can open source extensions to
use their auth services outside of first party PaaS offerings.
For 3rd party Postgres PaaS or on premise deployments.


> The server side authorization is the only place where I think it makes
> sense to specialize by default. libpq should remain agnostic, with the
> understanding that we'll need to make hard decisions when a major
> provider decides not to follow a spec.

Completely agree with an agnostic libpq. Though this needs validation with
several major providers to know if it is possible.


> Specifically it delivers that message to an end user. If you want a
> generic machine client to be able to use that, then we'll need to talk
> about how.

Yes, that's what needs to be decided.
In both the Device code and Authorization code scenarios, libpq and the
client would need to exchange a couple of pieces of metadata.
Plus, after success, the client should be able to access a refresh token
for further use.

Can we implement a generic protocol for this between libpq and the clients?

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Sep 30, 2022 at 7:47 AM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> > How should we communicate those pieces to a custom client when it's
> > passing a token directly? The easiest way I can see is for the custom
> > client to speak the OAUTHBEARER protocol directly (e.g. SASL plugin).
> > If you had to parse the libpq error message, I don't think that'd be
> > particularly maintainable.
>
> I agree that parsing the message is not a sustainable way.
> Could you provide more details on the SASL plugin approach you propose?
>
> Specifically, is this basically a set of extension hooks for the client side?
> With the need for the client to be compiled with the plugins based on
> the set of providers it needs.

That's a good question. I can see two broad approaches, with maybe
some ability to combine them into a hybrid:

1. If there turns out to be serious interest in having libpq itself
handle OAuth natively (with all of the web-facing code that implies,
and all of the questions still left to answer), then we might be able
to provide a "token hook" in the same way that we currently provide a
passphrase hook for OpenSSL keys. By default, libpq would use its
internal machinery to take the provider details, navigate its builtin
flow, and return the Bearer token. If you wanted to override that
behavior as a client, you could replace the builtin flow with your
own, by registering a set of callbacks.
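A hypothetical shape for that token hook, to make the idea concrete. This is not an actual libpq API; the names and the callback signature are invented for illustration only:

```python
# Hypothetical sketch of approach (1): libpq runs its builtin flow
# unless the application has registered a callback that returns the
# Bearer token itself. None of these names are real libpq symbols.

_token_hook = None

def register_token_hook(hook):
    """Application override: hook(provider_details) -> bearer token."""
    global _token_hook
    _token_hook = hook

def builtin_flow(provider_details):
    # Stand-in for libpq's builtin machinery (device code flow, etc.).
    raise NotImplementedError("builtin flow not sketched here")

def acquire_bearer_token(provider_details):
    if _token_hook is not None:
        return _token_hook(provider_details)
    return builtin_flow(provider_details)

# e.g. a cloud-specific client plugs in its own acquisition logic:
register_token_hook(lambda details: "token-from-managed-identity")
```

This mirrors the existing OpenSSL passphrase hook pattern: one well-defined override point, with the default behavior left intact when no hook is registered.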

2. Alternatively, OAuth support could be provided via a mechanism
plugin for some third-party SASL library (GNU libgsasl, Cyrus
libsasl2). We could provide an OAuth plugin in contrib that handles
the default flow. Other providers could publish their alternative
plugins to completely replace the OAUTHBEARER mechanism handling.

Approach (2) would make for some duplicated effort since every
provider has to write code to speak the OAUTHBEARER protocol. It might
simplify provider-specific distribution, since (at least for Cyrus) I
think you could build a single plugin that supports both the client
and server side. But it would be a lot easier to unknowingly (or
knowingly) break the spec, since you'd control both the client and
server sides. There would be less incentive to interoperate.

Finally, we could potentially take pieces from both, by having an
official OAuth mechanism plugin that provides a client-side hook to
override the flow. I have no idea if the benefits would offset the
costs of a plugin-for-a-plugin style architecture. And providers would
still be free to ignore it and just provide a full mechanism plugin
anyway.

> > Well... I don't quite understand why we'd go to the trouble of
> > providing a provider-agnostic communication solution only to have
> > everyone write their own provider-specific client support. Unless
> > you're saying Microsoft would provide an officially blessed plugin for
> > the *server* side only, and Google would provide one of their own, and
> > so on.
>
> Yes, via extensions. Identity providers can open source extensions to
> use their auth services outside of first party PaaS offerings.
> For 3rd party Postgres PaaS or on premise deployments.

Sounds reasonable.

> > The server side authorization is the only place where I think it makes
> > sense to specialize by default. libpq should remain agnostic, with the
> > understanding that we'll need to make hard decisions when a major
> > provider decides not to follow a spec.
>
> Completely agree with an agnostic libpq. Though this needs validation with
> several major providers to know if it is possible.

Agreed.

> > Specifically it delivers that message to an end user. If you want a
> > generic machine client to be able to use that, then we'll need to talk
> > about how.
>
> Yes, that's what needs to be decided.
> In both Device code and Authorization code scenarios, libpq and the
> client would need to exchange a couple of pieces of metadata.
> Plus, after success, the client should be able to access a refresh token for further use.
>
> Can we implement a generic protocol for this between libpq and the clients?

I think we can probably prototype a callback hook for approach (1)
pretty quickly. (2) is a lot more work and investigation, but it's
work that I'm interested in doing (when I get the time). I think there
are other very good reasons to consider a third-party SASL library,
and some good lessons to be learned, even if the community decides not
to go down that road.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> I think we can probably prototype a callback hook for approach (1)
> pretty quickly. (2) is a lot more work and investigation, but it's
> work that I'm interested in doing (when I get the time). I think there
> are other very good reasons to consider a third-party SASL library,
> and some good lessons to be learned, even if the community decides not
> to go down that road.

Makes sense. We will work on (1.) and do some checking for any
blockers to a shared solution supporting GitHub and Google.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi,


We validated libpq handling OAuth natively, with different flows
against different OIDC-certified providers.

Flows: Device Code, Client Credentials and Refresh Token.
Providers: Microsoft, Google and Okta.
We also validated with the OAuth provider GitHub.

We propose using OpenID Connect (OIDC) as the protocol, instead of
plain OAuth, because it offers:
- A discovery mechanism to bridge provider differences and supply metadata.
- A stricter protocol and certification process, to reliably identify
which providers can be supported.
- A design aimed at authentication, while the main purpose of OAUTH is
to authorize applications on behalf of the user.

GitHub is not OIDC-certified, so it won't be supported under this proposal.
However, it may be supported in the future through the ability for an
extension to provide custom discovery document content.

OpenID Connect defines a well-known discovery mechanism: the provider
publishes its configuration at a standard URI, which allows libpq to
fetch metadata about the provider (i.e. endpoints, supported grants,
response types, etc.).
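Per OpenID Connect Discovery 1.0, that standard URI is the issuer URL with `/.well-known/openid-configuration` appended. A small sketch of the discovery step (issuer URL is a placeholder):

```python
# Sketch of OIDC discovery: derive the well-known metadata URL from the
# issuer, then fetch it to learn the provider's endpoints. The issuer
# below is a placeholder.

def discovery_url(issuer: str) -> str:
    """Well-known provider-configuration URI (OIDC Discovery 1.0)."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"

url = discovery_url("https://issuer.example.org")
# Fetching this URL (e.g. json.load(urllib.request.urlopen(url)))
# returns a JSON document with keys such as token_endpoint,
# device_authorization_endpoint and grant_types_supported.
```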

In the attached patch (which is based on the V2 patch in the thread and
does not contain Samay's changes):
- The provider can configure the issuer URL and scope through the options hook.
- The server passes an open discovery URL and scope on to libpq.
- Libpq handles the OAuth flow based on the flow_type sent in the
connection string [1].
- Added callbacks to notify client tools, via a structure, if the OAuth
flow requires user interaction.
- The PG backend uses hooks to validate the bearer token.

Note that the authorization code flow with PKCE for GUI clients is not
implemented yet.
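For reference, the PKCE part of that flow is specified in RFC 7636: the client keeps a random `code_verifier` secret, sends its S256 `code_challenge` with the authorization request, and later proves possession of the verifier when redeeming the authorization code. A minimal sketch:

```python
# Sketch of PKCE (RFC 7636) verifier/challenge generation with the S256
# method: challenge = BASE64URL(SHA256(verifier)), without padding.
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Return (code_verifier, code_challenge) for one authorization."""
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```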

Proposed next steps:
- Broaden the discussion to reach agreement on the approach.
- Implement the libpq changes without iddawc.
- Prototype the GUI flow with pgAdmin.

Thanks,
Mahendrakar.

[1]:
Connection string for the refresh token flow:
./psql -U <user> -d 'dbname=postgres oauth_client_id=<client_id>
oauth_flow_type=<flowtype> oauth_refresh_token=<refresh token>'


Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 11/23/22 01:58, mahendrakar s wrote:
> We validated libpq handling OAuth natively with different flows and
> different OIDC-certified providers.
> 
> Flows: Device Code, Client Credentials and Refresh Token.
> Providers: Microsoft, Google and Okta.

Great, thank you!

> Also validated with OAuth provider Github.

(How did you get discovery working? I tried this and had to give up
eventually.)

> We propose using OpenID Connect (OIDC) as the protocol, instead of
> OAuth, as it is:
> - Discovery mechanism to bridge the differences and provide metadata.
> - Stricter protocol and certification process to reliably identify
> which providers can be supported.
> - OIDC is designed for authentication, while the main purpose of OAUTH is to
> authorize applications on behalf of the user.

How does this differ from the previous proposal? The OAUTHBEARER SASL
mechanism already relies on OIDC for discovery. (I think that decision
is confusing from an architectural and naming standpoint, but I don't
think they really had an alternative...)

> Github is not OIDC certified, so won’t be supported with this proposal.
> However, it may be supported in the future through the ability for the
> extension to provide custom discovery document content.

Right.

> OpenID configuration has a well-known discovery mechanism
> for the provider configuration URI, which is
> defined in OpenID Connect. It allows libpq to fetch
> metadata about the provider (i.e., endpoints, supported grants, response types, etc.).

Sure, but this is already how the original PoC works. The test suite
implements an OIDC provider, for instance. Is there something different
to this that I'm missing?
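To make the discovery mechanism under discussion concrete, here is a rough Python sketch (Python only because the PoC's test suite uses it; a real implementation would live in libpq's C code, and the issuer URL below is made up):

```python
import json

def discovery_url(issuer: str) -> str:
    """Build the OIDC discovery URL for an issuer, per OpenID Connect Discovery 1.0."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"

def parse_provider_metadata(document: str) -> dict:
    """Extract the pieces of a discovery document a client like libpq would need."""
    meta = json.loads(document)
    return {
        "issuer": meta["issuer"],
        "token_endpoint": meta["token_endpoint"],
        # Present only for providers supporting the Device Authorization grant.
        "device_authorization_endpoint": meta.get("device_authorization_endpoint"),
        "grant_types_supported": meta.get("grant_types_supported", []),
    }
```

The document itself would be fetched over HTTPS from the well-known URI; the sketch only covers URL construction and parsing.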

> In the attached patch (based on the V2 patch in the thread; does not
> contain Samay's changes):
> - Provider can configure issuer url and scope through the options hook.
> - Server passes on an open discovery url and scope to libpq.
> - Libpq handles OAuth flow based on the flow_type sent in the
> connection string [1].
> - Added callbacks to notify a structure to client tools if OAuth flow
> requires user interaction.
> - Pg backend uses hooks to validate bearer token.

Thank you for the sample!

> Note that authentication code flow with PKCE for GUI clients is not
> implemented yet.
> 
> Proposed next steps:
> - Broaden discussion to reach agreement on the approach.

High-level thoughts on this particular patch (I assume you're not
looking for low-level implementation comments yet):

0) The original hook proposal upthread, I thought, was about allowing
libpq's flow implementation to be switched out by the application. I
don't see that approach taken here. It's fine if that turned out to be a
bad idea, of course, but this patch doesn't seem to match what we were
talking about.

1) I'm really concerned about the sudden explosion of flows. We went
from one flow (Device Authorization) to six. It's going to be hard
enough to validate that *one* flow is useful and can be securely
deployed by end users; I don't think we're going to be able to maintain
six, especially in combination with my statement that iddawc is not an
appropriate dependency for us.

I'd much rather give applications the ability to use their own OAuth
code, and then maintain within libpq only the flows that are broadly
useful. This ties back to (0) above.

2) Breaking the refresh token into its own pseudoflow is, I think,
passing the buck onto the user for something that's incredibly security
sensitive. The refresh token is powerful; I don't really want it to be
printed anywhere, let alone copy-pasted by the user. Imagine the
phishing opportunities.

If we want to support refresh tokens, I believe we should be developing
a plan to cache and secure them within the client. They should be used
as an accelerator for other flows, not as their own flow.
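To sketch what "accelerator for other flows" could mean in practice, here is a minimal Python illustration (all names are made up; a real client would also need to store the refresh token in a platform-appropriate secure credential store):

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class TokenCache:
    access_token: Optional[str] = None
    refresh_token: Optional[str] = None
    expires_at: float = 0.0

def get_access_token(cache, run_interactive_flow, redeem_refresh_token, now=time.time):
    # Cached access token still valid: no network round trip at all.
    if cache.access_token and now() < cache.expires_at:
        return cache.access_token
    # Cached refresh token: try to redeem it silently before bothering the user.
    if cache.refresh_token:
        result = redeem_refresh_token(cache.refresh_token)
        if result is not None:
            cache.access_token, cache.refresh_token, lifetime = result
            cache.expires_at = now() + lifetime
            return cache.access_token
    # Fall back to a full interactive flow (device code, authorization code, ...).
    cache.access_token, cache.refresh_token, lifetime = run_interactive_flow()
    cache.expires_at = now() + lifetime
    return cache.access_token
```

The point is that the refresh token never crosses the user's eyes: it only shortens subsequent runs of whichever primary flow was used.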

3) I don't like the departure from the OAUTHBEARER mechanism that's
presented here. For one, since I can't see a sample plugin that makes
use of the "flow type" magic numbers that have been added, I don't
really understand why the extension to the mechanism is necessary.

For two, if we think OAUTHBEARER is insufficient, the people who wrote
it would probably like to hear about it. Claiming support for a spec,
and then implementing an extension without review from the people who
wrote the spec, is not something I'm personally interested in doing.

4) The test suite is still broken, so it's difficult to see these things
in practice for review purposes.

> - Implement libpq changes without iddawc

This in particular will be much easier with a functioning test suite,
and with a smaller number of flows.

> - Prototype GUI flow with pgAdmin

Cool!

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> How does this differ from the previous proposal? The OAUTHBEARER SASL
> mechanism already relies on OIDC for discovery. (I think that decision
> is confusing from an architectural and naming standpoint, but I don't
> think they really had an alternative...)
Mostly terminology questions here. OAUTHBEARER SASL appears to be the
spec about using OAuth2 tokens for authentication.
While any OAuth2 provider can generally work, we propose to
specifically highlight that only OIDC providers can be supported, as
we need the discovery document.
And we won't be able to support Github under that requirement.
Since the original patch relied on discovery too, there's no change
here, just confirmation that we need OIDC compliance.

> 0) The original hook proposal upthread, I thought, was about allowing
> libpq's flow implementation to be switched out by the application. I
> don't see that approach taken here. It's fine if that turned out to be a
> bad idea, of course, but this patch doesn't seem to match what we were
> talking about.
We still plan to allow the client to pass the token. Which is a
generic way to implement its own OAUTH flows.

> 1) I'm really concerned about the sudden explosion of flows. We went
> from one flow (Device Authorization) to six. It's going to be hard
> enough to validate that *one* flow is useful and can be securely
> deployed by end users; I don't think we're going to be able to maintain
> six, especially in combination with my statement that iddawc is not an
> appropriate dependency for us.

> I'd much rather give applications the ability to use their own OAuth
> code, and then maintain within libpq only the flows that are broadly
> useful. This ties back to (0) above.
We consider the following set of flows to be the minimum required:
- Client Credentials - for service-to-service scenarios.
- Authorization Code with PKCE - for rich clients, including pgAdmin.
- Device Code - for psql (and possibly other non-GUI clients).
- Refresh Token (separate discussion)
Which is pretty much the list described here:
https://oauth.net/2/grant-types/ and in the OAuth2 specs.
Client Credentials is very simple, and so is Refresh Token.
If you prefer to pick one of the richer flows, Authorization Code is
probably much more widely used for GUI scenarios.
Plus it's easier to implement, as the interaction goes through a
series of callbacks. No polling required.
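The grant types listed above differ mainly in the parameters a client sends to the provider's token endpoint. A rough sketch of those shapes, following RFC 6749 (Client Credentials, Authorization Code, Refresh Token), RFC 7636 (PKCE), and RFC 8628 (Device Authorization); field values here are caller-supplied placeholders:

```python
def token_request_params(grant_type: str, **kw) -> dict:
    """Build token-endpoint parameters for the grant types under discussion."""
    if grant_type == "client_credentials":
        return {"grant_type": grant_type,
                "client_id": kw["client_id"], "client_secret": kw["client_secret"]}
    if grant_type == "authorization_code":
        return {"grant_type": grant_type, "code": kw["code"],
                "client_id": kw["client_id"], "redirect_uri": kw["redirect_uri"],
                "code_verifier": kw["code_verifier"]}  # PKCE (RFC 7636)
    if grant_type == "urn:ietf:params:oauth:grant-type:device_code":
        return {"grant_type": grant_type, "device_code": kw["device_code"],
                "client_id": kw["client_id"]}
    if grant_type == "refresh_token":
        return {"grant_type": grant_type, "refresh_token": kw["refresh_token"],
                "client_id": kw["client_id"]}
    raise ValueError(f"unsupported grant_type: {grant_type}")
```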

> 2) Breaking the refresh token into its own pseudoflow is, I think,
> passing the buck onto the user for something that's incredibly security
> sensitive. The refresh token is powerful; I don't really want it to be
> printed anywhere, let alone copy-pasted by the user. Imagine the
> phishing opportunities.

> If we want to support refresh tokens, I believe we should be developing
> a plan to cache and secure them within the client. They should be used
> as an accelerator for other flows, not as their own flow.
It's considered a separate "grant_type" in the specs / APIs.
https://openid.net/specs/openid-connect-core-1_0.html#RefreshTokens

For the clients, it would mean storing the token and using it to authenticate.
On the question of sensitivity: secure credential stores are
different for each platform, with a lot of cloud offerings for this.
pgAdmin, for example, has its own way to secure credentials to avoid
asking users for passwords every time the app is opened.
I believe we should delegate refresh token management to the clients.

> 3) I don't like the departure from the OAUTHBEARER mechanism that's
> presented here. For one, since I can't see a sample plugin that makes
> use of the "flow type" magic numbers that have been added, I don't
> really understand why the extension to the mechanism is necessary.
I don't think it's much of a departure, but rather a separation of
responsibilities between libpq and upstream clients.
Since libpq can be used in different apps, clients will need
different types of flows/grants.
I.e., those need to be provided to libpq at connection initialization
or some other point.
We will change to "grant_type", though, and use a string to be closer
to the spec.
What do you think is the best way for the client to signal which OAuth
flow should be used?

On Wed, Nov 23, 2022 at 12:05 PM Jacob Champion <jchampion@timescale.com> wrote:



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi Jacob,

I validated Github by skipping the discovery mechanism and letting
the provider extension pass on the endpoints. This was just for
validation purposes.
If it needs to be supported, then we need a way to send the discovery
document from the extension.


Thanks,
Mahendrakar.

On Thu, 24 Nov 2022 at 09:16, Andrey Chudnovsky <achudnovskij@gmail.com> wrote:



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 11/23/22 19:45, Andrey Chudnovsky wrote:
> Mostly terminology questions here. OAUTHBEARER SASL appears to be the
> spec about using OAUTH2 tokens for Authentication.
> While any OAUTH2 can generally work, we propose to specifically
> highlight that only OIDC providers can be supported, as we need the
> discovery document.

*If* you're using in-band discovery, yes. But I thought your use case
was explicitly tailored to out-of-band token retrieval:

> The client knows how to get a token for a particular principal
> and doesn't need any additional information other than human readable
> messages.

In that case, isn't OAuth sufficient? There's definitely a need to
document the distinction, but I don't think we have to require OIDC as
long as the client application makes up for the missing information.
(OAUTHBEARER makes the openid-configuration error member optional,
presumably for this reason.)
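As a sketch of why the optional member matters: the OAUTHBEARER failure response is a JSON document (RFC 7628 §3.2.2), and a client with out-of-band provider knowledge can proceed even when `openid-configuration` is absent. Field names follow the RFC; the parsing here is illustrative, not libpq code:

```python
import json

def parse_oauthbearer_error(challenge: str) -> dict:
    """Parse the JSON error result an OAUTHBEARER server sends after a
    failed attempt. Only 'status' is required; the discovery URL and
    scope are optional hints for clients that need them."""
    err = json.loads(challenge)
    return {
        "status": err.get("status"),                       # e.g. "invalid_token"
        "scope": err.get("scope"),                         # scope needed for access
        "discovery_url": err.get("openid-configuration"),  # may be absent
    }
```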

>> 0) The original hook proposal upthread, I thought, was about allowing
>> libpq's flow implementation to be switched out by the application. I
>> don't see that approach taken here. It's fine if that turned out to be a
>> bad idea, of course, but this patch doesn't seem to match what we were
>> talking about.
> We still plan to allow the client to pass the token. Which is a
> generic way to implement its own OAUTH flows.

Okay. But why push down the implementation into the server?

To illustrate what I mean, here's the architecture of my proposed patchset:

  +-------+                                          +----------+
  |       | -------------- Empty Token ------------> |          |
  | libpq | <----- Error Result (w/ Discovery ) ---- |          |
  |       |                                          |          |
  | +--------+                     +--------------+  |          |
  | | iddawc | <--- [ Flow ] ----> | Issuer/      |  | Postgres |
  | |        | <-- Access Token -- | Authz Server |  |          |
  | +--------+                     +--------------+  |   +-----------+
  |       |                                          |   |           |
  |       | -------------- Access Token -----------> | > | Validator |
  |       | <---- Authorization Success/Failure ---- | < |           |
  |       |                                          |   +-----------+
  +-------+                                          +----------+

In this implementation, there's only one black box: the validator, which
is responsible for taking an access token from an untrusted client,
verifying that it was issued correctly for the Postgres service, and
either 1) determining whether the bearer is authorized to access the
database, or 2) determining the authenticated ID of the bearer so that
the HBA can decide whether they're authorized. (Or both.)
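The validator contract described above can be sketched roughly as follows (Python for brevity; the actual hook would be server-side C, and the claim names `roles` and `sub` are illustrative, as token formats are provider-specific):

```python
from typing import Callable, Optional, Tuple

def validate_bearer_token(token: str, role: str,
                          verify: Callable[[str], Optional[dict]]) -> Tuple[bool, Optional[str]]:
    # 'verify' is the provider-specific piece: it must check the token's
    # signature (or introspect it) and confirm it was issued for this server.
    claims = verify(token)
    if claims is None:
        return (False, None)  # token not valid for this service
    # Authorization decision: is the requested role among the token's roles?
    authorized = role in claims.get("roles", [])
    # Authenticated identity, for pg_ident mapping and later audits.
    return (authorized, claims.get("sub"))
```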

This approach is limited by the flows that we explicitly enable within
libpq and its OAuth implementation library. You mentioned that you
wanted to support other flows, including clients with out-of-band
knowledge, and I suggested:

> If you wanted to override [iddawc's]
> behavior as a client, you could replace the builtin flow with your
> own, by registering a set of callbacks.

In other words, the hooks would replace iddawc in the above diagram.
In my mind, something like this:

     +-------+                                       +----------+
  +------+   | ----------- Empty Token ------------> | Postgres |
  |      | < | <---------- Error Result ------------ |          |
  | Hook |   |                                       |   +-----------+
  |      |   |                                       |   |           |
  +------+ > | ------------ Access Token ----------> | > | Validator |
     |       | <--- Authorization Success/Failure -- | < |           |
     | libpq |                                       |   +-----------+
     +-------+                                       +----------+

Now there's a second black box -- the client hook -- which takes an
OAUTHBEARER error result (which may or may not have OIDC discovery
information) and returns the access token. How it does this is
unspecified -- it'll probably use some OAuth 2.0 flow, but maybe not.
Maybe it sends the user to a web browser; maybe it uses some of the
magic provider-specific libraries you mentioned upthread. It might have
a refresh token cached so it doesn't have to involve the user at all.

Crucially, though, the two black boxes remain independent of each other.
They have well-defined inputs and outputs (the client hook could be
roughly described as "implement get_auth_token()"). Their correctness
can be independently verified against published OAuth specs and/or
provider documentation. And the client application still makes a single
call to PQconnect*().
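The "implement get_auth_token()" shape of the client hook might look something like this (a Python sketch of the idea only; the function and type names here are hypothetical, not proposed libpq API):

```python
from typing import Callable, Optional

# Hypothetical hook type: given the (possibly absent) OIDC discovery URL and
# requested scope from the OAUTHBEARER error result, return a bearer token.
TokenHook = Callable[[Optional[str], Optional[str]], str]

_token_hook: Optional[TokenHook] = None

def register_token_hook(hook: TokenHook) -> None:
    """Analogue of registering a callback set with libpq before connecting."""
    global _token_hook
    _token_hook = hook

def obtain_token(discovery_url: Optional[str], scope: Optional[str],
                 builtin_flow: TokenHook) -> str:
    # If the application registered a hook, it replaces the builtin flow
    # entirely; otherwise the library's own implementation runs.
    if _token_hook is not None:
        return _token_hook(discovery_url, scope)
    return builtin_flow(discovery_url, scope)
```

Either way the application still makes a single connection call; the hook is invoked internally during the SASL exchange.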

Compare this to the architecture proposed by your patch:

  Client App
  +----------------------+
  |             +-------+                                +----------+
  |             | libpq |                                | Postgres |
  | PQconnect > |       |                                |   +-------+
  |          +------+   | ------- Flow Type (!) -------> | > |       |
  |     +- < | Hook | < | <------- Error Result -------- | < |       |
  | [ get    +------+   |                                |   |       |
  |   token ]   |       |                                |   |       |
  |     |       |       |                                |   | Hooks |
  |     v       |       |                                |   |       |
  | PQconnect > | ----> | ------ Access Token ---------> | > |       |
  |             |       | <--- Authz Success/Failure --- | < |       |
  |             +-------+                                |   +-------+
  +----------------------+                               +----------+

Rather than decouple things, I think this proposal drives a spike
through the client app, libpq, and the server. Please correct me if I've
misunderstood pieces of the patch, but the following is my view of it:

What used to be a validator hook on the server side now actively
participates in the client-side flow for some reason. (I still don't
understand what the server is supposed to do with that knowledge.
Changing your authz requirements based on the flow the client wants to
use seems like a good way to introduce bugs.)

The client-side hook is now coupled to the application logic: you have
to know to expect an error from the first PQconnect*() call, then check
whatever magic your hook has done for you to be able to set up the
second call to PQconnect*() with the correctly scoped bearer token. So
if you want to switch between the internal libpq OAuth implementation
and your own hook, you have to rewrite your app logic.

On top of all that, the "flow type code" being sent is a custom
extension to OAUTHBEARER that appears to be incompatible with the RFC's
discovery exchange (which is done by sending an empty auth token during
the first round trip).
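For reference, the standard OAUTHBEARER framing being bypassed can be sketched in a few lines (a simplification of RFC 7628, shown without the optional host/port attributes):

```python
# Sketch of RFC 7628 message framing. kvsep is the 0x01 byte; the client's
# initial response is a gs2 header, key/value pairs, and a double kvsep.
KVSEP = "\x01"

def initial_response(token: str) -> str:
    # "n,," is the gs2 header with no channel binding and no authzid.
    return "n,," + KVSEP + "auth=Bearer " + token + KVSEP + KVSEP

# Discovery round trip: present an empty token; a conforming server answers
# with a JSON status document (which can carry an openid-configuration URL),
# and the client ends the failed exchange with a lone kvsep before retrying.
probe = initial_response("")
assert probe == "n,,\x01auth=Bearer \x01\x01"
```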

> We consider the following set of flows to be minimum required:
> - Client Credentials - For Service to Service scenarios.

Okay, that's simple enough that I think it could probably be maintained
inside libpq with minimal cost. At the same time, is it complicated
enough that you need libpq to do it for you?
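For a sense of scale, here is roughly all the protocol work Client Credentials requires (per RFC 6749 §4.4); the endpoint, ids, and secrets below are placeholders:

```python
# Minimal client_credentials token request body (RFC 6749 section 4.4).
# The result is POSTed to the provider's token endpoint as
# application/x-www-form-urlencoded; values here are placeholders.
from urllib.parse import urlencode

def client_credentials_body(client_id, client_secret, scope=None):
    params = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope is not None:
        params["scope"] = scope
    return urlencode(params)

body = client_credentials_body("service-app", "s3cret", scope="openid")
assert "grant_type=client_credentials" in body
```

The response is a JSON document containing the access token; there is no user interaction at all.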

Maybe once we get the hooks ironed out, it'll be more obvious what the
tradeoff is...

> If you prefer to pick one of the richer flows, Authorization code for
> GUI scenarios is probably much more widely used.
> Plus it's easier to implement too, as interaction goes through a
> series of callbacks. No polling required.

I don't think flows requiring the invocation of web browsers and custom
URL handlers are a clear fit for libpq. For a first draft, at least, I
think that use case should be pushed upward into the client application
via a custom hook.

>> If we want to support refresh tokens, I believe we should be developing
>> a plan to cache and secure them within the client. They should be used
>> as an accelerator for other flows, not as their own flow.
> It's considered a separate "grant_type" in the specs / APIs.
> https://openid.net/specs/openid-connect-core-1_0.html#RefreshTokens

Yes, but that doesn't mean we have to expose it to users via a
connection option. You don't get a refresh token out of the blue; you
get it by going through some other flow, and then you use it in
preference to going through that flow again later.
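In sketch form (all names invented), "accelerator, not its own flow" means something like:

```python
# Sketch of using a refresh token as an accelerator: redeem a cached one if
# present, and fall back to the full interactive flow only when that fails.
# redeem_refresh_token() stands in for a grant_type=refresh_token request.
def redeem_refresh_token(refresh_token):
    if refresh_token == "expired":
        raise ValueError("invalid_grant")  # provider rejected the token
    return "access-token-from-refresh"

def get_access_token(cache, run_full_flow):
    rt = cache.get("refresh_token")
    if rt is not None:
        try:
            return redeem_refresh_token(rt)
        except ValueError:
            pass  # revoked or expired: fall through to the real flow
    access_token, refresh_token = run_full_flow()  # device code, auth code...
    cache["refresh_token"] = refresh_token         # cache for next time
    return access_token

cache = {"refresh_token": "expired"}
token = get_access_token(cache, lambda: ("fresh-access", "fresh-refresh"))
assert token == "fresh-access"
assert cache["refresh_token"] == "fresh-refresh"
```

The user never supplies a refresh token directly; it only ever falls out of a prior flow.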

> For the clients, it would be storing the token and using it to authenticate.
> On the question of sensitivity, secure credentials stores are
> different for each platform, with a lot of cloud offerings for this.
> pgAdmin, for example, has its own way to secure credentials to avoid
> asking users for passwords every time the app is opened.
> I believe we should delegate the refresh token management to the clients.

Delegating to client apps would be fine (and implicitly handled by a
token hook, because the client app would receive the refresh token
directly rather than going through libpq). Delegating to end users, not
so much. Printing a refresh token to stderr as proposed here is, I
think, making things unnecessarily difficult (and/or dangerous) for users.

>> 3) I don't like the departure from the OAUTHBEARER mechanism that's
>> presented here. For one, since I can't see a sample plugin that makes
>> use of the "flow type" magic numbers that have been added, I don't
>> really understand why the extension to the mechanism is necessary.
> I don't think it's much of a departure, but rather a separation of
> responsibilities between libpq and upstream clients.

Given the proposed architectures above, 1) I think this is further
coupling the components, not separating them; and 2) I can't agree that
an incompatible discovery mechanism is "not much of a departure". If
OAUTHBEARER's functionality isn't good enough for some reason, let's
talk about why.

> As libpq can be used in different apps, the client would need
> different types of flows/grants.
> I.e. those need to be provided to libpq at connection initialization
> or some other point.

Why does libpq (or the server!) need to know those things at all, if
they're not going to implement the flow?

> We will change to "grant_type" though and use string to be closer to the spec.
> What do you think is the best way for the client to signal which OAUTH
> flow should be used?

libpq should not need to know the grant type in use if the client is
bypassing its internal implementation entirely.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 11/24/22 00:20, mahendrakar s wrote:
> I had validated Github by skipping the discovery mechanism and letting
> the provider extension pass on the endpoints. This is just for
> validation purposes.
> If it needs to be supported, then need a way to send the discovery
> document from extension.

Yeah. I had originally bounced around the idea that we could send a
data:// URL, but I think that opens up problems.

You're supposed to be able to link the issuer URI with the URI you got
the configuration from, and if they're different, you bail out. If a
server makes up its own OpenID configuration, we'd have to bypass that
safety check, and decide what the risks and mitigations are... Not sure
it's worth it.
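That safety check (from OpenID Connect Discovery 1.0, §4.3) is simple to state in code; a sketch:

```python
# After fetching <issuer>/.well-known/openid-configuration, the "issuer"
# value inside the document must match the issuer the URL was derived
# from, or the client must reject the configuration.
def check_issuer(expected_issuer: str, discovery_doc: dict) -> None:
    if discovery_doc.get("issuer") != expected_issuer:
        raise ValueError("issuer mismatch; refusing configuration")

check_issuer("https://accounts.example.com",
             {"issuer": "https://accounts.example.com"})
try:
    check_issuer("https://accounts.example.com",
                 {"issuer": "https://evil.example.net"})
    assert False, "should have raised"
except ValueError:
    pass
```

A server fabricating its own configuration document is exactly what this check exists to reject.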

Especially if you could just lobby GitHub to, say, provide an OpenID
config. (Maybe there's a security-related reason they don't.)

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Jacob,
Thanks for your feedback.
I think we can focus on the roles and responsibilities of the components first.
The details of the patch can be elaborated on later. For example, "flow
type code" is a mistake on our side; we will use the term "grant_type",
which is defined by the OIDC spec. The same goes for the details of
refresh_token usage.

> Rather than decouple things, I think this proposal drives a spike
> through the client app, libpq, and the server. Please correct me if I've
> misunderstood pieces of the patch, but the following is my view of it:

> What used to be a validator hook on the server side now actively
> participates in the client-side flow for some reason. (I still don't
> understand what the server is supposed to do with that knowledge.
> Changing your authz requirements based on the flow the client wants to
> use seems like a good way to introduce bugs.)

> The client-side hook is now coupled to the application logic: you have
> to know to expect an error from the first PQconnect*() call, then check
> whatever magic your hook has done for you to be able to set up the
> second call to PQconnect*() with the correctly scoped bearer token. So
> if you want to switch between the internal libpq OAuth implementation
> and your own hook, you have to rewrite your app logic.

Basically, yes. We propose increasing the server-side hook's
responsibility: from just validating the token to also returning the
provider root URL and required audience, and possibly providing more
metadata in the future.
In our opinion this is aligned with the SASL protocol, where the server
side is responsible for telling the client the auth requirements based
on the role requested in the startup packet.

Our understanding is that in the original patch that information came
purely from the HBA, and we propose that the extension be able to
control that metadata, since we see the extension as being owned by the
identity provider, while the HBA is owned by the server administrator
or cloud provider.

This change of roles is based on a vision of 4 independent actor types
in the ecosystem:
1. Identity Providers (Okta, Google, Microsoft, other OIDC providers).
   - Publish open source extensions for PostgreSQL.
   - Don't have to own the server deployments, and must ensure their
extensions can work in any environment. This is where we think
additional hook responsibility helps.
2. Server Owners / PAAS providers (On premise admins, Cloud providers,
multi-cloud PAAS providers).
   - Install extensions and configure HBA to allow clients to
authenticate with the identity providers of their choice.
3. Client Application Developers (Data Wis, integration tools,
PgAdmin, monitoring tools, etc.)
   - Independent from specific Identity providers or server providers.
Write one code for all identity providers.
   - Rely on application deployment owners to configure which OIDC
provider to use across client and server setups.
4. Application Deployment Owners (End customers setting up applications)
   - The only actor actually aware of which identity provider to use.
Configures the stack based on the Identity and PostgreSQL deployments
they have.

The critical piece of the vision is that the applications in (3.) above
are agnostic of the identity providers. Those applications rely on
properly configured servers and rich driver logic (libpq,
com.postgresql, npgsql) to let them pop up auth windows or do
service-to-service authentication with any provider. In our view, that
would significantly democratize the deployment of OAUTH authentication
in the community.

In order to allow this separation, we propose:
1. HBA + Extension is the single source of truth of Provider root URL
+ Required Audience for each role. If some backfill for missing OIDC
discovery is needed, the provider-specific extension would be
providing it.
2. Client Application knows which grant_type to use in which scenario.
But can be coded without knowledge of a specific provider. So can't
provide discovery details.
3. Driver (libpq, others) - coordinate the authentication flow based
on client grant_type and identity provider metadata to allow client
applications to use any flow with any provider in a unified way.

Yes, this would require a somewhat more complicated flow between
components than in your original patch. And yes, more complexity comes
with more opportunity for bugs.
However, I see the PG server and libpq as the places which can afford
more complexity, for the purpose of making community participants' work
easier and simplifying adoption.

Does this make sense to you?





Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Dec 5, 2022 at 4:15 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> I think we can focus on the roles and responsibilities of the components first.
> Details of the patch can be elaborated. Like "flow type code" is a
> mistake on our side, and we will use the term "grant_type" which is
> defined by OIDC spec. As well as details of usage of refresh_token.

(For the record, whether we call it "flow type" or "grant type"
doesn't address my concern.)

> Basically Yes. We propose an increase of the server side hook responsibility.
> From just validating the token, to also return the provider root URL
> and required audience. And possibly provide more metadata in the
> future.

I think it's okay to have the extension and HBA collaborate to provide
discovery information. Your proposal goes further than that, though,
and makes the server aware of the chosen client flow. That appears to
be an architectural violation: why does an OAuth resource server need
to know the client flow at all?

> Which is in our opinion aligned with SASL protocol, where the server
> side is responsible for telling the client auth requirements based on
> the requested role in the startup packet.

You've proposed an alternative SASL mechanism. There's nothing wrong
with that, per se, but I think it should be clear why we've chosen
something nonstandard.

> Our understanding is that in the original patch that information came
> purely from hba, and we propose extension being able to control that
> metadata.
> As we see extension as being owned by the identity provider, compared
> to HBA which is owned by the server administrator or cloud provider.

That seems reasonable, considering how tightly coupled the Issuer and
the token validation process are.

> 2. Server Owners / PAAS providers (On premise admins, Cloud providers,
> multi-cloud PAAS providers).
>    - Install extensions and configure HBA to allow clients to
> authenticate with the identity providers of their choice.

(For a future conversation: they need to set up authorization, too,
with custom scopes or some other magic. It's not enough to check who
the token belongs to; even if Postgres is just using the verified
email from OpenID as an authenticator, you have to also know that the
user authorized the token -- and therefore the client -- to access
Postgres on their behalf.)

> 3. Client Application Developers (Data Wis, integration tools,
> PgAdmin, monitoring tools, e.t.c.)
>    - Independent from specific Identity providers or server providers.
> Write one code for all identity providers.

Ideally, yes, but that only works if all identity providers implement
the same flows in compatible ways. We're already seeing instances
where that's not the case and we'll necessarily have to deal with that
up front.

>    - Rely on application deployment owners to configure which OIDC
> provider to use across client and server setups.
> 4. Application Deployment Owners (End customers setting up applications)
>    - The only actor actually aware of which identity provider to use.
> Configures the stack based on the Identity and PostgreSQL deployments
> they have.

(I have doubts that the roles will be as decoupled in practice as you
have described them, but I'd rather defer that for now.)

> The critical piece of the vision is (3.) above is applications
> agnostic of the identity providers. Those applications rely on
> properly configured servers and rich driver logic (libpq,
> com.postgresql, npgsql) to allow their application to popup auth
> windows or do service-to-service authentication with any provider. In
> our view that would significantly democratize the deployment of OAUTH
> authentication in the community.

That seems to be restating the goal of OAuth and OIDC. Can you explain
how the incompatible change allows you to accomplish this better than
standard implementations?

> In order to allow this separation, we propose:
> 1. HBA + Extension is the single source of truth of Provider root URL
> + Required Audience for each role. If some backfill for missing OIDC
> discovery is needed, the provider-specific extension would be
> providing it.
> 2. Client Application knows which grant_type to use in which scenario.
> But can be coded without knowledge of a specific provider. So can't
> provide discovery details.
> 3. Driver (libpq, others) - coordinate the authentication flow based
> on client grant_type and identity provider metadata to allow client
> applications to use any flow with any provider in a unified way.
>
> Yes, this would require a little more complicated flow between
> components than in your original patch.

Why? I claim that standard OAUTHBEARER can handle all of that. What
does your proposed architecture (the third diagram) enable that my
proposed hook (the second diagram) doesn't?

> And yes, more complexity comes
> with more opportunity to make bugs.
> However, I see PG Server and Libpq as the places which can have more
> complexity. For the purpose of making work for the community
> participants easier and simplify adoption.
>
> Does this make sense to you?

Some of it, but it hasn't really addressed the questions from my last mail.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> I think it's okay to have the extension and HBA collaborate to provide
> discovery information. Your proposal goes further than that, though,
> and makes the server aware of the chosen client flow. That appears to
> be an architectural violation: why does an OAuth resource server need
> to know the client flow at all?

Ok. It may have been left there from intermediate iterations. We did
consider making the extension drive the flow for a specific grant_type,
but decided against that idea, for the same reason you point to.
Is it correct that your main concern about the use of grant_type was
that it's propagated to the server? If so, yes, we will remove sending
it to the server.

> Ideally, yes, but that only works if all identity providers implement
> the same flows in compatible ways. We're already seeing instances
> where that's not the case and we'll necessarily have to deal with that
> up front.

Yes, based on our analysis the OIDC spec is detailed enough that
providers implementing it can be supported with generic code in
libpq / the client.
GitHub specifically won't fit there, though; Microsoft Azure AD,
Google, and Okta (including Auth0) will.
Theoretically, provider-specific discovery documents could be returned
from the (server-side) extension, though we didn't plan to prioritize
that.

> That seems to be restating the goal of OAuth and OIDC. Can you explain
> how the incompatible change allows you to accomplish this better than
> standard implementations?

Do you mean passing grant_type to the server, which we will get rid of
in the next iteration? Or other incompatible changes as well?

> Why? I claim that standard OAUTHBEARER can handle all of that. What
> does your proposed architecture (the third diagram) enable that my
> proposed hook (the second diagram) doesn't?

The hook proposed in the 2nd diagram effectively delegates all OAuth
flow implementations to the client.
We propose that libpq take care of pulling the OpenID discovery
document and coordinating the flow, which is effectively Diagram 1,
plus more flows, plus the server hook providing the root URL/audience.

Created the diagrams with all components for 3 flows:
1. Authorization code grant (Clients with Browser access):
  +----------------------+                                         +----------+
  |             +-------+                                          |          |
  | PQconnect   |       |                                          |          |
  | [auth_code] |       |                                          |   +-----------+
  |          -> |       | -------------- Empty Token ------------> | > |           |
  |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
  |             |       |                                          |   |  Hook     |
  |             |       |                                          |   +-----------+
  |             |       |                        +--------------+  |          |
  |             |       | -------[GET]---------> | OIDC         |  | Postgres |
  |          +------+   | <--Provider Metadata-- | Discovery    |  |          |
  |     +- < | Hook | < |                        +--------------+  |          |
  |     |    +------+   |                                          |          |
  |     v       |       |                                          |          |
  |  [get auth  |       |                                          |          |
  |    code]    |       |                                          |          |
  |<user action>|       |                                          |          |
  |     |       |       |                                          |          |
  |     +       |       |                                          |          |
  | PQconnect > | +--------+                     +--------------+  |          |
  |             | | iddawc | <-- [ Auth code ]-> | Issuer/      |  |          |
  |             | |        | <-- Access Token -- | Authz Server |  |          |
  |             | +--------+                     +--------------+  |          |
  |             |       |                                          |   +-----------+
  |             |       | -------------- Access Token -----------> | > | Validator |
  |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
  |          +------+   |                                          |   +-----------+
  |      +-< | Hook |   |                                          |          |
  |      v   +------+   |                                          |          |
  |[store       +-------+                                          |          |
  |  refresh_token]                                                +----------+
  +----------------------+

2. Device code grant
  +----------------------+                                         +----------+
  |             +-------+                                          |          |
  | PQconnect   |       |                                          |          |
  |[device_code]|       |                                          |   +-----------+
  |          -> |       | -------------- Empty Token ------------> | > |           |
  |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
  |             |       |                                          |   |  Hook     |
  |             |       |                                          |   +-----------+
  |             |       |                        +--------------+  |          |
  |             |       | -------[GET]---------> | OIDC         |  | Postgres |
  |          +------+   | <--Provider Metadata-- | Discovery    |  |          |
  |     +- < | Hook | < |                        +--------------+  |          |
  |     |    +------+   |                                          |          |
  |     v       |       |                                          |          |
  |  [device    | +---------+                     +--------------+ |          |
  |    code]    | | iddawc  |                     | Issuer/      | |          |
  |<user action>| |         | --[ Device code ]-> | Authz Server | |          |
  |             | |<polling>| --[ Device code ]-> |              | |          |
  |             | |         | --[ Device code ]-> |              | |          |
  |             | |         |                     |              | |          |
  |             | |         | <-- Access Token -- |              | |          |
  |             | +---------+                     +--------------+ |          |
  |             |       |                                          |   +-----------+
  |             |       | -------------- Access Token -----------> | > | Validator |
  |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
  |          +------+   |                                          |   +-----------+
  |      +-< | Hook |   |                                          |          |
  |      v   +------+   |                                          |          |
  |[store       +-------+                                          |          |
  |  refresh_token]                                                +----------+
  +----------------------+
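For reference, the <polling> box in the device-code diagram is the RFC 8628 loop: keep presenting the device code to the token endpoint until the user approves, backing off when asked. A sketch, with poll() standing in for the HTTP call:

```python
# Sketch of the RFC 8628 device-grant polling loop. poll() is a stand-in
# for POSTing grant_type=...:device_code to the token endpoint.
import time

def poll_for_token(poll, interval=5, max_tries=100):
    for _ in range(max_tries):
        resp = poll()
        if "access_token" in resp:
            return resp["access_token"]
        err = resp.get("error")
        if err == "slow_down":
            interval += 5          # server asked us to back off
        elif err != "authorization_pending":
            raise RuntimeError(err)  # denied, expired_token, ...
        time.sleep(interval)
    raise TimeoutError("user never completed the flow")

# Fake endpoint: pending once, then the user approves.
responses = iter([{"error": "authorization_pending"},
                  {"access_token": "tok"}])
assert poll_for_token(lambda: next(responses), interval=0) == "tok"
```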

3. Non-interactive flows (Client Secret / Refresh_Token)
  +----------------------+                                         +----------+
  |             +-------+                                          |          |
  | PQconnect   |       |                                          |          |
  | [grant_type]|       |                                          |          |
  |          -> |       |                                          |   +-----------+
  |             |       | -------------- Empty Token ------------> | > |           |
  |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
  |             |       |                                          |   |  Hook     |
  |             |       |                                          |   +-----------+
  |             |       |                        +--------------+  |          |
  |             |       | -------[GET]---------> | OIDC         |  | Postgres |
  |             |       | <--Provider Metadata-- | Discovery    |  |          |
  |             |       |                        +--------------+  |          |
  |             |       |                                          |          |
  |             | +--------+                     +--------------+  |          |
  |             | | iddawc | <-- [ Secret ]----> | Issuer/      |  |          |
  |             | |        | <-- Access Token -- | Authz Server |  |          |
  |             | +--------+                     +--------------+  |          |
  |             |       |                                          |   +-----------+
  |             |       | -------------- Access Token -----------> | > | Validator |
  |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
  |             |       |                                          |   +-----------+
  |             +-------+                                          +----------+
  +----------------------+

I think the most confusing part of our latest patch is that flow_type
was passed to the server. We are not proposing that going forward.

> (For a future conversation: they need to set up authorization, too,
> with custom scopes or some other magic. It's not enough to check who
> the token belongs to; even if Postgres is just using the verified
> email from OpenID as an authenticator, you have to also know that the
> user authorized the token -- and therefore the client -- to access
> Postgres on their behalf.)

My understanding is that the metadata in the tokens is provider-specific,
so the server-side hook would be the right place to handle that.
Plus, I can envision that for some providers it makes sense to make a
remote call to pull some information.

The way we implement Azure AD auth today in our PaaS PostgreSQL offering:
- The server administrator uses special extension functions to create
Azure AD-enabled PostgreSQL roles.
- The PostgreSQL extension maps roles to unique identity ids (UIDs) in
the directory.
- Connection flow: if the token is valid and the Role => UID mapping
matches, we authenticate as the role.
- Then native PostgreSQL role-based access control takes care of privileges.
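The Role => UID check described above can be sketched in a few lines; the claim name ("oid") and the mapping are invented for illustration:

```python
# Sketch of a validator's Role => UID check: accept the connection only
# if the token's unique identity id matches the id registered for the
# requested role. Claim name and mapping here are illustrative only.
ROLE_TO_UID = {"app_reader": "11111111-2222-3333-4444-555555555555"}

def authorize(role: str, token_claims: dict) -> bool:
    return ROLE_TO_UID.get(role) == token_claims.get("oid")

assert authorize("app_reader",
                 {"oid": "11111111-2222-3333-4444-555555555555"})
assert not authorize("app_reader", {"oid": "someone-else"})
```

Everything past this check is ordinary PostgreSQL role-based access control.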

This is the same for both user and system-to-system authorization,
though I assume different providers may treat user and system
identities differently; their extensions would handle that.

Thanks!
Andrey.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
That being said, Diagram 2 would look like this with our proposal:
  +----------------------+                                         +----------+
  |             +-------+                                          | Postgres |
  | PQconnect ->|       |                                          |          |
  |             |       |                                          |   +-----------+
  |             |       | -------------- Empty Token ------------> | > |           |
  |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
  |          +------+   |                                          |   |  Hook     |
  |     +- < | Hook |   |                                          |   +-----------+
  |     |    +------+   |                                          |          |
  |     v       |       |                                          |          |
  |  [get token]|       |                                          |          |
  |     |       |       |                                          |          |
  |     +       |       |                                          |   +-----------+
  | PQconnect > |       | -------------- Access Token -----------> | > | Validator |
  |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
  |             |       |                                          |   +-----------+
  |             +-------+                                          |          |
  +----------------------+                                         +----------+


The application takes care of all token acquisition logic, while the
server-side hook participates in the pre-authentication reply.

That is definitely a required scenario for the long term, and the
easiest one to implement in the client core.
If we can get at least that flow into PG 16, it will be a strong
foundation for adding support for specific grants to libpq going
forward.

Does the diagram above look good to you? We can then start cleaning up
the patch to get that in first.

Thanks!
Andrey.


On Wed, Dec 7, 2022 at 3:22 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
>
> > I think it's okay to have the extension and HBA collaborate to provide
> > discovery information. Your proposal goes further than that, though,
> > and makes the server aware of the chosen client flow. That appears to
> > be an architectural violation: why does an OAuth resource server need
> > to know the client flow at all?
>
> Ok. It may have left there from intermediate iterations. We did
> consider making extension drive the flow for specific grant_type, but
> decided against that idea. For the same reason you point to.
> Is it correct that your main concern about use of grant_type was that
> it's propagated to the server? Then yes, we will remove sending it to
> the server.
>
> > Ideally, yes, but that only works if all identity providers implement
> > the same flows in compatible ways. We're already seeing instances
> > where that's not the case and we'll necessarily have to deal with that
> > up front.
>
> Yes, based on our analysis OIDC spec is detailed enough, that
> providers implementing that one, can be supported with generic code in
> libpq / client.
> Github specifically won't fit there though. Microsoft Azure AD,
> Google, Okta (including Auth0) will.
> Theoretically discovery documents can be returned from the extension
> (server-side) which is provider specific. Though we didn't plan to
> prioritize that.
>
> > That seems to be restating the goal of OAuth and OIDC. Can you explain
> > how the incompatible change allows you to accomplish this better than
> > standard implementations?
>
> Do you refer to passing grant_type to the server? Which we will get
> rid of in the next iteration. Or other incompatible changes as well?
>
> > Why? I claim that standard OAUTHBEARER can handle all of that. What
> > does your proposed architecture (the third diagram) enable that my
> > proposed hook (the second diagram) doesn't?
>
> The hook proposed on the 2nd diagram effectively delegates all Oauth
> flows implementations to the client.
> We propose libpq takes care of pulling OpenId discovery and coordination.
> Which is effectively Diagram 1 + more flows + server hook providing
> root url/audience.
>
> Created the diagrams with all components for 3 flows:
> 1. Authorization code grant (Clients with Browser access):
>   +----------------------+                                         +----------+
>   |             +-------+                                          |          |
>   | PQconnect   |       |                                          |          |
>   | [auth_code] |       |                                          |   +-----------+
>   |          -> |       | -------------- Empty Token ------------> | > |           |
>   |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
>   |             |       |                                          |   |  Hook     |
>   |             |       |                                          |   +-----------+
>   |             |       |                        +--------------+  |          |
>   |             |       | -------[GET]---------> | OIDC         |  | Postgres |
>   |          +------+   | <--Provider Metadata-- | Discovery    |  |          |
>   |     +- < | Hook | < |                        +--------------+  |          |
>   |     |    +------+   |                                          |          |
>   |     v       |       |                                          |          |
>   |  [get auth  |       |                                          |          |
>   |    code]    |       |                                          |          |
>   |<user action>|       |                                          |          |
>   |     |       |       |                                          |          |
>   |     +       |       |                                          |          |
>   | PQconnect > | +--------+                     +--------------+  |          |
>   |             | | iddawc | <-- [ Auth code ]-> | Issuer/      |  |          |
>   |             | |        | <-- Access Token -- | Authz Server |  |          |
>   |             | +--------+                     +--------------+  |          |
>   |             |       |                                          |   +-----------+
>   |             |       | -------------- Access Token -----------> | > | Validator |
>   |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
>   |          +------+   |                                          |   +-----------+
>   |      +-< | Hook |   |                                          |          |
>   |      v   +------+   |                                          |          |
>   |[store       +-------+                                          |          |
>   |  refresh_token]                                                +----------+
>   +----------------------+
>
> 2. Device code grant
>   +----------------------+                                         +----------+
>   |             +-------+                                          |          |
>   | PQconnect   |       |                                          |          |
>   | [auth_code] |       |                                          |   +-----------+
>   |          -> |       | -------------- Empty Token ------------> | > |           |
>   |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
>   |             |       |                                          |   |  Hook     |
>   |             |       |                                          |   +-----------+
>   |             |       |                        +--------------+  |          |
>   |             |       | -------[GET]---------> | OIDC         |  | Postgres |
>   |          +------+   | <--Provider Metadata-- | Discovery    |  |          |
>   |     +- < | Hook | < |                        +--------------+  |          |
>   |     |    +------+   |                                          |          |
>   |     v       |       |                                          |          |
>   |  [device    | +---------+                     +--------------+ |          |
>   |    code]    | | iddawc  |                     | Issuer/      | |          |
>   |<user action>| |         | --[ Device code ]-> | Authz Server | |          |
>   |             | |<polling>| --[ Device code ]-> |              | |          |
>   |             | |         | --[ Device code ]-> |              | |          |
>   |             | |         |                     |              | |          |
>   |             | |         | <-- Access Token -- |              | |          |
>   |             | +---------+                     +--------------+ |          |
>   |             |       |                                          |   +-----------+
>   |             |       | -------------- Access Token -----------> | > | Validator |
>   |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
>   |          +------+   |                                          |   +-----------+
>   |      +-< | Hook |   |                                          |          |
>   |      v   +------+   |                                          |          |
>   |[store       +-------+                                          |          |
>   |  refresh_token]                                                +----------+
>   +----------------------+
>
> 3. Non-interactive flows (Client Secret / Refresh_Token)
>   +----------------------+                                         +----------+
>   |             +-------+                                          |          |
>   | PQconnect   |       |                                          |          |
>   | [grant_type]|       |                                          |          |
>   |          -> |       |                                          |   +-----------+
>   |             |       | -------------- Empty Token ------------> | > |           |
>   |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
>   |             |       |                                          |   |  Hook     |
>   |             |       |                                          |   +-----------+
>   |             |       |                        +--------------+  |          |
>   |             |       | -------[GET]---------> | OIDC         |  | Postgres |
>   |             |       | <--Provider Metadata-- | Discovery    |  |          |
>   |             |       |                        +--------------+  |          |
>   |             |       |                                          |          |
>   |             | +--------+                     +--------------+  |          |
>   |             | | iddawc | <-- [ Secret ]----> | Issuer/      |  |          |
>   |             | |        | <-- Access Token -- | Authz Server |  |          |
>   |             | +--------+                     +--------------+  |          |
>   |             |       |                                          |   +-----------+
>   |             |       | -------------- Access Token -----------> | > | Validator |
>   |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
>   |             |       |                                          |   +-----------+
>   |             +-------+                                          +----------+
>   +----------------------+
>
> I think what was the most confusing in our latest patch is that
> flow_type was passed to the server.
> We are not proposing this going forward.
>
> > (For a future conversation: they need to set up authorization, too,
> > with custom scopes or some other magic. It's not enough to check who
> > the token belongs to; even if Postgres is just using the verified
> > email from OpenID as an authenticator, you have to also know that the
> > user authorized the token -- and therefore the client -- to access
> > Postgres on their behalf.)
>
> My understanding is that metadata in the tokens is provider specific,
> so server side hook would be the right place to handle that.
> Plus I can envision for some providers it can make sense to make a
> remote call to pull some information.
>
> The way we implement Azure AD auth today in PAAS PostgreSQL offering:
> - Server administrator uses special extension functions to create
> Azure AD enabled PostgreSQL roles.
> - PostgreSQL extension maps Roles to unique identity Ids (UID) in the Directory.
> - Connection flow: If the token is valid and Role => UID mapping
> matches, we authenticate as the Role.
> - Then its native PostgreSQL role based access control takes care of privileges.
>
> This is the same for both User- and System-to-system authorization.
> Though I assume different providers may treat user- and system-
> identities differently. So their extension would handle that.
>
> Thanks!
> Andrey.
>
> On Wed, Dec 7, 2022 at 11:06 AM Jacob Champion <jchampion@timescale.com> wrote:
> >
> > On Mon, Dec 5, 2022 at 4:15 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> > > I think we can focus on the roles and responsibilities of the components first.
> > > Details of the patch can be elaborated. Like "flow type code" is a
> > > mistake on our side, and we will use the term "grant_type" which is
> > > defined by OIDC spec. As well as details of usage of refresh_token.
> >
> > (For the record, whether we call it "flow type" or "grant type"
> > doesn't address my concern.)
> >
> > > Basically Yes. We propose an increase of the server side hook responsibility.
> > > From just validating the token, to also return the provider root URL
> > > and required audience. And possibly provide more metadata in the
> > > future.
> >
> > I think it's okay to have the extension and HBA collaborate to provide
> > discovery information. Your proposal goes further than that, though,
> > and makes the server aware of the chosen client flow. That appears to
> > be an architectural violation: why does an OAuth resource server need
> > to know the client flow at all?
> >
> > > Which is in our opinion aligned with SASL protocol, where the server
> > > side is responsible for telling the client auth requirements based on
> > > the requested role in the startup packet.
> >
> > You've proposed an alternative SASL mechanism. There's nothing wrong
> > with that, per se, but I think it should be clear why we've chosen
> > something nonstandard.
> >
> > > Our understanding is that in the original patch that information came
> > > purely from hba, and we propose extension being able to control that
> > > metadata.
> > > As we see extension as being owned by the identity provider, compared
> > > to HBA which is owned by the server administrator or cloud provider.
> >
> > That seems reasonable, considering how tightly coupled the Issuer and
> > the token validation process are.
> >
> > > 2. Server Owners / PAAS providers (On premise admins, Cloud providers,
> > > multi-cloud PAAS providers).
> > >    - Install extensions and configure HBA to allow clients to
> > > authenticate with the identity providers of their choice.
> >
> > (For a future conversation: they need to set up authorization, too,
> > with custom scopes or some other magic. It's not enough to check who
> > the token belongs to; even if Postgres is just using the verified
> > email from OpenID as an authenticator, you have to also know that the
> > user authorized the token -- and therefore the client -- to access
> > Postgres on their behalf.)
> >
> > > 3. Client Application Developers (Data Wis, integration tools,
> > > PgAdmin, monitoring tools, e.t.c.)
> > >    - Independent from specific Identity providers or server providers.
> > > Write one code for all identity providers.
> >
> > Ideally, yes, but that only works if all identity providers implement
> > the same flows in compatible ways. We're already seeing instances
> > where that's not the case and we'll necessarily have to deal with that
> > up front.
> >
> > >    - Rely on application deployment owners to configure which OIDC
> > > provider to use across client and server setups.
> > > 4. Application Deployment Owners (End customers setting up applications)
> > >    - The only actor actually aware of which identity provider to use.
> > > Configures the stack based on the Identity and PostgreSQL deployments
> > > they have.
> >
> > (I have doubts that the roles will be as decoupled in practice as you
> > have described them, but I'd rather defer that for now.)
> >
> > > The critical piece of the vision is (3.) above is applications
> > > agnostic of the identity providers. Those applications rely on
> > > properly configured servers and rich driver logic (libpq,
> > > com.postgresql, npgsql) to allow their application to popup auth
> > > windows or do service-to-service authentication with any provider. In
> > > our view that would significantly democratize the deployment of OAUTH
> > > authentication in the community.
> >
> > That seems to be restating the goal of OAuth and OIDC. Can you explain
> > how the incompatible change allows you to accomplish this better than
> > standard implementations?
> >
> > > In order to allow this separation, we propose:
> > > 1. HBA + Extension is the single source of truth of Provider root URL
> > > + Required Audience for each role. If some backfill for missing OIDC
> > > discovery is needed, the provider-specific extension would be
> > > providing it.
> > > 2. Client Application knows which grant_type to use in which scenario.
> > > But can be coded without knowledge of a specific provider. So can't
> > > provide discovery details.
> > > 3. Driver (libpq, others) - coordinate the authentication flow based
> > > on client grant_type and identity provider metadata to allow client
> > > applications to use any flow with any provider in a unified way.
> > >
> > > Yes, this would require a little more complicated flow between
> > > components than in your original patch.
> >
> > Why? I claim that standard OAUTHBEARER can handle all of that. What
> > does your proposed architecture (the third diagram) enable that my
> > proposed hook (the second diagram) doesn't?
> >
> > > And yes, more complexity comes
> > > with more opportunity to make bugs.
> > > However, I see PG Server and Libpq as the places which can have more
> > > complexity. For the purpose of making work for the community
> > > participants easier and simplify adoption.
> > >
> > > Does this make sense to you?
> >
> > Some of it, but it hasn't really addressed the questions from my last mail.
> >
> > Thanks,
> > --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Dec 7, 2022 at 3:22 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> 
>> I think it's okay to have the extension and HBA collaborate to
>> provide discovery information. Your proposal goes further than
>> that, though, and makes the server aware of the chosen client flow.
>> That appears to be an architectural violation: why does an OAuth
>> resource server need to know the client flow at all?
> 
> Ok. It may have left there from intermediate iterations. We did 
> consider making extension drive the flow for specific grant_type,
> but decided against that idea. For the same reason you point to. Is
> it correct that your main concern about use of grant_type was that 
> it's propagated to the server? Then yes, we will remove sending it
> to the server.

Okay. Yes, that was my primary concern.

>> Ideally, yes, but that only works if all identity providers
>> implement the same flows in compatible ways. We're already seeing
>> instances where that's not the case and we'll necessarily have to
>> deal with that up front.
> 
> Yes, based on our analysis OIDC spec is detailed enough, that 
> providers implementing that one, can be supported with generic code
> in libpq / client. Github specifically won't fit there though.
> Microsoft Azure AD, Google, Okta (including Auth0) will. 
> Theoretically discovery documents can be returned from the extension 
> (server-side) which is provider specific. Though we didn't plan to 
> prioritize that.

As another example, Google's device authorization grant is incompatible
with the spec (which they co-authored). I want to say I had problems
with Azure AD not following that spec either, but I don't remember
exactly what they were. I wouldn't be surprised to find more tiny
departures once we get deeper into implementation.

>> That seems to be restating the goal of OAuth and OIDC. Can you
>> explain how the incompatible change allows you to accomplish this
>> better than standard implementations?
> 
> Do you refer to passing grant_type to the server? Which we will get 
> rid of in the next iteration. Or other incompatible changes as well?

Just the grant type, yeah.

>> Why? I claim that standard OAUTHBEARER can handle all of that.
>> What does your proposed architecture (the third diagram) enable
>> that my proposed hook (the second diagram) doesn't?
> 
> The hook proposed on the 2nd diagram effectively delegates all Oauth 
> flows implementations to the client. We propose libpq takes care of
> pulling OpenId discovery and coordination. Which is effectively
> Diagram 1 + more flows + server hook providing root url/audience.
> 
> Created the diagrams with all components for 3 flows: [snip]

(I'll skip ahead to your later mail on this.)

>> (For a future conversation: they need to set up authorization,
>> too, with custom scopes or some other magic. It's not enough to
>> check who the token belongs to; even if Postgres is just using the
>> verified email from OpenID as an authenticator, you have to also
>> know that the user authorized the token -- and therefore the client
>> -- to access Postgres on their behalf.)
> 
> My understanding is that metadata in the tokens is provider
> specific, so server side hook would be the right place to handle
> that. Plus I can envision for some providers it can make sense to
> make a remote call to pull some information.

The server hook is the right place to check the scopes, yes, but I think
the DBA should be able to specify what those scopes are to begin with.
The provider of the extension shouldn't be expected by the architecture
to hardcode those decisions, even if Azure AD chooses to short-circuit
that choice and provide magic instead.

On 12/7/22 20:25, Andrey Chudnovsky wrote:
> That being said, the Diagram 2 would look like this with our proposal:
> [snip]
> 
> With the application taking care of all Token acquisition logic. While
> the server-side hook is participating in the pre-authentication reply.
> 
> That is definitely a required scenario for the long term and the
> easiest to implement in the client core.
> And if we can do at least that flow in PG16 it will be a strong
> foundation to provide more support for specific grants in libpq going
> forward.

Agreed.

> Does the diagram above look good to you? We can then start cleaning up
> the patch to get that in first.

I maintain that the hook doesn't need to hand back artifacts to the
client for a second PQconnect call. It can just use those artifacts to
obtain the access token and hand that right back to libpq. (I think any
requirement that clients be rewritten to call PQconnect twice will
probably be a sticking point for adoption of an OAuth patch.)

That said, now that your proposal is also compatible with OAUTHBEARER, I
can pony up some code to hopefully prove my point. (I don't know if I'll
be able to do that by the holidays though.)

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> The server hook is the right place to check the scopes, yes, but I think
> the DBA should be able to specify what those scopes are to begin with.
> The provider of the extension shouldn't be expected by the architecture
> to hardcode those decisions, even if Azure AD chooses to short-circuit
> that choice and provide magic instead.

Hardcoding is definitely not expected, but I think provider-specific
customization should be allowed.
I can offer a couple of advanced use cases which come up in the world
of cloud deployments and require per-role management:
- Multi-tenant deployments, where the provider root URL differs
between roles, based on which tenant they come from.
- Federation to multiple providers, with solutions like Amazon Cognito
which offer a layer of abstraction over several transparently
supported providers.

If your concern is the extension not honoring the DBA-configured
values: would server-side logic that prefers the HBA value over the
extension-provided one resolve this concern?
We are definitely biased towards cloud deployment scenarios, where
direct access to pg_hba.conf is usually not offered at all.
Let's find the middle ground here.

A separate reason for creating this pre-authentication hook is further
extensibility to support more metadata.
Specifically, when we add support for OAuth flows to libpq, server-side
extensions can help bridge the gap between the identity provider's
implementation and the OAuth/OIDC specs.
For example, that could allow a GitHub extension to provide an OIDC
discovery document.

I definitely see identity providers as institutional actors here which
can be given some power, through the extension hooks, to customize the
behavior within the framework.

> I maintain that the hook doesn't need to hand back artifacts to the
> client for a second PQconnect call. It can just use those artifacts to
> obtain the access token and hand that right back to libpq. (I think any
> requirement that clients be rewritten to call PQconnect twice will
> probably be a sticking point for adoption of an OAuth patch.)

Obtaining a token is an asynchronous process with a human in the loop.
I'm not sure that expecting the hook function to return a token
synchronously is the best option here.
Could the token be an optional return value of the hook, for the cases
where it can be obtained synchronously?

On Thu, Dec 8, 2022 at 4:41 PM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Wed, Dec 7, 2022 at 3:22 PM Andrey Chudnovsky
> <achudnovskij@gmail.com> wrote:
> >
> >> I think it's okay to have the extension and HBA collaborate to
> >> provide discovery information. Your proposal goes further than
> >> that, though, and makes the server aware of the chosen client flow.
> >> That appears to be an architectural violation: why does an OAuth
> >> resource server need to know the client flow at all?
> >
> > Ok. It may have left there from intermediate iterations. We did
> > consider making extension drive the flow for specific grant_type,
> > but decided against that idea. For the same reason you point to. Is
> > it correct that your main concern about use of grant_type was that
> > it's propagated to the server? Then yes, we will remove sending it
> > to the server.
>
> Okay. Yes, that was my primary concern.
>
> >> Ideally, yes, but that only works if all identity providers
> >> implement the same flows in compatible ways. We're already seeing
> >> instances where that's not the case and we'll necessarily have to
> >> deal with that up front.
> >
> > Yes, based on our analysis OIDC spec is detailed enough, that
> > providers implementing that one, can be supported with generic code
> > in libpq / client. Github specifically won't fit there though.
> > Microsoft Azure AD, Google, Okta (including Auth0) will.
> > Theoretically discovery documents can be returned from the extension
> > (server-side) which is provider specific. Though we didn't plan to
> > prioritize that.
>
> As another example, Google's device authorization grant is incompatible
> with the spec (which they co-authored). I want to say I had problems
> with Azure AD not following that spec either, but I don't remember
> exactly what they were. I wouldn't be surprised to find more tiny
> departures once we get deeper into implementation.
>
> >> That seems to be restating the goal of OAuth and OIDC. Can you
> >> explain how the incompatible change allows you to accomplish this
> >> better than standard implementations?
> >
> > Do you refer to passing grant_type to the server? Which we will get
> > rid of in the next iteration. Or other incompatible changes as well?
>
> Just the grant type, yeah.
>
> >> Why? I claim that standard OAUTHBEARER can handle all of that.
> >> What does your proposed architecture (the third diagram) enable
> >> that my proposed hook (the second diagram) doesn't?
> >
> > The hook proposed on the 2nd diagram effectively delegates all Oauth
> > flows implementations to the client. We propose libpq takes care of
> > pulling OpenId discovery and coordination. Which is effectively
> > Diagram 1 + more flows + server hook providing root url/audience.
> >
> > Created the diagrams with all components for 3 flows: [snip]
>
> (I'll skip ahead to your later mail on this.)
>
> >> (For a future conversation: they need to set up authorization,
> >> too, with custom scopes or some other magic. It's not enough to
> >> check who the token belongs to; even if Postgres is just using the
> >> verified email from OpenID as an authenticator, you have to also
> >> know that the user authorized the token -- and therefore the client
> >> -- to access Postgres on their behalf.)
> >
> > My understanding is that metadata in the tokens is provider
> > specific, so server side hook would be the right place to handle
> > that. Plus I can envision for some providers it can make sense to
> > make a remote call to pull some information.
>
> The server hook is the right place to check the scopes, yes, but I think
> the DBA should be able to specify what those scopes are to begin with.
> The provider of the extension shouldn't be expected by the architecture
> to hardcode those decisions, even if Azure AD chooses to short-circuit
> that choice and provide magic instead.
>
> On 12/7/22 20:25, Andrey Chudnovsky wrote:
> > That being said, the Diagram 2 would look like this with our proposal:
> > [snip]
> >
> > With the application taking care of all Token acquisition logic. While
> > the server-side hook is participating in the pre-authentication reply.
> >
> > That is definitely a required scenario for the long term and the
> > easiest to implement in the client core.
> > And if we can do at least that flow in PG16 it will be a strong
> > foundation to provide more support for specific grants in libpq going
> > forward.
>
> Agreed.
>
> > Does the diagram above look good to you? We can then start cleaning up
> > the patch to get that in first.
>
> I maintain that the hook doesn't need to hand back artifacts to the
> client for a second PQconnect call. It can just use those artifacts to
> obtain the access token and hand that right back to libpq. (I think any
> requirement that clients be rewritten to call PQconnect twice will
> probably be a sticking point for adoption of an OAuth patch.)
>
> That said, now that your proposal is also compatible with OAUTHBEARER, I
> can pony up some code to hopefully prove my point. (I don't know if I'll
> be able to do that by the holidays though.)
>
> Thanks!
> --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Dec 12, 2022 at 9:06 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> If your concern is extension not honoring the DBA configured values:
> Would a server-side logic to prefer HBA value over extension-provided
> resolve this concern?

Yeah. It also seals the role of the extension here as "optional".

> We are definitely biased towards the cloud deployment scenarios, where
> direct access to .hba files is usually not offered at all.
> Let's find the middle ground here.

Sure. I don't want to make this difficult in cloud scenarios --
obviously I'd like for Timescale Cloud to be able to make use of this
too. But if we make this easy for a lone DBA (who doesn't have any
institutional power with the providers) to use correctly and securely,
then it should follow that the providers who _do_ have power and
resources will have an easy time of it as well. The reverse isn't
necessarily true. So I'm definitely planning to focus on the DBA case
first.

> A separate reason for creating this pre-authentication hook is further
> extensibility to support more metadata.
> Specifically when we add support for OAUTH flows to libpq, server-side
> extensions can help bridge the gap between the identity provider
> implementation and OAUTH/OIDC specs.
> For example, that could allow the Github extension to provide an OIDC
> discovery document.
>
> I definitely see identity providers as institutional actors here which
> can be given some power through the extension hooks to customize the
> behavior within the framework.

We'll probably have to make some compromises in this area, but I think
they should be carefully considered exceptions and not a core feature
of the mechanism. The gaps you point out are just fragmentation, and
adding custom extensions to deal with it leads to further
fragmentation instead of providing pressure on providers to just
implement the specs. Worst case, we open up new exciting security
flaws, and then no one can analyze them independently because no one
other than the provider knows how the two sides work together anymore.

Don't get me wrong; it would be naive to proceed as if the OAUTHBEARER
spec were perfect, because it's clearly not. But if we need to make
extensions to it, we can participate in IETF discussions and make our
case publicly for review, rather than enshrining MS/GitHub/Google/etc.
versions of the RFC and enabling that proliferation as a Postgres core
feature.

> Obtaining a token is an asynchronous process with a human in the loop.
> Not sure if expecting a hook function to return a token synchronously
> is the best option here.
> Can that be an optional return value of the hook in cases when a token
> can be obtained synchronously?

I don't think the hook is generally going to be able to return a token
synchronously, and I expect the final design to be async-first. As far
as I know, this will need to be solved for the builtin flows as well
(you don't want a synchronous HTTP call to block your PQconnectPoll
architecture), so the hook should be able to make use of whatever
solution we land on for that.

This is hand-wavy, and I don't expect it to be easy to solve. I just
don't think we have to solve it twice.
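
(To hand-wave slightly less: I imagine token acquisition looking
something like a pollable state machine that the PQconnectPoll loop can
drive between its other work, instead of blocking on an HTTP call. A
hypothetical Python sketch, not real libpq API -- all names here are
made up:)

```python
# Hypothetical sketch: drive token acquisition with repeated
# non-blocking polls, the same way PQconnectPoll drives the
# connection state machine.
class TokenFlow:
    def __init__(self, io_steps):
        # io_steps stands in for non-blocking HTTP I/O; each step may
        # or may not have produced a token yet.
        self._steps = iter(io_steps)
        self.token = None

    def poll(self):
        """Return 'POLLING' while I/O remains, 'OK' once finished."""
        try:
            result = next(self._steps)
        except StopIteration:
            return "OK"
        if result is not None:
            self.token = result
        return "POLLING"
```

The connection loop would keep calling poll() until it reports "OK",
and a custom hook could plug into the same machinery.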

Have a good end to the year!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi All,

Changes added to Jacob's patch(v2) as per the discussion in the thread.

The changes allow the customer to send the OAuth bearer token through the psql connection string.

Example:
psql  -U user@example.com -d 'dbname=postgres oauth_bearer_token=abc'

To configure OAuth, the pg_hba.conf line looks like:
local   all             all                                     oauth   provider=oauth_provider issuer="https://example.com" scope="openid email"

We also added a hook to libpq to pass on the metadata about the issuer.

Thanks,
Mahendrakar.


On Sat, 17 Dec 2022 at 04:48, Jacob Champion <jchampion@timescale.com> wrote:
>
> On Mon, Dec 12, 2022 at 9:06 PM Andrey Chudnovsky
> <achudnovskij@gmail.com> wrote:
> > If your concern is extension not honoring the DBA configured values:
> > Would a server-side logic to prefer HBA value over extension-provided
> > resolve this concern?
>
> Yeah. It also seals the role of the extension here as "optional".
>
> > We are definitely biased towards the cloud deployment scenarios, where
> > direct access to .hba files is usually not offered at all.
> > Let's find the middle ground here.
>
> Sure. I don't want to make this difficult in cloud scenarios --
> obviously I'd like for Timescale Cloud to be able to make use of this
> too. But if we make this easy for a lone DBA (who doesn't have any
> institutional power with the providers) to use correctly and securely,
> then it should follow that the providers who _do_ have power and
> resources will have an easy time of it as well. The reverse isn't
> necessarily true. So I'm definitely planning to focus on the DBA case
> first.
>
> > A separate reason for creating this pre-authentication hook is further
> > extensibility to support more metadata.
> > Specifically when we add support for OAUTH flows to libpq, server-side
> > extensions can help bridge the gap between the identity provider
> > implementation and OAUTH/OIDC specs.
> > For example, that could allow the Github extension to provide an OIDC
> > discovery document.
> >
> > I definitely see identity providers as institutional actors here which
> > can be given some power through the extension hooks to customize the
> > behavior within the framework.
>
> We'll probably have to make some compromises in this area, but I think
> they should be carefully considered exceptions and not a core feature
> of the mechanism. The gaps you point out are just fragmentation, and
> adding custom extensions to deal with it leads to further
> fragmentation instead of providing pressure on providers to just
> implement the specs. Worst case, we open up new exciting security
> flaws, and then no one can analyze them independently because no one
> other than the provider knows how the two sides work together anymore.
>
> Don't get me wrong; it would be naive to proceed as if the OAUTHBEARER
> spec were perfect, because it's clearly not. But if we need to make
> extensions to it, we can participate in IETF discussions and make our
> case publicly for review, rather than enshrining MS/GitHub/Google/etc.
> versions of the RFC and enabling that proliferation as a Postgres core
> feature.
>
> > Obtaining a token is an asynchronous process with a human in the loop.
> > Not sure if expecting a hook function to return a token synchronously
> > is the best option here.
> > Can that be an optional return value of the hook in cases when a token
> > can be obtained synchronously?
>
> I don't think the hook is generally going to be able to return a token
> synchronously, and I expect the final design to be async-first. As far
> as I know, this will need to be solved for the builtin flows as well
> (you don't want a synchronous HTTP call to block your PQconnectPoll
> architecture), so the hook should be able to make use of whatever
> solution we land on for that.
>
> This is hand-wavy, and I don't expect it to be easy to solve. I just
> don't think we have to solve it twice.
>
> Have a good end to the year!
> --Jacob
Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
More information on the latest patch.

1. We aligned the implementation with the barebone SASL for OAUTH
described here - https://www.rfc-editor.org/rfc/rfc7628
The flow can be explained in the diagram below:

  +----------------------+                                 +----------+
  |             +-------+                                  | Postgres |
  | PQconnect ->|       |                                  |          |
  |             |       |                                  |   +-----------+
  |             |       | ---------- Empty Token---------> | > |           |
  |             | libpq | <-- Error(Discovery + Scope ) -- | < | Pre-Auth  |
  |          +------+   |                                  |   |  Hook     |
  |     +- < | Hook |   |                                  |   +-----------+
  |     |    +------+   |                                  |          |
  |     v       |       |                                  |          |
  |  [get token]|       |                                  |          |
  |     |       |       |                                  |          |
  |     +       |       |                                  |   +-----------+
  | PQconnect > |       | --------- Access Token --------> | > | Validator |
  |             |       | <---------- Auth Result -------- | < |   Hook    |
  |             |       |                                  |   +-----------+
  |             +-------+                                  |          |
  +----------------------+                                 +----------+
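
For reference, the messages libpq sends in this exchange follow RFC
7628's framing: a GS2 header, then \x01-separated key/value pairs,
then a terminating \x01. A rough Python sketch of the initial client
response construction (illustrative only, not the patch's actual API):

```python
def oauthbearer_initial_response(token, authzid=None, host=None, port=None):
    """Build an OAUTHBEARER initial client response (RFC 7628, sec. 3.1).

    The message is a GS2 header, a kvsep (\x01), any number of
    key=value pairs each terminated by \x01, and a final \x01.
    """
    gs2_header = "n," + (f"a={authzid}" if authzid else "") + ","
    kvpairs = ""
    if host:
        kvpairs += f"host={host}\x01"
    if port:
        kvpairs += f"port={port}\x01"
    kvpairs += f"auth=Bearer {token}\x01"
    return gs2_header + "\x01" + kvpairs + "\x01"
```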

2. Removed Device Code implementation in libpq. Several reasons:
   - Reduce scope and focus on the protocol first.
   - Device code implementation uses iddawc dependency. Taking this
dependency is a controversial step which requires broader discussion.
   - Device code implementation without iddawc would significantly
increase the scope of the patch, as libpq needs to poll the token
endpoint, set up different API calls, etc.
   - That flow should canonically only be used for clients which can't
invoke browsers. If it is the only flow to be implemented, it can be
used in the context when it's not expected by the OAUTH protocol.

3. Temporarily removed test suite. We are actively working on aligning
the tests with the latest changes. Will add a patch with tests soon.

We will change the "V3" prefix so that the version numbering continues
from the previous iterations.

Thanks!
Andrey.

On Thu, Jan 12, 2023 at 11:08 AM mahendrakar s
<mahendrakarforpg@gmail.com> wrote:
>
> Hi All,
>
> Changes added to Jacob's patch(v2) as per the discussion in the thread.
>
> The changes allow the customer to send the OAUTH BEARER token through psql connection string.
>
> Example:
> psql  -U user@example.com -d 'dbname=postgres oauth_bearer_token=abc'
>
> To configure OAUTH, the pg_hba.conf line look like:
> local   all             all                                     oauth   provider=oauth_provider issuer="https://example.com" scope="openid email"
>
> We also added hook to libpq to pass on the metadata about the issuer.
>
> Thanks,
> Mahendrakar.
>
>
> On Sat, 17 Dec 2022 at 04:48, Jacob Champion <jchampion@timescale.com> wrote:
> >
> > On Mon, Dec 12, 2022 at 9:06 PM Andrey Chudnovsky
> > <achudnovskij@gmail.com> wrote:
> > > If your concern is extension not honoring the DBA configured values:
> > > Would a server-side logic to prefer HBA value over extension-provided
> > > resolve this concern?
> >
> > Yeah. It also seals the role of the extension here as "optional".
> >
> > > We are definitely biased towards the cloud deployment scenarios, where
> > > direct access to .hba files is usually not offered at all.
> > > Let's find the middle ground here.
> >
> > Sure. I don't want to make this difficult in cloud scenarios --
> > obviously I'd like for Timescale Cloud to be able to make use of this
> > too. But if we make this easy for a lone DBA (who doesn't have any
> > institutional power with the providers) to use correctly and securely,
> > then it should follow that the providers who _do_ have power and
> > resources will have an easy time of it as well. The reverse isn't
> > necessarily true. So I'm definitely planning to focus on the DBA case
> > first.
> >
> > > A separate reason for creating this pre-authentication hook is further
> > > extensibility to support more metadata.
> > > Specifically when we add support for OAUTH flows to libpq, server-side
> > > extensions can help bridge the gap between the identity provider
> > > implementation and OAUTH/OIDC specs.
> > > For example, that could allow the Github extension to provide an OIDC
> > > discovery document.
> > >
> > > I definitely see identity providers as institutional actors here which
> > > can be given some power through the extension hooks to customize the
> > > behavior within the framework.
> >
> > We'll probably have to make some compromises in this area, but I think
> > they should be carefully considered exceptions and not a core feature
> > of the mechanism. The gaps you point out are just fragmentation, and
> > adding custom extensions to deal with it leads to further
> > fragmentation instead of providing pressure on providers to just
> > implement the specs. Worst case, we open up new exciting security
> > flaws, and then no one can analyze them independently because no one
> > other than the provider knows how the two sides work together anymore.
> >
> > Don't get me wrong; it would be naive to proceed as if the OAUTHBEARER
> > spec were perfect, because it's clearly not. But if we need to make
> > extensions to it, we can participate in IETF discussions and make our
> > case publicly for review, rather than enshrining MS/GitHub/Google/etc.
> > versions of the RFC and enabling that proliferation as a Postgres core
> > feature.
> >
> > > Obtaining a token is an asynchronous process with a human in the loop.
> > > Not sure if expecting a hook function to return a token synchronously
> > > is the best option here.
> > > Can that be an optional return value of the hook in cases when a token
> > > can be obtained synchronously?
> >
> > I don't think the hook is generally going to be able to return a token
> > synchronously, and I expect the final design to be async-first. As far
> > as I know, this will need to be solved for the builtin flows as well
> > (you don't want a synchronous HTTP call to block your PQconnectPoll
> > architecture), so the hook should be able to make use of whatever
> > solution we land on for that.
> >
> > This is hand-wavy, and I don't expect it to be easy to solve. I just
> > don't think we have to solve it twice.
> >
> > Have a good end to the year!
> > --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Sun, Jan 15, 2023 at 12:03 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> 2. Removed Device Code implementation in libpq. Several reasons:
>    - Reduce scope and focus on the protocol first.
>    - Device code implementation uses iddawc dependency. Taking this
> dependency is a controversial step which requires broader discussion.
>    - Device code implementation without iddaws would significantly
> increase the scope of the patch, as libpq needs to poll the token
> endpoint, setup different API calls, e.t.c.
>    - That flow should canonically only be used for clients which can't
> invoke browsers. If it is the only flow to be implemented, it can be
> used in the context when it's not expected by the OAUTH protocol.

I'm not understanding the concern in the final point -- providers
generally require you to opt into device authorization, at least as far
as I can tell. So if you decide that it's not appropriate for your use
case... don't enable it. (And I haven't seen any claims that opting into
device authorization weakens the other flows in any way. So if we're
going to implement a flow in libpq, I still think device authorization
is the best choice, since it works on headless machines as well as those
with browsers.)
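
(For anyone unfamiliar: the client half of RFC 8628 is mostly a
polling loop against the token endpoint. A rough Python sketch of the
interval handling -- illustrative only, no real HTTP here:)

```python
def next_poll_interval(interval, error):
    """Adjust the token-endpoint polling interval per RFC 8628, sec. 3.5.

    'authorization_pending' means keep waiting at the same interval;
    'slow_down' means the client MUST add 5 seconds to the interval
    for all subsequent requests; anything else is fatal for the flow.
    """
    if error == "authorization_pending":
        return interval
    if error == "slow_down":
        return interval + 5
    # e.g. "access_denied" or "expired_token"
    raise RuntimeError(f"device authorization failed: {error}")
```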

All of this points at a bigger question to the community: if we choose
not to provide a flow implementation in libpq, is adding OAUTHBEARER
worth the additional maintenance cost?

My personal vote would be "no". I think the hook-only approach proposed
here would ensure that only larger providers would implement it in
practice, and in that case I'd rather spend cycles on generic SASL.

> 3. Temporarily removed test suite. We are actively working on aligning
> the tests with the latest changes. Will add a patch with tests soon.

Okay. Case in point, the following change to the patch appears to be
invalid JSON:

> +   appendStringInfo(&buf,
> +       "{ "
> +           "\"status\": \"invalid_token\", "
> +           "\"openid-configuration\": \"%s\","
> +           "\"scope\": \"%s\" ",
> +           "\"issuer\": \"%s\" ",
> +       "}",

Additionally, the "issuer" field added here is not part of the RFC. I've
written my thoughts about unofficial extensions upthread but haven't
received a response, so I'm going to start being more strident: Please,
for the sake of reviewers, call out changes you've made to the spec, and
why they're justified.
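
(For comparison, here's what a status document that parses cleanly and
sticks to the RFC-defined fields would look like -- a Python sketch,
with hypothetical inputs:)

```python
import json

def discovery_error_body(openid_configuration, scope):
    """Build the 'invalid_token' status document from RFC 7628, sec. 3.2.2.

    The RFC defines the "status", "scope", and "openid-configuration"
    members; there is no "issuer" member.
    """
    return json.dumps({
        "status": "invalid_token",
        "openid-configuration": openid_configuration,
        "scope": scope,
    })
```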

The patches seem to be out of order now (and the documentation in the
commit messages has been removed).

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> All of this points at a bigger question to the community: if we choose
> not to provide a flow implementation in libpq, is adding OAUTHBEARER
> worth the additional maintenance cost?

> My personal vote would be "no". I think the hook-only approach proposed
> here would ensure that only larger providers would implement it in
> practice

Flow implementations in libpq are definitely a long-term plan, and I
agree that they would democratise adoption.
In the previous posts in this conversation I outlined the ones I think
we should support.

However, I don't see why it's strictly necessary to couple those.
As long as the SASL exchange for OAUTHBEARER mechanism is supported by
the protocol, the Client side can evolve at its own pace.

At the same time, the current implementation allows clients to start
building provider-agnostic OAuth support, using iddawc or the OAuth
client implementations on their respective platforms.
So I wouldn't refer to "larger providers", but rather "more motivated
clients" here. Which definitely overlaps, but keeps the system open.

> I'm not understanding the concern in the final point -- providers
> generally require you to opt into device authorization, at least as far
> as I can tell. So if you decide that it's not appropriate for your use
> case... don't enable it. (And I haven't seen any claims that opting into
> device authorization weakens the other flows in any way. So if we're
> going to implement a flow in libpq, I still think device authorization
> is the best choice, since it works on headless machines as well as those
> with browsers.)

I agree that the device code flow is the best first choice if we
absolutely have to pick one. Though I don't think we have to.

While the device flow can be used for all kinds of user-facing
applications, it's specifically designed for input-constrained
scenarios, as clearly stated in the Abstract here -
https://www.rfc-editor.org/rfc/rfc8628
The authorization code with PKCE flow is recommended by the RFCs and
major providers for cases where it's feasible.
The long-term goal is to provide both, though I don't see why
implementing the backbone protocol first wouldn't add value.

Another point is user authentication is one side of the whole story
and the other critical one is system-to-system authentication. Where
we have Client Credentials and Certificates.
With the latter it is much harder to get generically implemented, as
provider-specific tokens need to be signed.
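
For reference, the Client Credentials grant itself is simple: the
token request is a single form-encoded POST to the provider's token
endpoint (RFC 6749, sec. 4.4). A Python sketch with made-up parameter
values:

```python
from urllib.parse import urlencode

def client_credentials_body(client_id, client_secret, scope):
    """Form-encoded token request body for the client_credentials
    grant (RFC 6749, sec. 4.4)."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
```

The certificate-based variants are where the provider-specific signing
requirements come in.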

Adding the other reasoning, I think libpq support for specific flows
can get in the further iterations, after the protocol support.

> in that case I'd rather spend cycles on generic SASL.

I see two approaches to generic SASL:
(a). Generic SASL is a framework used in the protocol, with the
mechanisms implemented on top and exposed to the DBAs as auth types to
configure in hba.
This is the direction we're going here, which is well aligned with the
existing hba-based auth configuration.
(b). Generic SASL exposed to developers on the server and client
sides to extend. It seems to be a much longer shot.
The specific points of large ambiguity are libpq distribution model
(which you pointed to) and potential pluggability of insecure
mechanisms.

I do see (a) as a sweet spot with a lot of value for various
participants with much less ambiguity.

> Additionally, the "issuer" field added here is not part of the RFC. I've
> written my thoughts about unofficial extensions upthread but haven't
> received a response, so I'm going to start being more strident: Please,
> for the sake of reviewers, call out changes you've made to the spec, and
> why they're justified.

Thanks for your feedback on this. We had this discussion as well, and
added that as a convenience for the client to identify the provider.
I don't see a reason why an issuer would be absolutely necessary, so
we take your point that sticking to the RFCs is a safer choice.

> The patches seem to be out of order now (and the documentation in the
> commit messages has been removed).

Feedback taken. Work in progress.

On Tue, Jan 17, 2023 at 2:44 PM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Sun, Jan 15, 2023 at 12:03 PM Andrey Chudnovsky
> <achudnovskij@gmail.com> wrote:
> > 2. Removed Device Code implementation in libpq. Several reasons:
> >    - Reduce scope and focus on the protocol first.
> >    - Device code implementation uses iddawc dependency. Taking this
> > dependency is a controversial step which requires broader discussion.
> >    - Device code implementation without iddaws would significantly
> > increase the scope of the patch, as libpq needs to poll the token
> > endpoint, setup different API calls, e.t.c.
> >    - That flow should canonically only be used for clients which can't
> > invoke browsers. If it is the only flow to be implemented, it can be
> > used in the context when it's not expected by the OAUTH protocol.
>
> I'm not understanding the concern in the final point -- providers
> generally require you to opt into device authorization, at least as far
> as I can tell. So if you decide that it's not appropriate for your use
> case... don't enable it. (And I haven't seen any claims that opting into
> device authorization weakens the other flows in any way. So if we're
> going to implement a flow in libpq, I still think device authorization
> is the best choice, since it works on headless machines as well as those
> with browsers.)
>
> All of this points at a bigger question to the community: if we choose
> not to provide a flow implementation in libpq, is adding OAUTHBEARER
> worth the additional maintenance cost?
>
> My personal vote would be "no". I think the hook-only approach proposed
> here would ensure that only larger providers would implement it in
> practice, and in that case I'd rather spend cycles on generic SASL.
>
> > 3. Temporarily removed test suite. We are actively working on aligning
> > the tests with the latest changes. Will add a patch with tests soon.
>
> Okay. Case in point, the following change to the patch appears to be
> invalid JSON:
>
> > +   appendStringInfo(&buf,
> > +       "{ "
> > +           "\"status\": \"invalid_token\", "
> > +           "\"openid-configuration\": \"%s\","
> > +           "\"scope\": \"%s\" ",
> > +           "\"issuer\": \"%s\" ",
> > +       "}",
>
> Additionally, the "issuer" field added here is not part of the RFC. I've
> written my thoughts about unofficial extensions upthread but haven't
> received a response, so I'm going to start being more strident: Please,
> for the sake of reviewers, call out changes you've made to the spec, and
> why they're justified.
>
> The patches seem to be out of order now (and the documentation in the
> commit messages has been removed).
>
> Thanks,
> --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi All,

The "issuer" field has been removed to align with the RFC
implementation - https://www.rfc-editor.org/rfc/rfc7628.
This "v6" patch is a single patch to support the OAuth bearer token
through the psql connection string.
The flow below is supported. Documentation has been added to the
commit messages.

  +----------------------+                                 +----------+
  |             +-------+                                  | Postgres |
  | PQconnect ->|       |                                  |          |
  |             |       |                                  |   +-----------+
  |             |       | ---------- Empty Token---------> | > |           |
  |             | libpq | <-- Error(Discovery + Scope ) -- | < | Pre-Auth  |
  |          +------+   |                                  |   |  Hook     |
  |     +- < | Hook |   |                                  |   +-----------+
  |     |    +------+   |                                  |          |
  |     v       |       |                                  |          |
  |  [get token]|       |                                  |          |
  |     |       |       |                                  |          |
  |     +       |       |                                  |   +-----------+
  | PQconnect > |       | --------- Access Token --------> | > | Validator |
  |             |       | <---------- Auth Result -------- | < |   Hook    |
  |             |       |                                  |   +-----------+
  |             +-------+                                  |          |
  +----------------------+                                 +----------+

Please note that we are working on modifying/adding new tests (from
Jacob's Patch) with the latest changes. Will add a patch with tests
soon.

Thanks,
Mahendrakar.

On Wed, 18 Jan 2023 at 07:24, Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
>
> > All of this points at a bigger question to the community: if we choose
> > not to provide a flow implementation in libpq, is adding OAUTHBEARER
> > worth the additional maintenance cost?
>
> > My personal vote would be "no". I think the hook-only approach proposed
> > here would ensure that only larger providers would implement it in
> > practice
>
> Flow implementations in libpq are definitely a long term plan, and I
> agree that it would democratise the adoption.
> In the previous posts in this conversation I outlined the ones I think
> we should support.
>
> However, I don't see why it's strictly necessary to couple those.
> As long as the SASL exchange for OAUTHBEARER mechanism is supported by
> the protocol, the Client side can evolve at its own pace.
>
> At the same time, the current implementation allows clients to start
> building provider-agnostic OAUTH support. By using iddawc or OAUTH
> client implementations in the respective platforms.
> So I wouldn't refer to "larger providers", but rather "more motivated
> clients" here. Which definitely overlaps, but keeps the system open.
>
> > I'm not understanding the concern in the final point -- providers
> > generally require you to opt into device authorization, at least as far
> > as I can tell. So if you decide that it's not appropriate for your use
> > case... don't enable it. (And I haven't seen any claims that opting into
> > device authorization weakens the other flows in any way. So if we're
> > going to implement a flow in libpq, I still think device authorization
> > is the best choice, since it works on headless machines as well as those
> > with browsers.)
> I agree with the statement that Device code is the best first choice
> if we absolutely have to pick one.
> Though I don't think we have to.
>
> While device flow can be used for all kinds of user-facing
> applications, it's specifically designed for input-constrained
> scenarios. As clearly stated in the Abstract here -
> https://www.rfc-editor.org/rfc/rfc8628
> The authorization code with pkce flow is recommended by the RFSc and
> major providers for cases when it's feasible.
> The long term goal is to provide both, though I don't see why the
> backbone protocol implementation first wouldn't add value.
>
> Another point is user authentication is one side of the whole story
> and the other critical one is system-to-system authentication. Where
> we have Client Credentials and Certificates.
> With the latter it is much harder to get generically implemented, as
> provider-specific tokens need to be signed.
>
> Adding the other reasoning, I think libpq support for specific flows
> can get in the further iterations, after the protocol support.
>
> > in that case I'd rather spend cycles on generic SASL.
> I see 2 approaches to generic SASL:
> (a). Generic SASL is a framework used in the protocol, with the
> mechanisms implemented on top and exposed to the DBAs as auth types to
> configure in hba.
> This is the direction we're going here, which is well aligned with the
> existing hba-based auth configuration.
> (b). Generic SASL exposed to developers on the server- and client-
> side to extend on. It seems to be a much longer shot.
> The specific points of large ambiguity are libpq distribution model
> (which you pointed to) and potential pluggability of insecure
> mechanisms.
>
> I do see (a) as a sweet spot with a lot of value for various
> participants with much less ambiguity.
>
> > Additionally, the "issuer" field added here is not part of the RFC. I've
> > written my thoughts about unofficial extensions upthread but haven't
> > received a response, so I'm going to start being more strident: Please,
> > for the sake of reviewers, call out changes you've made to the spec, and
> > why they're justified.
> Thanks for your feedback on this. We had this discussion as well, and
> added that as a convenience for the client to identify the provider.
> I don't see a reason why an issuer would be absolutely necessary, so
> we will get your point that sticking to RFCs is a safer choice.
>
> > The patches seem to be out of order now (and the documentation in the
> > commit messages has been removed).
> Feedback taken. Work in progress.
>
> On Tue, Jan 17, 2023 at 2:44 PM Jacob Champion <jchampion@timescale.com> wrote:
> >
> > On Sun, Jan 15, 2023 at 12:03 PM Andrey Chudnovsky
> > <achudnovskij@gmail.com> wrote:
> > > 2. Removed Device Code implementation in libpq. Several reasons:
> > >    - Reduce scope and focus on the protocol first.
> > >    - Device code implementation uses iddawc dependency. Taking this
> > > dependency is a controversial step which requires broader discussion.
> > >    - Device code implementation without iddaws would significantly
> > > increase the scope of the patch, as libpq needs to poll the token
> > > endpoint, setup different API calls, e.t.c.
> > >    - That flow should canonically only be used for clients which can't
> > > invoke browsers. If it is the only flow to be implemented, it can be
> > > used in the context when it's not expected by the OAUTH protocol.
> >
> > I'm not understanding the concern in the final point -- providers
> > generally require you to opt into device authorization, at least as far
> > as I can tell. So if you decide that it's not appropriate for your use
> > case... don't enable it. (And I haven't seen any claims that opting into
> > device authorization weakens the other flows in any way. So if we're
> > going to implement a flow in libpq, I still think device authorization
> > is the best choice, since it works on headless machines as well as those
> > with browsers.)
> >
> > All of this points at a bigger question to the community: if we choose
> > not to provide a flow implementation in libpq, is adding OAUTHBEARER
> > worth the additional maintenance cost?
> >
> > My personal vote would be "no". I think the hook-only approach proposed
> > here would ensure that only larger providers would implement it in
> > practice, and in that case I'd rather spend cycles on generic SASL.
> >
> > > 3. Temporarily removed test suite. We are actively working on aligning
> > > the tests with the latest changes. Will add a patch with tests soon.
> >
> > Okay. Case in point, the following change to the patch appears to be
> > invalid JSON:
> >
> > > +   appendStringInfo(&buf,
> > > +       "{ "
> > > +           "\"status\": \"invalid_token\", "
> > > +           "\"openid-configuration\": \"%s\","
> > > +           "\"scope\": \"%s\" ",
> > > +           "\"issuer\": \"%s\" ",
> > > +       "}",
> >
> > Additionally, the "issuer" field added here is not part of the RFC. I've
> > written my thoughts about unofficial extensions upthread but haven't
> > received a response, so I'm going to start being more strident: Please,
> > for the sake of reviewers, call out changes you've made to the spec, and
> > why they're justified.
> >
> > The patches seem to be out of order now (and the documentation in the
> > commit messages has been removed).
> >
> > Thanks,
> > --Jacob
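For reference, a sketch of what a well-formed version of that error body would serialize to, written in Python (the patchset's test-suite language) purely for illustration: using a real JSON encoder sidesteps exactly this class of misplaced-comma bug. Per RFC 7628, "status" is the only required member of the failure document, and the non-standard "issuer" field is omitted here.

```python
import json

def oauthbearer_error_body(openid_config_url, scope):
    """Build an RFC 7628-style server error response body.

    "status" is required by the RFC; "openid-configuration" and
    "scope" are optional hints for the client. A JSON serializer
    guarantees the commas and quoting come out right.
    """
    return json.dumps({
        "status": "invalid_token",
        "openid-configuration": openid_config_url,
        "scope": scope,
    })

body = oauthbearer_error_body(
    "https://example.org/.well-known/openid-configuration", "openid")
print(body)
```

Round-tripping the result through a JSON parser is a cheap regression test that the C string-assembly version of this code would have failed.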

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Stephen Frost
Date:
Greetings,

* mahendrakar s (mahendrakarforpg@gmail.com) wrote:
> The "issuer" field has been removed to align  with the RFC
> implementation - https://www.rfc-editor.org/rfc/rfc7628.
> This patch "v6" is a single patch to support the OAUTH BEARER token
> through psql connection string.
> Below flow is supported. Added the documentation in the commit messages.
>
>  +----------------------+                                 +----------+
>   |             +-------+                                  | Postgres |
>   | PQconnect ->|       |                                  |          |
>   |             |       |                                  |   +-----------+
>   |             |       | ---------- Empty Token---------> | > |           |
>   |             | libpq | <-- Error(Discovery + Scope ) -- | < | Pre-Auth  |
>   |          +------+   |                                  |   |  Hook     |
>   |     +- < | Hook |   |                                  |   +-----------+
>   |     |    +------+   |                                  |          |
>   |     v       |       |                                  |          |
>   |  [get token]|       |                                  |          |
>   |     |       |       |                                  |          |
>   |     +       |       |                                  |   +-----------+
>   | PQconnect > |       | --------- Access Token --------> | > | Validator |
>   |             |       | <---------- Auth Result -------- | < |   Hook    |
>   |             |       |                                  |   +-----------+
>   |             +-------+                                  |          |
>   +----------------------+                                 +----------+
>
> Please note that we are working on modifying/adding new tests (from
> Jacob's Patch) with the latest changes. Will add a patch with tests
> soon.

Having skimmed back through this thread again, I still feel that the
direction that was originally being taken (actually support something in
libpq and the backend, be it with libiddawc or something else or even
our own code, and not just throw hooks in various places) makes a lot
more sense and is a lot closer to how Kerberos and client-side certs and
even LDAP auth work today.  That also seems like a much better answer
for our users when it comes to new authentication methods than having
extensions and making libpq developers have to write their own custom
code, not to mention that we'd still need to implement something in psql
to provide such a hook if we are to have psql actually usefully exercise
this, no?

In the Kerberos test suite we have today, we actually bring up a proper
Kerberos server, set things up, and then test end-to-end installing a
keytab for the server, getting a TGT, getting a service ticket, testing
authentication and encryption, etc.  Looking around, it seems like the
equivalent would perhaps be to use Glewlwyd and libiddawc or libcurl and
our own code to really be able to test this and show that it works and
that we're doing it correctly, and to let us know if we break something.

Thanks,

Stephen

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Feb 20, 2023 at 2:35 PM Stephen Frost <sfrost@snowman.net> wrote:
> Having skimmed back through this thread again, I still feel that the
> direction that was originally being taken (actually support something in
> libpq and the backend, be it with libiddawc or something else or even
> our own code, and not just throw hooks in various places) makes a lot
> more sense and is a lot closer to how Kerberos and client-side certs and
> even LDAP auth work today.

Cool, that helps focus the effort. Thanks!

> That also seems like a much better answer
> for our users when it comes to new authentication methods than having
> extensions and making libpq developers have to write their own custom
> code, not to mention that we'd still need to implement something in psql
> to provide such a hook if we are to have psql actually usefully exercise
> this, no?

I don't mind letting clients implement their own flows... as long as
it's optional. So even if we did use a hook in the end, I agree that
we've got to exercise it ourselves.

> In the Kerberos test suite we have today, we actually bring up a proper
> Kerberos server, set things up, and then test end-to-end installing a
> keytab for the server, getting a TGT, getting a service ticket, testing
> authentication and encryption, etc.  Looking around, it seems like the
> equivilant would perhaps be to use Glewlwyd and libiddawc or libcurl and
> our own code to really be able to test this and show that it works and
> that we're doing it correctly, and to let us know if we break something.

The original patchset includes a test server in Python -- a major
advantage being that you can test the client and server independently
of each other, since the implementation is so asymmetric. Additionally
testing against something like Glewlwyd would be a great way to stack
coverage. (If we *only* test against a packaged server, though, it'll
be harder to test our stuff in the presence of malfunctions and other
corner cases.)

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Thanks for the feedback,

> Having skimmed back through this thread again, I still feel that the
> direction that was originally being taken (actually support something in
> libpq and the backend, be it with libiddawc or something else or even
> our own code, and not just throw hooks in various places) makes a lot
> more sense and is a lot closer to how Kerberos and client-side certs and
> even LDAP auth work today.  That also seems like a much better answer
> for our users when it comes to new authentication methods than having
> extensions and making libpq developers have to write their own custom
> code, not to mention that we'd still need to implement something in psql
> to provide such a hook if we are to have psql actually usefully exercise
> this, no?

A libpq implementation is the long-term plan. However, our intention is
to start with the protocol implementation, which gives us a foundation
to build on.

While the device code flow is the right solution for psql, having it as
the only one creates an incentive to use it in cases where it's not
intended.
A reasonably complete implementation should support all of the following:
(1.) authorization code with PKCE (for GUI applications)
(2.) device code (for console user logins)
(3.) client secret
(4.) some support for the client certificate flow

(1.) and (4.) require more work to implement, though they are necessary
for encouraging the most secure grant types.
As we don't have those pieces yet, we propose starting with the
protocol, which the ecosystem can use to build token flow
implementations, and then adding libpq support for individual grant
types.

We originally looked at starting with the bare-bones protocol for PG16
and adding libpq support in PG17.
That plan won't happen now, though splitting the work into separate
stages would still make more sense in my opinion.

Several questions to follow up:
(a.) Would you support committing the protocol first? or you see libpq
implementation for grants as the prerequisite to consider the auth
type?
(b.) As of today, the server-side core does not validate that the
token is actually a valid JWT; it relies on extensions to do the
validation.
Do you think the server core should do basic validation before passing
the token to extensions, to prevent the auth type from being used for
anything other than OAuth flows?
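As a sketch of the kind of basic check (b.) is asking about -- written in Python here, like the patchset's test suite, purely for illustration -- the server core could verify that the credential is at least shaped like a JWT before handing it to the validator. (Caveat: bearer tokens are not required to be JWTs, so such a check would have to be optional or provider-configurable.)

```python
import base64, json

def looks_like_jwt(token):
    """Cheap structural check: a JWT is three dot-separated base64url
    segments whose first two decode to JSON objects. This does NOT
    verify the signature -- that would still be the validator's job."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    try:
        for seg in parts[:2]:
            # base64url without padding; re-add padding before decoding
            padded = seg + "=" * (-len(seg) % 4)
            obj = json.loads(base64.urlsafe_b64decode(padded))
            if not isinstance(obj, dict):
                return False
    except (ValueError, TypeError):
        return False
    return True

def b64url(obj):
    return base64.urlsafe_b64encode(
        json.dumps(obj).encode()).decode().rstrip("=")

token = f'{b64url({"alg": "RS256"})}.{b64url({"iss": "https://example.org"})}.sig'
print(looks_like_jwt(token))        # True
print(looks_like_jwt("not-a-jwt"))  # False
```

Anything heavier (signature verification, issuer matching) would still belong to the provider-specific validator.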

Tests are the plan for the commit-ready implementation.

Thanks!
Andrey.

On Tue, Feb 21, 2023 at 2:24 PM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Mon, Feb 20, 2023 at 2:35 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Having skimmed back through this thread again, I still feel that the
> > direction that was originally being taken (actually support something in
> > libpq and the backend, be it with libiddawc or something else or even
> > our own code, and not just throw hooks in various places) makes a lot
> > more sense and is a lot closer to how Kerberos and client-side certs and
> > even LDAP auth work today.
>
> Cool, that helps focus the effort. Thanks!
>
> > That also seems like a much better answer
> > for our users when it comes to new authentication methods than having
> > extensions and making libpq developers have to write their own custom
> > code, not to mention that we'd still need to implement something in psql
> > to provide such a hook if we are to have psql actually usefully exercise
> > this, no?
>
> I don't mind letting clients implement their own flows... as long as
> it's optional. So even if we did use a hook in the end, I agree that
> we've got to exercise it ourselves.
>
> > In the Kerberos test suite we have today, we actually bring up a proper
> > Kerberos server, set things up, and then test end-to-end installing a
> > keytab for the server, getting a TGT, getting a service ticket, testing
> > authentication and encryption, etc.  Looking around, it seems like the
> > equivilant would perhaps be to use Glewlwyd and libiddawc or libcurl and
> > our own code to really be able to test this and show that it works and
> > that we're doing it correctly, and to let us know if we break something.
>
> The original patchset includes a test server in Python -- a major
> advantage being that you can test the client and server independently
> of each other, since the implementation is so asymmetric. Additionally
> testing against something like Glewlwyd would be a great way to stack
> coverage. (If we *only* test against a packaged server, though, it'll
> be harder to test our stuff in the presence of malfunctions and other
> corner cases.)
>
> Thanks,
> --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Stephen Frost
Date:
Greetings,

* Jacob Champion (jchampion@timescale.com) wrote:
> On Mon, Feb 20, 2023 at 2:35 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Having skimmed back through this thread again, I still feel that the
> > direction that was originally being taken (actually support something in
> > libpq and the backend, be it with libiddawc or something else or even
> > our own code, and not just throw hooks in various places) makes a lot
> > more sense and is a lot closer to how Kerberos and client-side certs and
> > even LDAP auth work today.
>
> Cool, that helps focus the effort. Thanks!

Great, glad to hear that.

> > That also seems like a much better answer
> > for our users when it comes to new authentication methods than having
> > extensions and making libpq developers have to write their own custom
> > code, not to mention that we'd still need to implement something in psql
> > to provide such a hook if we are to have psql actually usefully exercise
> > this, no?
>
> I don't mind letting clients implement their own flows... as long as
> it's optional. So even if we did use a hook in the end, I agree that
> we've got to exercise it ourselves.

This really doesn't feel like a great area to try and do hooks or
similar in, not least because that approach has been tried and tried
again (PAM, GSSAPI, SASL would all be examples..) and frankly none of
them has turned out great (which is why we can't just tell people "well,
install pam_oauth2 and watch everything work!").  This strikes me as
trying to do that yet again, but worse, as it's not even a dedicated
project trying to solve the problem but more like a side project.  SCRAM
was good, we've come a long way thanks to that, and this feels like it
should be more in line with that rather than trying to invent yet
another new "generic" set of hooks/APIs that will just cause DBAs and
our users headaches trying to make work.

> > In the Kerberos test suite we have today, we actually bring up a proper
> > Kerberos server, set things up, and then test end-to-end installing a
> > keytab for the server, getting a TGT, getting a service ticket, testing
> > authentication and encryption, etc.  Looking around, it seems like the
> > equivilant would perhaps be to use Glewlwyd and libiddawc or libcurl and
> > our own code to really be able to test this and show that it works and
> > that we're doing it correctly, and to let us know if we break something.
>
> The original patchset includes a test server in Python -- a major
> advantage being that you can test the client and server independently
> of each other, since the implementation is so asymmetric. Additionally
> testing against something like Glewlwyd would be a great way to stack
> coverage. (If we *only* test against a packaged server, though, it'll
> be harder to test our stuff in the presence of malfunctions and other
> corner cases.)

Oh, that's even better- I agree entirely that having test code that can
be instructed to return specific errors so that we can test that our
code responds properly is great (and is why pgbackrest has things like
a stub'd out libpq, fake s3, GCS, and Azure servers, and more) and would
certainly want to keep that, even if we also build out a test that uses
a real server to provide integration testing with not-our-code too.

Thanks!

Stephen

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> This really doesn't feel like a great area to try and do hooks or
> similar in, not the least because that approach has been tried and tried
> again (PAM, GSSAPI, SASL would all be examples..) and frankly none of
> them has turned out great (which is why we can't just tell people "well,
> install the pam_oauth2 and watch everything work!") and this strikes me
> as trying to do that yet again but worse as it's not even a dedicated
> project trying to solve the problem but more like a side project.

In this case it's not intended to be an open-ended hook, but rather an
implementation of a specific RFC (RFC 7628), which defines the
client-server communication for the authentication flow.
The RFC itself leaves a lot of flexibility on specific parts of the
implementation, which do require hooks:
(1.) A server-side hook to validate the token, which is specific to the
OAuth provider.
(2.) A client-side hook to request that the client obtain the token.

On (1.), we would need a hook for the OAuth provider extension to do
validation. We could, though, do some basic checking that the credential
is indeed a JWT signed by the requested issuer.

(2.) is specifically where we can provide a layer in libpq to simplify
the integration, i.e. implement some OAuth flows.
We would still need some flexibility for clients to bring their own
token: for example, there are cases where the credential used to obtain
the token is stored in a separate secure location, and the token is
returned from a separate service or pushed from a more secure
environment.

> another new "generic" set of hooks/APIs that will just cause DBAs and
> our users headaches trying to make work.
As I mentioned above, it's an RFC implementation rather than our invention.
When it comes to DBAs and users, built-in libpq implementations that
allow psql and pgAdmin to connect seamlessly should cover those needs,
while extensibility would keep the ecosystem open for OAuth providers,
SaaS developers, PaaS providers, and other institutional players.

Thanks!
Andrey.

On Thu, Feb 23, 2023 at 10:47 AM Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Jacob Champion (jchampion@timescale.com) wrote:
> > On Mon, Feb 20, 2023 at 2:35 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > Having skimmed back through this thread again, I still feel that the
> > > direction that was originally being taken (actually support something in
> > > libpq and the backend, be it with libiddawc or something else or even
> > > our own code, and not just throw hooks in various places) makes a lot
> > > more sense and is a lot closer to how Kerberos and client-side certs and
> > > even LDAP auth work today.
> >
> > Cool, that helps focus the effort. Thanks!
>
> Great, glad to hear that.
>
> > > That also seems like a much better answer
> > > for our users when it comes to new authentication methods than having
> > > extensions and making libpq developers have to write their own custom
> > > code, not to mention that we'd still need to implement something in psql
> > > to provide such a hook if we are to have psql actually usefully exercise
> > > this, no?
> >
> > I don't mind letting clients implement their own flows... as long as
> > it's optional. So even if we did use a hook in the end, I agree that
> > we've got to exercise it ourselves.
>
> This really doesn't feel like a great area to try and do hooks or
> similar in, not the least because that approach has been tried and tried
> again (PAM, GSSAPI, SASL would all be examples..) and frankly none of
> them has turned out great (which is why we can't just tell people "well,
> install the pam_oauth2 and watch everything work!") and this strikes me
> as trying to do that yet again but worse as it's not even a dedicated
> project trying to solve the problem but more like a side project.  SCRAM
> was good, we've come a long way thanks to that, this feels like it
> should be more in line with that rather than trying to invent yet
> another new "generic" set of hooks/APIs that will just cause DBAs and
> our users headaches trying to make work.
>
> > > In the Kerberos test suite we have today, we actually bring up a proper
> > > Kerberos server, set things up, and then test end-to-end installing a
> > > keytab for the server, getting a TGT, getting a service ticket, testing
> > > authentication and encryption, etc.  Looking around, it seems like the
> > > equivilant would perhaps be to use Glewlwyd and libiddawc or libcurl and
> > > our own code to really be able to test this and show that it works and
> > > that we're doing it correctly, and to let us know if we break something.
> >
> > The original patchset includes a test server in Python -- a major
> > advantage being that you can test the client and server independently
> > of each other, since the implementation is so asymmetric. Additionally
> > testing against something like Glewlwyd would be a great way to stack
> > coverage. (If we *only* test against a packaged server, though, it'll
> > be harder to test our stuff in the presence of malfunctions and other
> > corner cases.)
>
> Oh, that's even better- I agree entirely that having test code that can
> be instructed to return specific errors so that we can test that our
> code responds properly is great (and is why pgbackrest has things like
> a stub'd out libpq, fake s3, GCS, and Azure servers, and more) and would
> certainly want to keep that, even if we also build out a test that uses
> a real server to provide integration testing with not-our-code too.
>
> Thanks!
>
> Stephen



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Stephen Frost
Date:
Greetings,

* Andrey Chudnovsky (achudnovskij@gmail.com) wrote:
> > This really doesn't feel like a great area to try and do hooks or
> > similar in, not the least because that approach has been tried and tried
> > again (PAM, GSSAPI, SASL would all be examples..) and frankly none of
> > them has turned out great (which is why we can't just tell people "well,
> > install the pam_oauth2 and watch everything work!") and this strikes me
> > as trying to do that yet again but worse as it's not even a dedicated
> > project trying to solve the problem but more like a side project.
>
> In this case it's not intended to be an open-ended hook, but rather an
> implementation of a specific rfc (rfc-7628) which defines a
> client-server communication for the authentication flow.
> The rfc itself does leave a lot of flexibility on specific parts of
> the implementation. Which do require hooks:

Color me skeptical on an RFC that requires hooks.

> (1.) Server side hook to validate the token, which is specific to the
> OAUTH provider.
> (2.) Client side hook to request the client to obtain the token.

Perhaps I'm missing it... but weren't these handled with what the
original patch that Jacob had was doing?

> On (1.), we would need a hook for the OAUTH provider extension to do
> validation. We can though do some basic check that the credential is
> indeed a JWT token signed by the requested issuer.
>
> Specifically (2.) is where we can provide a layer in libpq to simplify
> the integration. i.e. implement some OAUTH flows.
> Though we would need some flexibility for the clients to bring their own token:
> For example there are cases where the credential to obtain the token
> is stored in a separate secure location and the token is returned from
> a separate service or pushed from a more secure environment.

In those cases... we could, if we wanted, simply implement the code to
actually pull the token, no?  We don't *have* to have a hook here for
this, we could just make it work.

> > another new "generic" set of hooks/APIs that will just cause DBAs and
> > our users headaches trying to make work.
> As I mentioned above, it's an rfc implementation, rather than our invention.

While I only took a quick look, I didn't see anything in that RFC that
explicitly says that hooks or a plugin or a library or such is required
to meet the RFC.  Sure, there are places which say that the
implementation is specific to a particular server or client but that's
not the same thing.

> When it comes to DBAs and the users.
> Builtin libpq implementations which allows psql and pgadmin to
> seamlessly connect should suffice those needs.
> While extensibility would allow the ecosystem to be open for OAUTH
> providers, SAAS developers, PAAS providers and other institutional
> players.

Each to end up writing their own code to do largely the same thing
without the benefit of the larger community to be able to review and
ensure that it's done properly?

That doesn't sound like a great approach to me.

Thanks,

Stephen

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Sep 23, 2022 at 3:39 PM Jacob Champion <jchampion@timescale.com> wrote:
> Here's a newly rebased v5. (They're all zipped now, which I probably
> should have done a while back, sorry.)

To keep this current, v7 is rebased over latest, without the pluggable
authentication patches. This doesn't yet address the architectural
feedback that was discussed previously, so if you're primarily
interested in that, you can safely ignore this version of the
patchset.

The key changes here include
- Meson support, for both the build and the pytest suite
- Cirrus support (and unsurprisingly, Mac and Windows builds fail due
to the Linux-oriented draft code)
- A small tweak to support iddawc down to 0.9.8 (shipped with e.g.
Debian Bullseye)
- Removal of the authn_id test extension in favor of SYSTEM_USER

The meson+pytest support was big enough that I split it into its own
patch. It's not very polished yet, but it mostly works, and when
running tests via Meson it'll now spin up a test server for you. My
virtualenv approach apparently interacts poorly with the multiarch
Cirrus setup (64-bit tests pass, 32-bit tests fail).

Moving forward, the first thing I plan to tackle is asynchronous
operation, so that polling clients can still operate sanely. If I can
find a good solution there, the conversations about possible extension
points should get a lot easier.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 4/27/23 10:35, Jacob Champion wrote:
> Moving forward, the first thing I plan to tackle is asynchronous
> operation, so that polling clients can still operate sanely. If I can
> find a good solution there, the conversations about possible extension
> points should get a lot easier.

Attached is patchset v8, now with concurrency and 300% more cURL! And
many more questions to answer.

This is a full reimplementation of the client-side OAuth flow. It's an
async-first engine built on top of cURL's multi handles. All pending
operations are multiplexed into a single epoll set (the "altsock"),
which is exposed through PQsocket() for the duration of the OAuth flow.
Clients return to the flow on their next call to PQconnectPoll().

Andrey and Mahendrakar: you'll probably be interested in the
conn->async_auth() callback, conn->altsock, and the pg_fe_run_oauth_flow
entry point. This is intended to be the foundation for alternative flows.

I've kept the blocking iddawc implementation for comparison, but if
you're running the tests against it, be aware that the asynchronous
tests will, predictably, hang. Skip them with `py.test -k 'not
asynchronous'`.

= The Good =

- PQconnectPoll() is no longer indefinitely blocked on a single
connection's OAuth handshake. (iddawc doesn't appear to have any
asynchronous primitives in its API, unless I've missed something crucial.)

- We now have a swappable entry point. Alternative flows could be
implemented by applications without forcing clients to redesign their
polling loops (PQconnect* should just work as expected).

- We have full control over corner cases in our default flow. Debugging
failures is much nicer, with explanations of exactly what has gone wrong
and where, compared to iddawc's "I_ERROR" messages.

- cURL is not a lightweight library by any means, but we're no longer
bundling things like web servers that we're not going to use.

= The Bad =

- Unsurprisingly, there's a lot more code now that we're implementing
the flow ourselves. The client patch has tripled in size, and we'd be on
the hook for implementing and staying current with the RFCs.

- The client implementation is currently epoll-/Linux-specific. I think
kqueue shouldn't be too much trouble for the BSDs, but it's even more
code to maintain.

- Some clients in the wild (psycopg2/psycopg) suppress all notifications
during PQconnectPoll(). To accommodate them, I no longer use the
noticeHooks for communicating the user code, but that means we have to
come up with some other way to let applications override the printing to
stderr. Something like the OpenSSL decryption callback, maybe?

= The Ugly =

- Unless someone is aware of some amazing Winsock magic, I'm pretty sure
the multiplexed-socket approach is dead in the water on Windows. I think
the strategy there probably has to be a background thread plus a fake
"self-pipe" (loopback socket) for polling... which may be controversial?

- We have to figure out how to initialize cURL in a thread-safe manner.
Newer versions of libcurl and OpenSSL improve upon this situation, but I
don't think there's a way to check at compile time whether the
initialization strategy is safe or not (and even at runtime, I think
there may be a chicken-and-egg problem with the API, where it's not safe
to check for thread-safe initialization until after you've safely
initialized).

= Next Steps =

There are so many TODOs in the cURL implementation: it's been a while
since I've done any libcurl programming, it all needs to be hardened,
and I need to comb through the relevant specs again. But I don't want to
gold-plate it if this overall approach is unacceptable. So, questions
for the gallery:

1) Would starting up a background thread (pooled or not) be acceptable
on Windows? Alternatively, does anyone know enough Winsock deep magic to
combine multiple pending events into one (selectable!) socket?

2) If a background thread is acceptable on one platform, does it make
more sense to use one on every platform and just have synchronous code
everywhere? Or should we use a threadless async implementation when we can?

3) Is the current conn->async_auth() entry point sufficient for an
application to implement the Microsoft flows discussed upthread?

4) Would we want to try to require a new enough cURL/OpenSSL to avoid
thread safety problems during initialization, or do we need to introduce
some API equivalent to PQinitOpenSSL?

5) Does this maintenance tradeoff (full control over the client vs. a
large amount of RFC-governed code) seem like it could be okay?

Thanks,
--Jacob
Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniele Varrazzo
Date:
On Sat, 20 May 2023 at 00:01, Jacob Champion <jchampion@timescale.com> wrote:

> - Some clients in the wild (psycopg2/psycopg) suppress all notifications
> during PQconnectPoll().

If there is anything we can improve in psycopg please reach out.

-- Daniele



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, May 23, 2023 at 4:22 AM Daniele Varrazzo
<daniele.varrazzo@gmail.com> wrote:
> On Sat, 20 May 2023 at 00:01, Jacob Champion <jchampion@timescale.com> wrote:
> > - Some clients in the wild (psycopg2/psycopg) suppress all notifications
> > during PQconnectPoll().
>
> If there is anything we can improve in psycopg please reach out.

Will do, thank you! But in this case, I think there's nothing to
improve in psycopg -- in fact, it highlighted the problem with my
initial design, and now I think the notice processor will never be an
appropriate avenue for communication of the user code.

The biggest issue is that there's a chicken-and-egg situation: if
you're using the synchronous PQconnect* API, you can't override the
notice hooks while the handshake is in progress, because you don't
have a connection handle yet. The second problem is that there are a
bunch of parameters coming back from the server (user code,
verification URI, expiration time) that the application may choose to
display or use, and communicating those pieces in a (probably already
translated) flat text string is a pretty hostile API.

So I think we'll probably need to provide a global handler API,
similar to the passphrase hook we currently provide, that can receive
these pieces separately and assemble them however the application
desires. The hard part will be to avoid painting ourselves into a
corner, because this particular information is specific to the device
authorization flow, and if we ever want to add other flows into libpq,
we'll probably not want to add even more hooks.
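To make the shape of such a handler concrete, here is a sketch in Python; every name is hypothetical (the real API would be a C hook in libpq, and nothing like this exists in the patch yet). The point is that the device-flow pieces arrive as separate fields, with a stderr-style fallback when no handler is installed:

```python
from dataclasses import dataclass

@dataclass
class DeviceAuthPrompt:
    user_code: str          # e.g. "FPQ2-M4BG"
    verification_uri: str   # where the user should log in
    expires_in: int         # seconds until the code expires

_prompt_hook = None

def set_device_prompt_hook(fn):
    """Register an application-wide handler, analogous to libpq's
    global SSL passphrase hook. Name is hypothetical."""
    global _prompt_hook
    _prompt_hook = fn

def deliver_prompt(prompt):
    """Called by the flow; falls back to the flat text message if the
    application didn't install a handler."""
    if _prompt_hook is not None:
        _prompt_hook(prompt)
    else:
        print(f"Visit {prompt.verification_uri} "
              f"and enter the code: {prompt.user_code}")

seen = []
set_device_prompt_hook(lambda p: seen.append((p.user_code, p.verification_uri)))
deliver_prompt(DeviceAuthPrompt("FPQ2-M4BG", "https://oauth.example.org/login", 900))
print(seen)
```

Because the handler gets structured fields rather than a translated string, a GUI client could render a clickable link and a copyable code instead of parsing text.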

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 5/19/23 15:01, Jacob Champion wrote:
> But I don't want to
> gold-plate it if this overall approach is unacceptable. So, questions
> for the gallery:
> 
> 1) Would starting up a background thread (pooled or not) be acceptable
> on Windows? Alternatively, does anyone know enough Winsock deep magic to
> combine multiple pending events into one (selectable!) socket?
> 
> 2) If a background thread is acceptable on one platform, does it make
> more sense to use one on every platform and just have synchronous code
> everywhere? Or should we use a threadless async implementation when we can?
> 
> 3) Is the current conn->async_auth() entry point sufficient for an
> application to implement the Microsoft flows discussed upthread?
> 
> 4) Would we want to try to require a new enough cURL/OpenSSL to avoid
> thread safety problems during initialization, or do we need to introduce
> some API equivalent to PQinitOpenSSL?
> 
> 5) Does this maintenance tradeoff (full control over the client vs. a
> large amount of RFC-governed code) seem like it could be okay?

There was additional interest at PGCon, so I've registered this in the
commitfest.

Potential reviewers should be aware that the current implementation
requires Linux (or, more specifically, epoll), as the cfbot shows. But
if you have any opinions on the above questions, those will help me
tackle the other platforms. :D

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Sat, May 20, 2023 at 10:01 AM Jacob Champion <jchampion@timescale.com> wrote:
> - The client implementation is currently epoll-/Linux-specific. I think
> kqueue shouldn't be too much trouble for the BSDs, but it's even more
> code to maintain.

I guess you also need a fallback that uses plain old POSIX poll()?  I
see you're not just using epoll but also timerfd.  Could that be
converted to plain old timeout bookkeeping?  That should be enough to
get every other Unix and *possibly* also Windows to work with the same
code path.

> - Unless someone is aware of some amazing Winsock magic, I'm pretty sure
> the multiplexed-socket approach is dead in the water on Windows. I think
> the strategy there probably has to be a background thread plus a fake
> "self-pipe" (loopback socket) for polling... which may be controversial?

I am not a Windows user or hacker, but there are certainly several
ways to multiplex sockets.  First there is the WSAEventSelect() +
WaitForMultipleObjects() approach that latch.c uses.  It has the
advantage that it allows socket readiness to be multiplexed with
various other things that use Windows "events".  But if you don't need
that, ie you *only* need readiness-based wakeup for a bunch of sockets
and no other kinds of fd or object, you can use winsock's plain old
select() or its fairly faithful poll() clone called WSAPoll().  It
looks a bit like that'd be true here if you could kill the timerfd?

It's a shame to write modern code using select(), but you can find
lots of shouting all over the internet about WSAPoll()'s defects, most
famously the cURL guys[1] whose blog is widely cited, so people still
do it.  Possibly some good news on that front:  by my reading of the
docs, it looks like that problem was fixed in Windows 10 2004[2] which
itself is by now EOL, so all systems should have the fix?  I suspect
that means that, finally, you could probably just use the same poll()
code path for Unix (when epoll is not available) *and* Windows these
days, making porting a lot easier.  But I've never tried it, so I
don't know what other problems there might be.  Another thing people
complain about is the lack of socketpair() or similar in winsock which
means you unfortunately can't easily make anonymous
select/poll-compatible local sockets, but that doesn't seem to be
needed here.

[1] https://daniel.haxx.se/blog/2012/10/10/wsapoll-is-broken/
[2] https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsapoll



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Jun 30, 2023 at 9:29 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Sat, May 20, 2023 at 10:01 AM Jacob Champion <jchampion@timescale.com> wrote:
> > - The client implementation is currently epoll-/Linux-specific. I think
> > kqueue shouldn't be too much trouble for the BSDs, but it's even more
> > code to maintain.
>
> I guess you also need a fallback that uses plain old POSIX poll()?

The use of the epoll API here is to combine several sockets into one,
not to actually call epoll_wait() itself. kqueue descriptors should
let us do the same, IIUC.

> I see you're not just using epoll but also timerfd.  Could that be
> converted to plain old timeout bookkeeping?  That should be enough to
> get every other Unix and *possibly* also Windows to work with the same
> code path.

I might be misunderstanding your suggestion, but I think our internal
bookkeeping is orthogonal to that. The use of timerfd here allows us
to forward libcurl's timeout requirements up to the top-level
PQsocket(). As an example, libcurl is free to tell us to call it again
in ten milliseconds, and we have to make sure a nonblocking client
calls us again after that elapses; otherwise they might hang waiting
for data that's not coming.

> > - Unless someone is aware of some amazing Winsock magic, I'm pretty sure
> > the multiplexed-socket approach is dead in the water on Windows. I think
> > the strategy there probably has to be a background thread plus a fake
> > "self-pipe" (loopback socket) for polling... which may be controversial?
>
> I am not a Windows user or hacker, but there are certainly several
> ways to multiplex sockets.  First there is the WSAEventSelect() +
> WaitForMultipleObjects() approach that latch.c uses.

I don't think that strategy plays well with select() clients, though
-- it requires a handle array, and we've just got the one socket.

My goal is to maintain compatibility with existing PQconnectPoll()
applications, where the only way we get to communicate with the client
is through the PQsocket() for the connection. Ideally, you shouldn't
have to completely rewrite your application loop just to make use of
OAuth. (I assume a requirement like that would be a major roadblock to
committing this -- and if that's not a correct assumption, then I
guess my job gets a lot easier?)

> It's a shame to write modern code using select(), but you can find
> lots of shouting all over the internet about WSAPoll()'s defects, most
> famously the cURL guys[1] whose blog is widely cited, so people still
> do it.

Right -- that's basically the root of my concern. I can't guarantee
that existing Windows clients out there are all using
WaitForMultipleObjects(). From what I can tell, whatever we hand up
through PQsocket() has to be fully Winsock-/select-compatible.

> Another thing people
> complain about is the lack of socketpair() or similar in winsock which
> means you unfortunately can't easily make anonymous
> select/poll-compatible local sockets, but that doesn't seem to be
> needed here.

For the background-thread implementation, it probably would be. I've
been looking at libevent (BSD-licensed) and its socketpair hack for
Windows...

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Thu, Jul 6, 2023 at 9:00 AM Jacob Champion <jchampion@timescale.com> wrote:
> My goal is to maintain compatibility with existing PQconnectPoll()
> applications, where the only way we get to communicate with the client
> is through the PQsocket() for the connection. Ideally, you shouldn't
> have to completely rewrite your application loop just to make use of
> OAuth. (I assume a requirement like that would be a major roadblock to
> committing this -- and if that's not a correct assumption, then I
> guess my job gets a lot easier?)

Ah, right, I get it.

I guess there are a couple of ways to do it if we give up the goal of
no-code-change-for-the-client:

1.  Generalised PQsocket(), so that a client can call something like:

int PQpollset(const PGConn *conn, struct pollfd fds[], int fds_size,
int *nfds, int *timeout_ms);

That way, libpq could tell you about which events it would like to
wait for on which fds, and when it would like you to call it back due
to timeout, and you can either pass that information directly to
poll() or WSAPoll() or some equivalent interface (we don't care, we
just gave you the info you need), or combine it in obvious ways with
whatever else you want to multiplex with in your client program.

2.  Convert those events into new libpq events like 'I want you to
call me back in 100ms', and 'call me back when socket #42 has data',
and let clients handle that by managing their own poll set etc.  (This
is something I've speculated about to support more efficient
postgres_fdw shard query multiplexing; gotta figure out how to get
multiple connections' events into one WaitEventSet...)

I guess there is a practical middle ground where client code on
systems that have epoll/kqueue can use OAUTHBEARER without any code
change, and the feature is available on other systems too but you'll
have to change your client code to use one of those interfaces or else
you get an error 'coz we just can't do it.  Or, more likely in the
first version, you just can't do it at all...  Doesn't seem that bad
to me.

BTW I will happily do the epoll->kqueue port work if necessary.



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I guess there are a couple of ways to do it if we give up the goal of
> no-code-change-for-the-client:
>
> 1.  Generalised PQsocket(), so that a client can call something like:
>
> int PQpollset(const PGConn *conn, struct pollfd fds[], int fds_size,
> int *nfds, int *timeout_ms);
>
> That way, libpq could tell you about which events it would like to
> wait for on which fds, and when it would like you to call it back due
> to timeout, and you can either pass that information directly to
> poll() or WSAPoll() or some equivalent interface (we don't care, we
> just gave you the info you need), or combine it in obvious ways with
> whatever else you want to multiplex with in your client program.

I absolutely wanted something like this while I was writing the code
(it would have made things much easier), but I'd feel bad adding that
much complexity to the API if the vast majority of connections use
exactly one socket. Are there other use cases in libpq where you think
this expanded API could be useful? Maybe to lift some of the existing
restrictions for PQconnectPoll(), add async DNS resolution, or
something?

Couple complications I can think of at the moment:
1. Clients using persistent pollsets will have to remove old
descriptors, presumably by tracking the delta since the last call,
which might make for a rough transition. Bookkeeping bugs probably
wouldn't show up unless they used OAuth in their test suites. With the
current model, that's more hidden and libpq takes responsibility for
getting it right.
2. In the future, we might need to think carefully around situations
where we want multiple PGConn handles to share descriptors (e.g.
multiplexed backend connections). I avoid tricky questions at the
moment by assigning only one connection per multi pool.

> 2.  Convert those events into new libpq events like 'I want you to
> call me back in 100ms', and 'call me back when socket #42 has data',
> and let clients handle that by managing their own poll set etc.  (This
> is something I've speculated about to support more efficient
> postgres_fdw shard query multiplexing; gotta figure out how to get
> multiple connections' events into one WaitEventSet...)

Something analogous to libcurl's socket and timeout callbacks [1],
then? Or is there an existing libpq API you were thinking about using?

> I guess there is a practical middle ground where client code on
> systems that have epoll/kqueue can use OAUTHBEARER without any code
> change, and the feature is available on other systems too but you'll
> have to change your client code to use one of those interfaces or else
> you get an error 'coz we just can't do it.

That's a possibility -- if your platform is able to do it nicely,
might as well use it. (In a similar vein, I'd personally vote against
having every platform use a background thread, even if we decided to
implement it for Windows.)

> Or, more likely in the
> first version, you just can't do it at all...  Doesn't seem that bad
> to me.

Any initial opinions on whether it's worse or better than a worker thread?

> BTW I will happily do the epoll->kqueue port work if necessary.

And I will happily take you up on that; thanks!

--Jacob

[1] https://curl.se/libcurl/c/CURLMOPT_SOCKETFUNCTION.html



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > 2.  Convert those events into new libpq events like 'I want you to
> > call me back in 100ms', and 'call me back when socket #42 has data',
> > and let clients handle that by managing their own poll set etc.  (This
> > is something I've speculated about to support more efficient
> > postgres_fdw shard query multiplexing; gotta figure out how to get
> > multiple connections' events into one WaitEventSet...)
>
> Something analogous to libcurl's socket and timeout callbacks [1],
> then? Or is there an existing libpq API you were thinking about using?

Yeah.  Libpq already has an event concept.  I did some work on getting
long-lived WaitEventSet objects to be used in various places, some of
which got committed[1], but not yet the parts related to postgres_fdw
(which uses libpq connections to talk to other PostgreSQL servers, and
runs into the limitations of PQsocket()).  Horiguchi-san had the good
idea of extending the event system to cover socket changes, but I
haven't actually tried it yet.  One day.

> > Or, more likely in the
> > first version, you just can't do it at all...  Doesn't seem that bad
> > to me.
>
> Any initial opinions on whether it's worse or better than a worker thread?

My vote is that it's perfectly fine to make a new feature that only
works on some OSes.  If/when someone wants to work on getting it going
on Windows/AIX/Solaris (that's the complete set of no-epoll, no-kqueue
OSes we target), they can write the patch.

[1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Jul 6, 2023 at 1:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> > Something analogous to libcurl's socket and timeout callbacks [1],
> > then? Or is there an existing libpq API you were thinking about using?
>
> Yeah.  Libpq already has an event concept.

Thanks -- I don't know how I never noticed libpq-events.h before.

Per-connection events (or callbacks) might bring up the same
chicken-and-egg situation discussed above, with the notice hook. We'll
be fine as long as PQconnectStart is guaranteed to return before the
PQconnectPoll engine gets to authentication, and it looks like that's
true with today's implementation, which returns pessimistically at
several points instead of just trying to continue the exchange. But I
don't know if that's intended as a guarantee for the future. At the
very least we would have to pin that implementation detail.

> > > Or, more likely in the
> > > first version, you just can't do it at all...  Doesn't seem that bad
> > > to me.
> >
> > Any initial opinions on whether it's worse or better than a worker thread?
>
> My vote is that it's perfectly fine to make a new feature that only
> works on some OSes.  If/when someone wants to work on getting it going
> on Windows/AIX/Solaris (that's the complete set of no-epoll, no-kqueue
> OSes we target), they can write the patch.

Okay. I'm curious to hear others' thoughts on that, too, if anyone's lurking.

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Thanks Jacob for making progress on this.

> 3) Is the current conn->async_auth() entry point sufficient for an
> application to implement the Microsoft flows discussed upthread?

Please confirm my understanding of the flow is correct:
1. Client calls PQconnectStart.
  - The client doesn't yet know what the issuer and the scope are.
  - Parameters are strings, so callback is not provided yet.
2. Client gets a PGconn from the PQconnectStart return value and updates
conn->async_auth to its own callback.
3. Client polls PQconnectPoll and checks conn->sasl_state until the
value is SASL_ASYNC.
4. Client accesses conn->oauth_issuer and conn->oauth_scope and uses
that info to trigger the token flow.
5. Expectations on async_auth:
    a. It returns PGRES_POLLING_READING while token acquisition is going on.
    b. It returns PGRES_POLLING_OK and sets conn->sasl_state->token
when token acquisition succeeds.
6. Is the client supposed to do anything with the altsock parameter?

Is the above an accurate understanding?

If yes, it looks workable with a couple of improvements I think would be nice:
1. Currently, oauth_exchange function sets conn->async_auth =
pg_fe_run_oauth_flow and starts Device Code flow automatically when
receiving challenge and metadata from the server.
    There probably should be a way for the client to prevent default
Device Code flow from triggering.
2. The current signature and expectations of the async_auth function
seem to be tightly coupled with the internal implementation:
    - Pieces of information need to be picked and updated in different
places in the PgConn structure.
    - Function is expected to return PostgresPollingStatusType which
is used to communicate internal state to the client.
   Would it make sense to separate the internal callback used to
communicate with Device Code flow from client facing API?
   I.e. introduce a new client facing structure and enum to facilitate
callback and its return value.

-----------
On a separate note:
The backend code currently spawns an external command for token validation.
As we discussed before, an extension hook would be a more efficient
extensibility option.
We see clients make 10k+ connections using OAuth tokens per minute to
our service, and starting external processes would be too much overhead
here.

-----------

> 5) Does this maintenance tradeoff (full control over the client vs. a
> large amount of RFC-governed code) seem like it could be okay?

It's nice for psql to have the Device Code flow, which can be made even
more convenient with refresh token support. It also lets clients on
resource-constrained devices authenticate with Client Credentials (app
secret) without bringing in more dependencies.

In most other cases, upstream PostgreSQL drivers written in higher
level languages have libraries / abstractions to implement OAUTH flows
for the platforms they support.

On Fri, Jul 7, 2023 at 11:48 AM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Thu, Jul 6, 2023 at 1:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> > > Something analogous to libcurl's socket and timeout callbacks [1],
> > > then? Or is there an existing libpq API you were thinking about using?
> >
> > Yeah.  Libpq already has an event concept.
>
> Thanks -- I don't know how I never noticed libpq-events.h before.
>
> Per-connection events (or callbacks) might bring up the same
> chicken-and-egg situation discussed above, with the notice hook. We'll
> be fine as long as PQconnectStart is guaranteed to return before the
> PQconnectPoll engine gets to authentication, and it looks like that's
> true with today's implementation, which returns pessimistically at
> several points instead of just trying to continue the exchange. But I
> don't know if that's intended as a guarantee for the future. At the
> very least we would have to pin that implementation detail.
>
> > > > Or, more likely in the
> > > > first version, you just can't do it at all...  Doesn't seem that bad
> > > > to me.
> > >
> > > Any initial opinions on whether it's worse or better than a worker thread?
> >
> > My vote is that it's perfectly fine to make a new feature that only
> > works on some OSes.  If/when someone wants to work on getting it going
> > on Windows/AIX/Solaris (that's the complete set of no-epoll, no-kqueue
> > OSes we target), they can write the patch.
>
> Okay. I'm curious to hear others' thoughts on that, too, if anyone's lurking.
>
> Thanks!
> --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > BTW I will happily do the epoll->kqueue port work if necessary.
>
> And I will happily take you up on that; thanks!

Some initial hacking, about 2 coffees' worth:
https://github.com/macdice/postgres/commits/oauth-kqueue

This compiles on FreeBSD and macOS, but I didn't have time to figure
out all your Python testing magic so I don't know if it works yet and
it's still red on CI...  one thing I wondered about is the *altsock =
timerfd part which I couldn't do.

The situation on macOS is a little odd: the man page says EVFILT_TIMER
is not implemented.  But clearly it is, we can read the source code as
I had to do to find out which unit of time it defaults to[1] (huh,
Apple's github repo for Darwin appears to have been archived recently
-- no more source code updates?  that'd be a shame!), and it works
exactly as expected in simple programs.  So I would just assume it
works until we see evidence otherwise.  (We already use a couple of
other things on macOS more or less by accident because configure finds
them, where they are undocumented or undeclared.)

[1] https://github.com/apple/darwin-xnu/blob/main/bsd/kern/kern_event.c#L1345



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Jul 7, 2023 at 2:16 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> Please confirm my understanding of the flow is correct:
> 1. Client calls PQconnectStart.
>   - The client doesn't yet know what the issuer and the scope are.

Right. (Strictly speaking it doesn't even know that OAuth will be used
for the connection, yet, though at some point we'll be able to force
the issue with e.g. `require_auth=oauth`. That's not currently
implemented.)

>   - Parameters are strings, so callback is not provided yet.
> 2. Client gets a PGconn from the PQconnectStart return value and updates
> conn->async_auth to its own callback.

This is where some sort of official authn callback registration (see
above reply to Daniele) would probably come in handy.

> 3. Client polls PQconnectPoll and checks conn->sasl_state until the
> value is SASL_ASYNC

In my head, the client's custom callback would always be invoked
during the call to PQconnectPoll, rather than making the client do
work in between calls. That way, a client can use custom flows even
with a synchronous PQconnectdb().

> 4. Client accesses conn->oauth_issuer and conn->oauth_scope and uses
> that info to trigger the token flow.

Right.

> 5. Expectations on async_auth:
>     a. It returns PGRES_POLLING_READING while token acquisition is going on
>     b. It returns PGRES_POLLING_OK and sets conn->sasl_state->token
> when token acquisition succeeds.

Yes. Though the token should probably be returned through some
explicit part of the callback, now that you mention it...

> 6. Is the client supposed to do anything with the altsock parameter?

The callback needs to set the altsock up with a select()able
descriptor, which wakes up the client when more work is ready to be
done. Without that, you can't handle multiple connections on a single
thread.

> If yes, it looks workable with a couple of improvements I think would be nice:
> 1. Currently, oauth_exchange function sets conn->async_auth =
> pg_fe_run_oauth_flow and starts Device Code flow automatically when
> receiving challenge and metadata from the server.
>     There probably should be a way for the client to prevent default
> Device Code flow from triggering.

Agreed. I'd like the client to be able to override this directly.

> 2. The current signature and expectations of the async_auth function
> seem to be tightly coupled with the internal implementation:
>     - Pieces of information need to be picked and updated in different
> places in the PgConn structure.
>     - Function is expected to return PostgresPollingStatusType which
> is used to communicate internal state to the client.
>    Would it make sense to separate the internal callback used to
> communicate with Device Code flow from client facing API?
>    I.e. introduce a new client facing structure and enum to facilitate
> callback and its return value.

Yep, exactly right! I just wanted to check that the architecture
*looked* sufficient before pulling it up into an API.

> On a separate note:
> The backend code currently spawns an external command for token validation.
> As we discussed before, an extension hook would be a more efficient
> extensibility option.
> We see clients make 10k+ connections using OAuth tokens per minute to
> our service, and starting external processes would be too much overhead
> here.

+1. I'm curious, though -- what language do you expect to use to write
a production validator hook? Surely not low-level C...?

> > 5) Does this maintenance tradeoff (full control over the client vs. a
> > large amount of RFC-governed code) seem like it could be okay?
>
> It's nice for psql to have the Device Code flow, which can be made even
> more convenient with refresh token support. It also lets clients on
> resource-constrained devices authenticate with Client Credentials (app
> secret) without bringing in more dependencies.
>
> In most other cases, upstream PostgreSQL drivers written in higher
> level languages have libraries / abstractions to implement OAUTH flows
> for the platforms they support.

Yeah, I'm really interested in seeing which existing high-level flows
can be mixed in through a driver. Trying not to get too far ahead of
myself :D

Thanks for the review!

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Jul 7, 2023 at 6:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> > On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > > BTW I will happily do the epoll->kqueue port work if necessary.
> >
> > And I will happily take you up on that; thanks!
>
> Some initial hacking, about 2 coffees' worth:
> https://github.com/macdice/postgres/commits/oauth-kqueue
>
> This compiles on FreeBSD and macOS, but I didn't have time to figure
> out all your Python testing magic so I don't know if it works yet and
> it's still red on CI...

This is awesome, thank you!

I need to look into the CI more, but it looks like the client tests
are passing, which is a good sign. (I don't understand why the
server-side tests are failing on FreeBSD, but they shouldn't be using
the libpq code at all, so I think your kqueue implementation is in the
clear. Cirrus doesn't have the logs from the server-side test failures
anywhere -- probably a bug in my Meson patch.)

> one thing I wondered about is the *altsock =
> timerfd part which I couldn't do.

I did that because I'm not entirely sure that libcurl is guaranteed to
have cleared out all its sockets from the mux, and I didn't want to
invite spurious wakeups. I should probably verify whether or not
that's possible. If so, we could just make that code resilient to
early wakeup, so that it matters less, or set up a second kqueue that
only holds the timer if that turns out to be unacceptable?

> The situation on macOS is a little odd: the man page says EVFILT_TIMER
> is not implemented.  But clearly it is, we can read the source code as
> I had to do to find out which unit of time it defaults to[1] (huh,
> Apple's github repo for Darwin appears to have been archived recently
> -- no more source code updates?  that'd be a shame!), and it works
> exactly as expected in simple programs.  So I would just assume it
> works until we see evidence otherwise.  (We already use a couple of
> other things on macOS more or less by accident because configure finds
> them, where they are undocumented or undeclared.)

Huh. Something to keep an eye on... might be a problem with older versions?

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Jul 10, 2023 at 4:50 PM Jacob Champion <jchampion@timescale.com> wrote:
> I don't understand why the
> server-side tests are failing on FreeBSD, but they shouldn't be using
> the libpq code at all, so I think your kqueue implementation is in the
> clear.

Oh, whoops, it's just the missed CLOEXEC flag in the final patch. (If
the write side of the pipe gets copied around, it hangs open and the
validator never sees the "end" of the token.) I'll switch the logic
around to set the flag on the write side instead of unsetting it on
the read side.

I have a WIP patch that passes tests on FreeBSD, which I'll clean up
and post Sometime Soon. macOS builds now but still fails before it
runs the test; looks like it's having trouble finding OpenSSL during
`pip install` of the test modules...

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Wed, Jul 12, 2023 at 5:50 AM Jacob Champion <jchampion@timescale.com> wrote:
> Oh, whoops, it's just the missed CLOEXEC flag in the final patch. (If
> the write side of the pipe gets copied around, it hangs open and the
> validator never sees the "end" of the token.) I'll switch the logic
> around to set the flag on the write side instead of unsetting it on
> the read side.

Oops, sorry about that.  Glad to hear it's all working!

(FTR my parenthetical note about macOS/XNU sources on Github was a
false alarm: the "apple" account has stopped publishing a redundant
copy of that, but "apple-oss-distributions" is the account I should
have been looking at and it is live.  I guess it migrated at some
point, or something.  Phew.)



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> >   - Parameters are strings, so callback is not provided yet.
> > 2. Client gets a PGconn from the PQconnectStart return value and updates
> > conn->async_auth to its own callback.
>
> This is where some sort of official authn callback registration (see
> above reply to Daniele) would probably come in handy.
+1

> > 3. Client polls PQconnectPoll and checks conn->sasl_state until the
> > value is SASL_ASYNC
>
> In my head, the client's custom callback would always be invoked
> during the call to PQconnectPoll, rather than making the client do
> work in between calls. That way, a client can use custom flows even
> with a synchronous PQconnectdb().
The way I see this API working, an asynchronous client needs at least two
PQconnectPoll calls:
1. One to be notified of the authentication requirements and get the
parameters.
2. One after it acquires the token, when the callback is used to hand the
token to libpq and return PGRES_POLLING_OK.

For a synchronous client, the callback implementation would need to be
aware that the synchronous path invokes the callback frequently, and be
implemented accordingly.

Bottom line, I don't see much of a problem with the current proposal. It's
just that the callback's way of learning that an OAuth token is requested,
and of getting the parameters, relies on PQconnectPoll being invoked after
the corresponding fields of the conn object are populated.

> > > 5. Expectations on async_auth:
> > >     a. It returns PGRES_POLLING_READING while token acquisition is going on
> > >     b. It returns PGRES_POLLING_OK and sets conn->sasl_state->token
> > > when token acquisition succeeds.
> >
> > Yes. Though the token should probably be returned through some
> > explicit part of the callback, now that you mention it...
>
> > 6. Is the client supposed to do anything with the altsock parameter?
>
> The callback needs to set the altsock up with a select()able
> descriptor, which wakes up the client when more work is ready to be
> done. Without that, you can't handle multiple connections on a single
> thread.

Ok, thanks for clarification.

> > On a separate note:
> > The backend code currently spawns an external command for token validation.
> > As we discussed before, an extension hook would be a more efficient
> > extensibility option.
> > We see clients make 10k+ connections using OAuth tokens per minute to
> > our service, and stating external processes would be too much overhead
> > here.

> +1. I'm curious, though -- what language do you expect to use to write
> a production validator hook? Surely not low-level C...?

For the server-side code, it would likely be identity providers publishing extensions to validate their tokens.
Those can be written in C too, or extensions can now be implemented in Rust using pgrx, which is developer-friendly enough in my opinion.

> Yeah, I'm really interested in seeing which existing high-level flows
> can be mixed in through a driver. Trying not to get too far ahead of
> myself :D

I can think of the following as the most common:
1. Authorization code with PKCE. This is by far the most common for user login flows. It requires spinning up a browser and listening on a redirect URL/port. Most high-level platforms have libraries to do both.
2. Client certificates. This requires an identity-provider-specific library to construct and sign the token. The providers publish SDKs to do that for most common app development platforms.
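For flow 1, the PKCE pieces a driver-side implementation has to produce are small and well-specified (RFC 7636, S256 method); a minimal sketch, independent of any particular OAuth library:

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    """Generate the code_verifier/code_challenge pair used by the
    Authorization Code + PKCE flow (RFC 7636, S256 method)."""
    # 32 random bytes, base64url-encoded without padding, gives a
    # 43-character verifier (RFC 7636 requires 43..128 characters).
    verifier = base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode()
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode()
    return verifier, challenge

verifier, challenge = make_pkce_pair()
# The challenge is sent with the authorization request; the verifier is
# sent later with the token request so the server can correlate them.
assert 43 <= len(verifier) <= 128
```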

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Jul 11, 2023 at 10:50 AM Jacob Champion
<jchampion@timescale.com> wrote:
> I have a WIP patch that passes tests on FreeBSD, which I'll clean up
> and post Sometime Soon. macOS builds now but still fails before it
> runs the test; looks like it's having trouble finding OpenSSL during
> `pip install` of the test modules...

Hi Thomas,

v9 folds in your kqueue implementation (thanks again!) and I have a
quick question to check my understanding:

> +       case CURL_POLL_REMOVE:
> +           /*
> +            * We don't know which of these is currently registered, perhaps
> +            * both, so we try to remove both.  This means we need to tolerate
> +            * ENOENT below.
> +            */
> +           EV_SET(&ev[nev], socket, EVFILT_READ, EV_DELETE, 0, 0, 0);
> +           nev++;
> +           EV_SET(&ev[nev], socket, EVFILT_WRITE, EV_DELETE, 0, 0, 0);
> +           nev++;
> +           break;

We're not setting EV_RECEIPT for these -- is that because none of the
filters we're using are EV_CLEAR, and so it doesn't matter if we
accidentally pull pending events off the queue during the kevent() call?

v9 also improves the Cirrus debugging experience and fixes more issues
on macOS, so the tests should be green there now. The final patch in the
series works around what I think is a build bug in psycopg2 2.9 [1] for
the BSDs+meson.

Thanks,
--Jacob

[1] https://github.com/psycopg/psycopg2/issues/1599

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Tue, Jul 18, 2023 at 11:55 AM Jacob Champion <jchampion@timescale.com> wrote:
> We're not setting EV_RECEIPT for these -- is that because none of the
> filters we're using are EV_CLEAR, and so it doesn't matter if we
> accidentally pull pending events off the queue during the kevent() call?

+1 for EV_RECEIPT ("just tell me about errors, don't drain any
events").  I had a vague memory that it caused portability problems.
Just checked... it was OpenBSD I was thinking of, but they finally
added that flag in 6.2 (2017).  Our older-than-that BF OpenBSD animal
recently retired so that should be fine.  (Yes, without EV_CLEAR it's
"level triggered" not "edge triggered" in epoll terminology, so the
way I had it was not broken, but the way you're suggesting would be
nicer.)  Note that you'll have to skip data == 0 (no error) too.

+ #ifdef HAVE_SYS_EVENT_H
+ /* macOS doesn't define the time unit macros, but uses milliseconds
by default. */
+ #ifndef NOTE_MSECONDS
+ #define NOTE_MSECONDS 0
+ #endif
+ #endif

While comparing the cousin OSs' man pages just now, I noticed that
it's not only macOS that lacks NOTE_MSECONDS, it's also OpenBSD and
NetBSD < 10.  Maybe just delete that cruft ^^^ and use literal 0 in
fflags directly.  FreeBSD, and recently also NetBSD, decided to get
fancy with high resolution timers, but 0 gets the traditional unit of
milliseconds on all platforms (I just wrote it like that because I
started from FreeBSD and didn't know the history/portability story).



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Jul 18, 2023 at 4:04 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Jul 18, 2023 at 11:55 AM Jacob Champion <jchampion@timescale.com> wrote:
> +1 for EV_RECEIPT ("just tell me about errors, don't drain any
> events").

Sounds good.

> While comparing the cousin OSs' man pages just now, I noticed that
> it's not only macOS that lacks NOTE_MSECONDS, it's also OpenBSD and
> NetBSD < 10.  Maybe just delete that cruft ^^^ and use literal 0 in
> fflags directly.

So I don't lose track of it, here's a v10 with those two changes.

Thanks!
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
v11 is a quick rebase over the recent Cirrus changes, and I've dropped
0006 now that psycopg2 can build against BSD/Meson setups (thanks Daniele!).

--Jacob
Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
v12 implements a first draft of a client hook, so applications can
replace either the device prompt or the entire OAuth flow. (Andrey and
Mahendrakar: hopefully this is close to what you need.) It also cleans
up some of the JSON tech debt.

Since (IMO) we don't want to introduce new hooks every time we make
improvements to the internal flows, the new hook is designed to
retrieve multiple pieces of data from the application. Clients either
declare their ability to get that data, or delegate the job to the
next link in the chain, which by default is a no-op. That lets us add
new data types to the end, and older clients will ignore them until
they're taught otherwise. (I'm trying hard not to over-engineer this,
but it seems like the concept of "give me some piece of data to
continue authenticating" could pretty easily subsume things like the
PQsslKeyPassHook if we wanted.)

The PQAUTHDATA_OAUTH_BEARER_TOKEN case is the one that replaces the
flow entirely, as discussed upthread. Your application gets the
discovery URI and the requested scope for the connection. It can then
either delegate back to libpq (e.g. if the issuer isn't one it can
help with), immediately return a token (e.g. if one is already cached
for the current user), or install a nonblocking callback to implement
a custom async flow. When the connection is closed (or fails), the
hook provides a cleanup function to free any resources it may have
allocated.
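The three outcomes described above (delegate, return a cached token, or run a custom async flow) can be modeled roughly as follows; all names here are hypothetical stand-ins, not the actual PQauthDataHook signatures:

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union

@dataclass
class OAuthRequest:
    discovery_uri: str
    scope: str

# Sentinel mirroring "let libpq's builtin flow handle it".
DELEGATE = "delegate"

# A hypothetical application-side token cache, keyed by issuer.
token_cache = {"https://issuer.example.org/.well-known/openid-configuration": "cached-token"}

def bearer_token_hook(req: OAuthRequest) -> Union[str, Callable[[], Optional[str]]]:
    """Model of the three choices a PQAUTHDATA_OAUTH_BEARER_TOKEN
    handler has: immediate token, delegation, or a custom flow."""
    if req.discovery_uri in token_cache:
        return token_cache[req.discovery_uri]  # token already at hand
    if "example.org" not in req.discovery_uri:
        return DELEGATE                        # unknown issuer: let libpq try
    return start_custom_flow(req)              # run our own nonblocking flow

def start_custom_flow(req: OAuthRequest) -> Callable[[], Optional[str]]:
    """Return a poll function; None means 'not done yet'."""
    state = {"polls": 0}
    def poll():
        state["polls"] += 1
        return "fresh-token" if state["polls"] >= 2 else None
    return poll
```

The custom-flow case is the interesting one: the hook returns something pollable, the application pumps it without blocking, and cleanup happens when the connection closes.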

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Shlok Kyal
Date:
Hi,

On Fri, 3 Nov 2023 at 17:14, Jacob Champion <jchampion@timescale.com> wrote:
>
> v12 implements a first draft of a client hook, so applications can
> replace either the device prompt or the entire OAuth flow. (Andrey and
> Mahendrakar: hopefully this is close to what you need.) It also cleans
> up some of the JSON tech debt.

I went through CFbot and found that build is failing, links:

https://cirrus-ci.com/task/6061898244816896
https://cirrus-ci.com/task/6624848198238208
https://cirrus-ci.com/task/5217473314684928
https://cirrus-ci.com/task/6343373221527552

Just want to make sure you are aware of these failures.

Thanks,
Shlok Kumar Kyal



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Nov 3, 2023 at 5:28 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> Just want to make sure you are aware of these failures.

Thanks for the nudge! Looks like I need to reconcile with the changes
to JsonLexContext in 1c99cde2. I should be able to get to that next
week; in the meantime I'll mark it Waiting on Author.

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Nov 3, 2023 at 4:48 PM Jacob Champion <champion.p@gmail.com> wrote:
> Thanks for the nudge! Looks like I need to reconcile with the changes
> to JsonLexContext in 1c99cde2. I should be able to get to that next
> week; in the meantime I'll mark it Waiting on Author.

v13 rebases over latest. The JsonLexContext changes have simplified
0001 quite a bit, and there's probably a bit more minimization that
could be done.

Unfortunately the configure/Makefile build of libpq now seems to be
pulling in an `exit()` dependency in a way that Meson does not. (Or
maybe Meson isn't checking?) I still need to investigate that
difference and fix it, so I recommend Meson if you're looking to
test-drive a build.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Hi Jacob,

Wanted to follow up on one of the topics discussed here in the past:
Do you plan to support adding an extension hook to validate the token?

It would allow a more efficient integration than spinning up a separate process.

Thanks!
Andrey.

On Wed, Nov 8, 2023 at 11:00 AM Jacob Champion <champion.p@gmail.com> wrote:
On Fri, Nov 3, 2023 at 4:48 PM Jacob Champion <champion.p@gmail.com> wrote:
> Thanks for the nudge! Looks like I need to reconcile with the changes
> to JsonLexContext in 1c99cde2. I should be able to get to that next
> week; in the meantime I'll mark it Waiting on Author.

v13 rebases over latest. The JsonLexContext changes have simplified
0001 quite a bit, and there's probably a bit more minimization that
could be done.

Unfortunately the configure/Makefile build of libpq now seems to be
pulling in an `exit()` dependency in a way that Meson does not. (Or
maybe Meson isn't checking?) I still need to investigate that
difference and fix it, so I recommend Meson if you're looking to
test-drive a build.

Thanks,
--Jacob

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Nov 9, 2023 at 5:43 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> Do you plan to support adding an extension hook to validate the token?
>
> It would allow a more efficient integration than spinning up a separate process.

I think an API in the style of archive modules would probably be a
good way to go, yeah.

It's probably not very high on the list of priorities, though, since
the inputs and outputs are going to "look" the same whether you're
inside or outside of the server process. The client side is going to
need the bulk of the work/testing/validation. Speaking of which -- how
is the current PQauthDataHook design doing when paired with MS AAD
(er, Entra now I guess)? I haven't had an Azure test bed available for
a while.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 8 Nov 2023, at 20:00, Jacob Champion <champion.p@gmail.com> wrote:

> Unfortunately the configure/Makefile build of libpq now seems to be
> pulling in an `exit()` dependency in a way that Meson does not.

I believe this comes from the libcurl and specifically the ntlm_wb support
which is often enabled in system and package manager provided installations.
There isn't really a fix here apart from requiring a libcurl not compiled with
ntlm_wb support, or adding an exception to the exit() check in the Makefile.

Bringing this up with other curl developers to see if it could be fixed we
instead decided to deprecate the whole module as it's quirky and not used much.
This won't help with existing installations but at least it will be deprecated
and removed by the time v17 ships, so gating on a version shipped after its
removal will avoid it.

https://github.com/curl/curl/commit/04540f69cfd4bf16e80e7c190b645f1baf505a84

> (Or maybe Meson isn't checking?) I still need to investigate that
> difference and fix it, so I recommend Meson if you're looking to
> test-drive a build.

There is no corresponding check in the Meson build, which seems like a TODO.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Dec 5, 2023 at 1:44 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 8 Nov 2023, at 20:00, Jacob Champion <champion.p@gmail.com> wrote:
>
> > Unfortunately the configure/Makefile build of libpq now seems to be
> > pulling in an `exit()` dependency in a way that Meson does not.
>
> I believe this comes from the libcurl and specifically the ntlm_wb support
> which is often enabled in system and package manager provided installations.
> There isn't really a fix here apart from requiring a libcurl not compiled with
> ntlm_wb support, or adding an exception to the exit() check in the Makefile.
>
> Bringing this up with other curl developers to see if it could be fixed we
> instead decided to deprecate the whole module as it's quirky and not used much.
> This won't help with existing installations but at least it will be deprecated
> and removed by the time v17 ships, so gating on a version shipped after its
> removal will avoid it.
>
> https://github.com/curl/curl/commit/04540f69cfd4bf16e80e7c190b645f1baf505a84

Ooh, thank you for looking into that and fixing it!

> > (Or maybe Meson isn't checking?) I still need to investigate that
> > difference and fix it, so I recommend Meson if you're looking to
> > test-drive a build.
>
> There is no corresponding check in the Meson build, which seems like a TODO.

Okay, I'll look into that too when I get time.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
Hi all,

v14 rebases over latest and fixes a warning when assertions are
disabled. 0006 is temporary and hacks past a couple of issues I have
not yet root caused -- one of which makes me wonder if 0001 needs to
be considered alongside the recent pg_combinebackup and incremental
JSON work...?

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Feb 20, 2024 at 5:00 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> v14 rebases over latest and fixes a warning when assertions are
> disabled.

v15 is a housekeeping update that adds typedefs.list entries and runs
pgindent. It also includes a temporary patch from Daniel to get the
cfbot a bit farther (see above discussion on libcurl/exit).

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 22, 2024 at 6:08 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> v15 is a housekeeping update that adds typedefs.list entries and runs
> pgindent.

v16 is more transformational!

Daniel contributed 0004, which completely replaces the
validator_command architecture with a C module API. This solves a
bunch of problems as discussed upthread and vastly simplifies the test
framework for the server side. 0004 also adds a set of Perl tests,
which will begin to subsume some of the Python server-side tests as I
get around to porting them. (@Daniel: 0005 is my diff against your
original patch, for review.)

0008 has been modified to quickfix the pgcommon linkage on the
Makefile side; my previous attempt at this only fixed Meson. The
patchset is now carrying a lot of squash-cruft, and I plan to flatten
it in the next version.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Feb 27, 2024 at 11:20 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> This is done in v17, which is also now based on the two patches pulled
> out by Daniel in [1].

It looks like my patchset has been eaten by a malware scanner:

    550 Message content failed content scanning
(Sanesecurity.Foxhole.Mail_gz.UNOFFICIAL)

Was there a recent change to the lists? Is anyone able to see what the
actual error was so I don't do it again?

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
[Trying again, with all patches unzipped and the CC list temporarily
removed to avoid flooding people's inboxes. Original message follows.]

On Fri, Feb 23, 2024 at 5:01 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> The
> patchset is now carrying a lot of squash-cruft, and I plan to flatten
> it in the next version.

This is done in v17, which is also now based on the two patches pulled
out by Daniel in [1]. Besides the squashes, which make up most of the
range-diff, I've fixed a call to strncasecmp() which is not available
on Windows.

Daniel and I discussed trying a Python version of the test server,
since the standard library there should give us more goodies to work
with. A proof of concept is in 0009. I think the big question I have
for it is, how would we communicate what we want the server to do for
the test? (We could perhaps switch on magic values of the client ID?)
In the end I'd like to be testing close to 100% of the failure modes,
and that's likely to mean a lot of back-and-forth if the server
implementation isn't in the Perl process.
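The "magic client ID" idea could look something like the sketch below: the test server dispatches on well-known client IDs to simulate specific provider misbehaviors. Everything here (the IDs, the response shapes) is hypothetical, just to make the mechanism concrete:

```python
import json

# Hypothetical mapping from magic client IDs to the misbehavior the
# test server should simulate for that request.
BEHAVIORS = {
    "test-malformed-json": lambda: (200, "{not json"),
    "test-server-error":   lambda: (500, json.dumps({"error": "server_error"})),
}

def device_authorization_response(client_id):
    """Return (status, body) for a device authorization request,
    switching on magic client_id values to exercise failure modes."""
    if client_id in BEHAVIORS:
        return BEHAVIORS[client_id]()
    # Happy path: a normal RFC 8628-style device authorization response.
    return 200, json.dumps({
        "device_code": "DEV-123",
        "user_code": "FPQ2-M4BG",
        "verification_uri": "https://oauth.example.org/login",
        "expires_in": 300,
    })
```

The appeal is that the Perl test only has to pick a client ID per test case, with no other out-of-band channel to the Python process.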

--Jacob

[1] https://postgr.es/m/flat/F51F8777-FAF5-49F2-BC5E-8F9EB423ECE0%40yesql.se

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 28 Feb 2024, at 15:05, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> [Trying again, with all patches unzipped and the CC list temporarily
> removed to avoid flooding people's inboxes. Original message follows.]
>
> On Fri, Feb 23, 2024 at 5:01 PM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> The
>> patchset is now carrying a lot of squash-cruft, and I plan to flatten
>> it in the next version.
>
> This is done in v17, which is also now based on the two patches pulled
> out by Daniel in [1]. Besides the squashes, which make up most of the
> range-diff, I've fixed a call to strncasecmp() which is not available
> on Windows.
>
> Daniel and I discussed trying a Python version of the test server,
> since the standard library there should give us more goodies to work
> with. A proof of concept is in 0009. I think the big question I have
> for it is, how would we communicate what we want the server to do for
> the test? (We could perhaps switch on magic values of the client ID?)
> In the end I'd like to be testing close to 100% of the failure modes,
> and that's likely to mean a lot of back-and-forth if the server
> implementation isn't in the Perl process.

Thanks for the new version, I'm digesting the test patches but for now I have a
few smaller comments:


+#define ALLOC(size) malloc(size)
I wonder if we should use pg_malloc_extended(size, MCXT_ALLOC_NO_OOM) instead
to self document the code.  We clearly don't want feature-parity with server-
side palloc here.  I know we use malloc in similar ALLOC macros so it's not
unique in that regard, but maybe?


+#ifdef FRONTEND
+               destroyPQExpBuffer(lex->errormsg);
+#else
+               pfree(lex->errormsg->data);
+               pfree(lex->errormsg);
+#endif
Wouldn't it be nicer if we abstracted this into a destroyStrVal function to a)
avoid the ifdefs and b) make it more like the rest of the new API?  While it's
only used in two places (close to each other) it's a shame to let the
underlying API bleed through the abstraction.


+   CURLM      *curlm;      /* top-level multi handle for cURL operations */
Nitpick, but curl is not capitalized cURL anymore (for some value of "anymore"
since it changed in 2016 [0]).  I do wonder if we should consistently write
"libcurl" as well since we don't use curl but libcurl.


+   PQExpBufferData     work_data;  /* scratch buffer for general use (remember
+                                      to clear out prior contents first!) */
This seems like asking for subtle bugs due to uncleared buffers bleeding into
another operation (especially since we are writing this data across the wire).
How about having an array the size of OAuthStep of unallocated buffers where
each step uses its own?  Storing the content of each step could also be useful
for debugging.  Looking at the state machine here it's not an obvious change but
also not impossible.


+ * TODO: This disables DNS resolution timeouts unless libcurl has been
+ * compiled against alternative resolution support. We should check that.
curl_version_info() can be used to check for c-ares support.


+ * so you don't have to write out the error handling every time. They assume
+ * that they're embedded in a function returning bool, however.
It feels a bit iffy to encode the return type in the macro, we can use the same
trick that DISABLE_SIGPIPE employs where a failaction is passed in.


+  if (!strcmp(name, field->name))
Project style is to test for (strcmp(x,y) == 0) rather than (!strcmp()) to
improve readability.


+  libpq_append_conn_error(conn, "out of memory");
While not introduced in this patch, it's not an ideal pattern to report "out of
memory" errors via a function which may allocate memory.


+  appendPQExpBufferStr(&conn->errorMessage,
+           libpq_gettext("server's error message contained an embedded NULL"));
We should maybe add ", discarding" or something similar after this string to
indicate that there was an actual error which has been thrown away, the error
wasn't that the server passed an embedded NULL.


+#ifdef USE_OAUTH
+       else if (strcmp(mechanism_buf.data, OAUTHBEARER_NAME) == 0 &&
+               !selected_mechanism)
I wonder if we instead should move the guards inside the statement and error
out with "not built with OAuth support" or something similar like how we do
with TLS and other optional components?


+   errdetail("Comma expected, but found character %s.",
+             sanitize_char(*p))));
The %s formatter should be wrapped like '%s' to indicate that the message part
is the character in question (and we can then reuse the translation since the
error message already exist for SCRAM).


+       temp = curl_slist_append(temp, "authorization_code");
+       if (!temp)
+           oom = true;
+
+       temp = curl_slist_append(temp, "implicit");
While not a bug per se, it reads a bit odd to call another operation that can
allocate memory when the oom flag has been set.  I think we can move some
things around a little to make it clearer.

The attached diff contains some (most?) of the above as a patch on top of your
v17, but as a .txt to keep the CFBot from munging on it.

--
Daniel Gustafsson


Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrew Dunstan
Date:
On 2024-02-28 We 09:05, Jacob Champion wrote:
>
> Daniel and I discussed trying a Python version of the test server,
> since the standard library there should give us more goodies to work
> with. A proof of concept is in 0009. I think the big question I have
> for it is, how would we communicate what we want the server to do for
> the test? (We could perhaps switch on magic values of the client ID?)
> In the end I'd like to be testing close to 100% of the failure modes,
> and that's likely to mean a lot of back-and-forth if the server
> implementation isn't in the Perl process.



Can you give some more details about what this python gadget would buy 
us? I note that there are a couple of CPAN modules that provide OAuth2 
servers, not sure if they would be of any use.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 28 Feb 2024, at 22:50, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2024-02-28 We 09:05, Jacob Champion wrote:
>>
>> Daniel and I discussed trying a Python version of the test server,
>> since the standard library there should give us more goodies to work
>> with. A proof of concept is in 0009. I think the big question I have
>> for it is, how would we communicate what we want the server to do for
>> the test? (We could perhaps switch on magic values of the client ID?)
>> In the end I'd like to be testing close to 100% of the failure modes,
>> and that's likely to mean a lot of back-and-forth if the server
>> implementation isn't in the Perl process.
>
> Can you give some more details about what this python gadget would buy us? I
> note that there are a couple of CPAN modules that provide OAuth2 servers, not
> sure if they would be of any use.

The main benefit would be to be able to provide a full test harness without
adding any additional dependencies over what we already have (Python being
required by meson).  That should ideally make it easy to get good coverage from
BF animals as no installation is needed.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
[re-adding the CC list I dropped earlier]

On Wed, Feb 28, 2024 at 1:52 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 28 Feb 2024, at 22:50, Andrew Dunstan <andrew@dunslane.net> wrote:
> > Can you give some more details about what this python gadget would buy us?
> > I note that there are a couple of CPAN modules that provide OAuth2 servers,
> > not sure if they would be of any use.
>
> The main benefit would be to be able to provide a full test harness without
> adding any additional dependencies over what we already have (Python being
> required by meson).  That should ideally make it easy to get good coverage from
> BF animals as no installation is needed.

As an additional note, the test suite ideally needs to be able to
exercise failure modes where the provider itself is malfunctioning. So
we hand-roll responses rather than deferring to an external
OAuth/OpenID implementation, which adds HTTP and JSON dependencies at
minimum, and Python includes both. See also the discussion with
Stephen upthread [1].

(I do think it'd be nice to eventually include a prepackaged OAuth
server in the test suite, to stack coverage for the happy path and
further test interoperability.)

Thanks,
--Jacob

[1] https://postgr.es/m/CAAWbhmh%2B6q4t3P%2BwDmS%3DJuHBpcgF-VM2cXNft8XV02yk-cHCpQ%40mail.gmail.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 27 Feb 2024, at 20:20, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> On Fri, Feb 23, 2024 at 5:01 PM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> The
>> patchset is now carrying a lot of squash-cruft, and I plan to flatten
>> it in the next version.
>
> This is done in v17, which is also now based on the two patches pulled
> out by Daniel in [1]. Besides the squashes, which make up most of the
> range-diff, I've fixed a call to strncasecmp() which is not available
> on Windows.

Two quick questions:

+   /* TODO */
+   CHECK_SETOPT(actx, CURLOPT_WRITEDATA, stderr);
I might be missing something, but what this is intended for in
setup_curl_handles()?


--- /dev/null
+++ b/src/interfaces/libpq/fe-auth-oauth-iddawc.c
As discussed off-list I think we should leave iddawc support for later and
focus on getting one library properly supported to start with.  If you agree,
let's drop this from the patchset to make it easier to digest.  We should make
sure we keep pluggability such that another library can be supported though,
much like the libpq TLS support.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Feb 28, 2024 at 9:40 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> +#define ALLOC(size) malloc(size)
> I wonder if we should use pg_malloc_extended(size, MCXT_ALLOC_NO_OOM) instead
> to self document the code.  We clearly don't want feature-parity with server-
> side palloc here.  I know we use malloc in similar ALLOC macros so it's not
> unique in that regard, but maybe?

I have a vague recollection that linking fe_memutils into libpq
tripped the exit() checks, but I can try again and see.

> +#ifdef FRONTEND
> +               destroyPQExpBuffer(lex->errormsg);
> +#else
> +               pfree(lex->errormsg->data);
> +               pfree(lex->errormsg);
> +#endif
> Wouldn't it be nicer if we abstracted this into a destroyStrVal function to a)
> avoid the ifdefs and b) make it more like the rest of the new API?  While it's
> only used in two places (close to each other) it's a shame to let the
> underlying API bleed through the abstraction.

Good idea. I'll fold this from your patch into the next set (and do
the same for the ones I've marked +1 below).

> +   CURLM      *curlm;      /* top-level multi handle for cURL operations */
> Nitpick, but curl is not capitalized cURL anymore (for some value of "anymore"
> since it changed in 2016 [0]).  I do wonder if we should consistently write
> "libcurl" as well since we don't use curl but libcurl.

Huh, I missed that memo. Thanks -- that makes it much easier to type!

> +   PQExpBufferData     work_data;  /* scratch buffer for general use (remember
> +                                      to clear out prior contents first!) */
> This seems like asking for subtle bugs due to uncleared buffers bleeding into
> another operation (especially since we are writing this data across the wire).
> How about having an array the size of OAuthStep of unallocated buffers where
> each step uses its own?  Storing the content of each step could also be useful
> for debugging.  Looking at the state machine here it's not an obvious change but
> also not impossible.

I like that idea; I'll give it a look.

> + * TODO: This disables DNS resolution timeouts unless libcurl has been
> + * compiled against alternative resolution support. We should check that.
> curl_version_info() can be used to check for c-ares support.

+1

> + * so you don't have to write out the error handling every time. They assume
> + * that they're embedded in a function returning bool, however.
> It feels a bit iffy to encode the return type in the macro, we can use the same
> trick that DISABLE_SIGPIPE employs where a failaction is passed in.

+1

> +  if (!strcmp(name, field->name))
> Project style is to test for (strcmp(x,y) == 0) rather than (!strcmp()) to
> improve readability.

+1

> +  libpq_append_conn_error(conn, "out of memory");
> While not introduced in this patch, it's not an ideal pattern to report "out of
> memory" errors via a function which may allocate memory.

Does trying (and failing) to allocate more memory cause any harm? Best
case, we still have enough room in the errorMessage to fit the whole
error; worst case, we mark the errorMessage broken and then
PQerrorMessage() can handle it correctly.

> +  appendPQExpBufferStr(&conn->errorMessage,
> +           libpq_gettext("server's error message contained an embedded NULL"));
> We should maybe add ", discarding" or something similar after this string to
> indicate that there was an actual error which has been thrown away, the error
> wasn't that the server passed an embedded NULL.

+1

> +#ifdef USE_OAUTH
> +       else if (strcmp(mechanism_buf.data, OAUTHBEARER_NAME) == 0 &&
> +               !selected_mechanism)
> I wonder if we instead should move the guards inside the statement and error
> out with "not built with OAuth support" or something similar like how we do
> with TLS and other optional components?

This one seems like a step backwards. IIUC, the client can currently
handle a situation where the server returns multiple mechanisms
(though the server doesn't support that yet), and I'd really like to
make use of that property without making users upgrade libpq.

That said, it'd be good to have a more specific error message in the
case where we don't have a match...

> +   errdetail("Comma expected, but found character %s.",
> +             sanitize_char(*p))));
> The %s formatter should be wrapped like '%s' to indicate that the message part
> is the character in question (and we can then reuse the translation since the
> error message already exist for SCRAM).

+1

> +       temp = curl_slist_append(temp, "authorization_code");
> +       if (!temp)
> +           oom = true;
> +
> +       temp = curl_slist_append(temp, "implicit");
> While not a bug per se, it reads a bit odd to call another operation that can
> allocate memory when the oom flag has been set.  I think we can move some
> things around a little to make it clearer.

I'm not a huge fan of nested happy paths/pyramids of doom, but in this
case it's small enough that I'm not opposed. :D

> The attached diff contains some (most?) of the above as a patch on top of your
> v17, but as a .txt to keep the CFBot from munging on it.

Thanks very much! I plan to apply all but the USE_OAUTH guard change
(but let me know if you feel strongly about it).

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 29, 2024 at 1:08 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> +   /* TODO */
> +   CHECK_SETOPT(actx, CURLOPT_WRITEDATA, stderr);
> I might be missing something, but what this is intended for in
> setup_curl_handles()?

Ah, that's cruft left over from early debugging, just so that I could
see what was going on. I'll remove it.

> --- /dev/null
> +++ b/src/interfaces/libpq/fe-auth-oauth-iddawc.c
> As discussed off-list I think we should leave iddawc support for later and
> focus on getting one library properly supported to start with.  If you agree,
> let's drop this from the patchset to make it easier to digest.  We should make
> sure we keep pluggability such that another library can be supported though,
> much like the libpq TLS support.

Agreed. The number of changes being folded into the next set is
already pretty big so I think this will wait until next+1.

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 29, 2024 at 4:04 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> On Wed, Feb 28, 2024 at 9:40 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> > +       temp = curl_slist_append(temp, "authorization_code");
> > +       if (!temp)
> > +           oom = true;
> > +
> > +       temp = curl_slist_append(temp, "implicit");
> > While not a bug per se, it reads a bit odd to call another operation that can
> > allocate memory when the oom flag has been set.  I think we can move some
> > things around a little to make it clearer.
>
> I'm not a huge fan of nested happy paths/pyramids of doom, but in this
> case it's small enough that I'm not opposed. :D

I ended up rewriting this patch hunk a bit to handle earlier OOM
failures; let me know what you think.

--

v18 is the result of plenty of yak shaving now that the Windows build
is working. In addition to Daniel's changes as discussed upthread,
- I have rebased over v2 of the SASL-refactoring patches
- the last CompilerWarnings failure has been fixed
- the py.test suite now runs on Windows (but does not yet completely pass)
- py.test has been completely disabled for the 32-bit Debian test in
Cirrus; I don't know if there's a way to install 32-bit Python
side-by-side with 64-bit

We are now very, very close to green.

The new oauth_validator tests can't work on Windows, since the client
doesn't support OAuth there. The python/server tests can handle this
case, since they emulate the client behavior; do we want to try
something similar in Perl?

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 29, 2024 at 5:08 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> We are now very, very close to green.

v19 gets us a bit closer by adding a missed import for Windows. I've
also removed iddawc support, so the client patch is lighter.

> The new oauth_validator tests can't work on Windows, since the client
> doesn't support OAuth there. The python/server tests can handle this
> case, since they emulate the client behavior; do we want to try
> something similar in Perl?

In addition to this question, I'm starting to notice intermittent
failures of the form

    error: ... failed to fetch OpenID discovery document: failed to
queue HTTP request

This corresponds to a TODO in the libcurl implementation -- if the
initial call to curl_multi_socket_action() reports that no handles are
running, I treated that as an error. But it looks like it's possible
for libcurl to finish a request synchronously if the remote responds
quickly enough, so that needs to change.

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Mar 1, 2024 at 9:46 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> v19 gets us a bit closer by adding a missed import for Windows. I've
> also removed iddawc support, so the client patch is lighter.

v20 fixes a bunch more TODOs:
1) the client initial response is validated more closely
2) the server's invalid_token parameters are properly escaped into the
containing JSON (though, eventually, we probably want to just reject
invalid HBA settings instead of passing them through to the client)
3) Windows-specific responses have been recorded in the test suite

While poking at item 2, I was reminded that there's an alternative way
to get OAuth parameters from the server, and it's subtly incompatible
with the OpenID spec because OpenID didn't follow the rules for
.well-known URI construction [1]. :( Some sort of knob will be
required to switch the behaviors.

I renamed the API for the validator module from res->authenticated to
res->authorized. Authentication is optional, but a validator *must*
check that the client it's talking to was authorized by the user to
access the server, whether or not the user is authenticated. (It may
additionally verify that the user is authorized to access the
database, or it may simply authenticate the user and defer to the
usermap.) Documenting that particular subtlety is going to be
interesting...

The tests now exercise different issuers for different users, which
will also be a good way to signal the server to respond in different
ways during the validator tests. It does raise the question: if a
third party provides an issuer-specific module, how do we switch
between that and some other module for a different user?

Andrew asked over at [2] if we could perhaps get 0001 in as well. I
think the main thing to figure out there is, is requiring linkage
against libpq (see 0008) going to be okay for the frontend binaries
that need JSON support? Or do we need to do something like moving
PQExpBuffer into src/common to simplify the dependency tree?

--Jacob

[1] https://www.rfc-editor.org/rfc/rfc8414.html#section-5
[2] https://www.postgresql.org/message-id/682c8fff-355c-a04f-57ac-81055c4ccda8%40dunslane.net

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
v21 is a quick rebase over HEAD, which has adopted a few pieces of
v20. I've also fixed a race condition in the tests.

On Mon, Mar 11, 2024 at 3:51 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> Andrew asked over at [2] if we could perhaps get 0001 in as well. I
> think the main thing to figure out there is, is requiring linkage
> against libpq (see 0008) going to be okay for the frontend binaries
> that need JSON support? Or do we need to do something like moving
> PQExpBuffer into src/common to simplify the dependency tree?

0001 has been pared down to the part that teaches jsonapi.c to use
PQExpBuffer and track out-of-memory conditions; the linkage questions
remain.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 22 Mar 2024, at 19:21, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> v21 is a quick rebase over HEAD, which has adopted a few pieces of
> v20. I've also fixed a race condition in the tests.

Thanks for the rebase, I have a few comments from working with it a bit:

In jsonapi.c, makeJsonLexContextCstringLen initializes a JsonLexContext with
palloc0 which would need to be ported over to use ALLOC for frontend code.  On
that note, the error handling in parse_oauth_json() for content-type checks
attempts to free the JsonLexContext even before it has been created.  Here we
can just return false.


-  echo 'libpq must not be calling any function which invokes exit'; exit 1; \
+  echo 'libpq must not be calling any function which invokes exit'; \
The offending codepath in libcurl was in the NTLM_WB module, a very old and
obscure form of NTLM support which was replaced (yet remained in the tree) a
long time ago by a full NTLM implementation.  Based on the findings in this
thread it was deprecated with a removal date set to April 2024 [0].  A bug in
the 8.4.0 release however disconnected NTLM_WB from the build and given the
lack of complaints it was decided to leave as is, so we can base our libcurl
requirements on 8.4.0 while keeping the exit() check intact.


+  else if (strcasecmp(content_type, "application/json") != 0)
This needs to handle parameters as well since it will now fail if the charset
parameter is appended (which undoubtedly will be pretty common).  The easiest
way is probably to just verify the mediatype and skip the parameters since we
know it can only be charset?


+  /* TODO investigate using conn->Pfdebug and CURLOPT_DEBUGFUNCTION here */
+  CHECK_SETOPT(actx, CURLOPT_VERBOSE, 1L, return false);
+  CHECK_SETOPT(actx, CURLOPT_ERRORBUFFER, actx->curl_err, return false);
CURLOPT_ERRORBUFFER is the old and finicky way of extracting error messages, we
should absolutely move to using CURLOPT_DEBUGFUNCTION instead.


+  /* && response_code != 401 TODO */ )
Why is this marked with a TODO, do you remember?


+  print("# OAuth provider (PID $pid) is listening on port $port\n");
Code running under Test::More needs to use diag() for printing non-test output
like this.


Another issue I have is the sheer size and the fact that so much code is
replaced by subsequent commits, so I took the liberty to squash some of this
down into something less daunting.  The attached v22 retains the 0001 and then
condenses the rest into two commits for frontend and backend parts.  I did drop
the Python pytest patch since I feel that it's unlikely to go in from this
thread (adding pytest seems worthy of its own thread and discussion), and the
weight of it makes this seem scarier than it is.  For using it, it can be
easily applied from the v21 patchset independently.  I did tweak the commit
message to match reality a bit better, but there is a lot of work left there.

The final patch contains fixes for all of the above review comments as well as
some refactoring, smaller clean-ups and TODO fixing.  If these fixes are
accepted I'll incorporate them into the two commits.

Next I intend to work on writing documentation for this.

--
Daniel Gustafsson

[0] https://curl.se/dev/deprecate.html
[1] https://github.com/curl/curl/pull/12479


Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Mar 28, 2024 at 3:34 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> In jsonapi.c, makeJsonLexContextCstringLen initializes a JsonLexContext with
> palloc0 which would need to be ported over to use ALLOC for frontend code.

Seems reasonable (but see below, too).

> On
> that note, the error handling in parse_oauth_json() for content-type checks
> attempts to free the JsonLexContext even before it has been created.  Here we
> can just return false.

Agreed. They're zero-initialized, so freeJsonLexContext() is safe
IIUC, but it's clearer not to call the free function at all.

But for these additions:

> -   makeJsonLexContextCstringLen(&lex, resp->data, resp->len, PG_UTF8, true);
> +   if (!makeJsonLexContextCstringLen(&lex, resp->data, resp->len, PG_UTF8, true))
> +   {
> +       actx_error(actx, "out of memory");
> +       return false;
> +   }

...since we're using the stack-based API as opposed to the heap-based
API, they shouldn't be possible to hit. Any failures in createStrVal()
are deferred to parse time on purpose.

> -  echo 'libpq must not be calling any function which invokes exit'; exit 1; \
> +  echo 'libpq must not be calling any function which invokes exit'; \
> The offending codepath in libcurl was in the NTLM_WB module, a very old and
> obscure form of NTLM support which was replaced (yet remained in the tree) a
> long time ago by a full NTLM implementation.  Based on the findings in this
> thread it was deprecated with a removal date set to April 2024 [0].  A bug in
> the 8.4.0 release however disconnected NTLM_WB from the build and given the
> lack of complaints it was decided to leave as is, so we can base our libcurl
> requirements on 8.4.0 while keeping the exit() check intact.

Of the Cirrus machines, it looks like only FreeBSD has a new enough
libcurl for that. Ubuntu won't until 24.04, Debian Bookworm doesn't
have it unless you're running backports, RHEL 9 is still on 7.x... I
think requiring libcurl 8 is effectively saying no one will be able to
use this for a long time. Is there an alternative?

> +  else if (strcasecmp(content_type, "application/json") != 0)
> This needs to handle parameters as well since it will now fail if the charset
> parameter is appended (which undoubtedly will be pretty common).  The easiest
> way is probably to just verify the mediatype and skip the parameters since we
> know it can only be charset?

Good catch. application/json no longer defines charsets officially
[1], so I think we should be able to just ignore them. The new
strncasecmp needs to handle a spurious prefix, too; I have that on my
TODO list.

> +  /* TODO investigate using conn->Pfdebug and CURLOPT_DEBUGFUNCTION here */
> +  CHECK_SETOPT(actx, CURLOPT_VERBOSE, 1L, return false);
> +  CHECK_SETOPT(actx, CURLOPT_ERRORBUFFER, actx->curl_err, return false);
> CURLOPT_ERRORBUFFER is the old and finicky way of extracting error messages, we
> should absolutely move to using CURLOPT_DEBUGFUNCTION instead.

This new way doesn't do the same thing. Here's a sample error:

    connection to server at "127.0.0.1", port 56619 failed: failed to
fetch OpenID discovery document: Weird server reply (  Trying
127.0.0.1:36647...
    Connected to localhost (127.0.0.1) port 36647 (#0)
    Mark bundle as not supporting multiuse
    HTTP 1.0, assume close after body
    Invalid Content-Length: value
    Closing connection 0
    )

IMO that's too much noise. Prior to the change, the same error would have been

    connection to server at "127.0.0.1", port 56619 failed: failed to
fetch OpenID discovery document: Weird server reply (Invalid
Content-Length: value)

The error buffer is finicky for sure, but it's also a generic one-line
explanation of what went wrong... Is there an alternative API for that
I'm missing?

> +  /* && response_code != 401 TODO */ )
> Why is this marked with a TODO, do you remember?

Yeah -- I have a feeling that 401s coming back are going to need more
helpful hints to the user, since it implies that libpq itself hasn't
authenticated correctly as opposed to some user-related auth failure.
I was hoping to find some sample behaviors in the wild and record
those into the suite.

> +  print("# OAuth provider (PID $pid) is listening on port $port\n");
> Code running under Test::More needs to use diag() for printing non-test output
> like this.

Ah, thanks.

> +#if LIBCURL_VERSION_MAJOR <= 8 && LIBCURL_VERSION_MINOR < 4

I don't think this catches versions like 7.76, does it? Maybe
`LIBCURL_VERSION_MAJOR < 8 || (LIBCURL_VERSION_MAJOR == 8 &&
LIBCURL_VERSION_MINOR < 4)`, or else `LIBCURL_VERSION_NUM < 0x080400`?

>     my $pid = open(my $read_fh, "-|", $ENV{PYTHON}, "t/oauth_server.py")
> -       // die "failed to start OAuth server: $!";
> +       or die "failed to start OAuth server: $!";
>
> -   read($read_fh, $port, 7) // die "failed to read port number: $!";
> +   read($read_fh, $port, 7) or die "failed to read port number: $!";

The first hunk here looks good (thanks for the catch!) but I think the
second is not correct behavior. $! doesn't get set unless undef is
returned, if I'm reading the docs correctly. Yay Perl.

> +   /* Sanity check the previous operation */
> +   if (actx->running != 1)
> +   {
> +       actx_error(actx, "failed to queue HTTP request");
> +       return false;
> +   }

`running` can be set to zero on success, too. I'm having trouble
forcing that code path in a test so far, but we're going to have to do
something special in that case.

> Another issue I have is the sheer size and the fact that so much code is
> replaced by subsequent commits, so I took the liberty to squash some of this
> down into something less daunting.  The attached v22 retains the 0001 and then
> condenses the rest into two commits for frontend and backend parts.

Looks good.

> I did drop
> the Python pytest patch since I feel that it's unlikely to go in from this
> thread (adding pytest seems worthy of its own thread and discussion), and the
> weight of it makes this seem scarier than it is.

Until its coverage gets ported over, can we keep it as a `DO NOT
MERGE` patch? Otherwise there's not much to run in Cirrus.

> The final patch contains fixes for all of the above review comments as well as
> some refactoring, smaller clean-ups and TODO fixing.  If these fixes are
> accepted I'll incorporate them into the two commits.
>
> Next I intend to work on writing documentation for this.

Awesome, thank you! I will start adding coverage to the new code paths.

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc7159#section-11