Thread: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion
Hi all,

We've been working on ways to expand the list of third-party auth
methods that Postgres provides. Some example use cases might be "I want
to let anyone with a Google account read this table" or "let anyone who
belongs to this GitHub organization connect as a superuser".

Attached is a proof of concept that implements pieces of OAuth 2.0
federated authorization, via the OAUTHBEARER SASL mechanism from RFC
7628 [1]. Currently, only Linux is supported due to some ugly hacks in
the backend.

The architecture can support the following use cases, as long as your
OAuth issuer of choice implements the necessary specs, and you know how
to write a validator for your issuer's bearer tokens:

- Authentication only, where an external validator uses the bearer
token to determine the end user's identity, and Postgres decides
whether that user ID is authorized to connect via the standard pg_ident
user mapping.

- Authorization only, where the validator uses the bearer token to
determine the allowed roles for the end user, and then checks to make
sure that the connection's role is one of those. This bypasses pg_ident
and allows pseudonymous connections, where Postgres doesn't care who
you are as long as the token proves you're allowed to assume the role
you want.

- A combination, where the validator provides both an authn_id (for
later audits of database access) and an authorization decision based on
the bearer token and role provided.

It looks kinda like this during use:

    $ psql 'host=example.org oauth_client_id=f02c6361-0635-...'
    Visit https://oauth.example.org/login and enter the code: FPQ2-M4BG

= Quickstart =

For anyone who likes building and seeing green tests ASAP.

Prerequisite software:
- iddawc v0.9.9 [2], library and dev headers, for client support
- Python 3, for the test suite only

(Some newer distributions have dev packages for iddawc, but mine did
not.)

Configure using --with-oauth. (If you've installed iddawc into a
non-standard location, be sure to use --with-includes and
--with-libraries, and make sure either rpath or LD_LIBRARY_PATH will
get you what you need.) Install as usual.

To run the test suite, make sure the contrib/authn_id extension is
installed, then init and start your dev cluster. No other configuration
is required; the test will do it for you. Switch to the src/test/python
directory, point your PG* envvars to a superuser connection on the
cluster (so that a "bare" psql will connect automatically), and run
`make installcheck`.

= Production Setup =

(but don't use this in production, please)

Actually setting up a "real" system requires knowing the specifics of
your third-party issuer of choice. Your issuer MUST implement OpenID
Discovery and the OAuth Device Authorization flow! Seriously, check
this before spending a lot of time writing a validator against an
issuer that can't actually talk to libpq.

The broad strokes are as follows:

1. Register a new public client with your issuer to get an OAuth client
ID for libpq. You'll use this as the oauth_client_id in the connection
string. (If your issuer doesn't support public clients and gives you a
client secret, you can use the oauth_client_secret connection parameter
to provide that too.)

The client you register must be able to use a device authorization
flow; some issuers require additional setup for that.

2. Set up your HBA with the 'oauth' auth method, and set the 'issuer'
and 'scope' options. 'issuer' is the base URL identifying your third-
party issuer (for example, https://accounts.google.com), and 'scope' is
the set of OAuth scopes that the client and server will need to
authenticate and/or authorize the user (e.g. "openid email").

So a sample HBA line might look like

    host all all samehost oauth issuer="https://accounts.google.com" scope="openid email"

3. In postgresql.conf, set up an oauth_validator_command that's capable
of verifying bearer tokens and implements the validator protocol. This
is the hardest part. See below.

= Design =

On the client side, I've implemented the Device Authorization flow (RFC
8628, [3]). What this means in practice is that libpq reaches out to a
third-party issuer (e.g. Google, Azure, etc.), identifies itself with a
client ID, and requests permission to act on behalf of the end user.
The issuer responds with a login URL and a one-time code, which libpq
presents to the user using the notice hook. The end user then navigates
to that URL, presents their code, authenticates to the issuer, and
grants permission for libpq to retrieve a bearer token. libpq grabs a
token and sends it to the server for verification.
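
The first half of that dance can be sketched in a few lines of Python.
This is purely illustrative: the issuer URL is a placeholder, the client
ID is the truncated example from above, and the endpoint names come from
the issuer's discovery document per RFC 8628 and OIDC Discovery — it is
not the patch's implementation.

```python
# Illustrative sketch of starting the Device Authorization flow
# (RFC 8628), roughly what libpq does internally with iddawc.
import json
import urllib.parse
import urllib.request

def discover(issuer):
    """Fetch the issuer's OIDC discovery document."""
    url = issuer.rstrip("/") + "/.well-known/openid-configuration"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def build_device_request(client_id, scope):
    """Form-encoded body for the device authorization request."""
    return urllib.parse.urlencode({"client_id": client_id, "scope": scope})

def start_device_flow(issuer, client_id, scope):
    """Request a user code; returns the JSON described in RFC 8628 §3.2."""
    config = discover(issuer)
    body = build_device_request(client_id, scope).encode()
    with urllib.request.urlopen(config["device_authorization_endpoint"],
                                data=body) as resp:
        return json.load(resp)

# Usage (requires a live issuer):
#   grant = start_device_flow("https://oauth.example.org",
#                             "f02c6361-0635-...", "openid email")
#   print("Visit %s and enter the code: %s"
#         % (grant["verification_uri"], grant["user_code"]))
```

The user code and verification URI in that response are what end up in
the notice-hook prompt shown in the psql example above.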

(The bearer token, in this setup, is essentially a plaintext password,
and you must secure it like you would a plaintext password. The token
has an expiration date and can be explicitly revoked, which makes it
slightly better than a password, but this is still a step backwards
from something like SCRAM with channel binding. There are ways to bind
a bearer token to a client certificate [4], which would mitigate the
risk of token theft -- but your issuer has to support that, and I
haven't found much support in the wild.)

The server side is where things get more difficult for the DBA. The
OAUTHBEARER spec has this to say about the server-side implementation:

   The server validates the response according to the specification for
   the OAuth Access Token Types used.

And here's what the Bearer Token specification [5] says:

   This document does not specify the encoding or the contents of the
   token; hence, detailed recommendations about the means of
   guaranteeing token integrity protection are outside the scope of
   this document.

It's the Wild West. Every issuer does their own thing in their own
special way. Some don't really give you a way to introspect information
about a bearer token at all, because they assume that the issuer of the
token and the consumer of the token are essentially the same service.
Some major players provide their own custom libraries, implemented in
your-language-of-choice, to deal with their particular brand of magic.

So I punted and added the oauth_validator_command GUC. A token
validator command reads the bearer token from a file descriptor that's
passed to it, then does whatever magic is necessary to validate that
token and find out who owns it. Optionally, it can look at the role
that's being connected and make sure that the token authorizes the user
to actually use that role. Then it says yea or nay to Postgres, and
optionally tells the server who the user is so that their ID can be
logged and mapped through pg_ident.

(See the commit message in 0005 for a full description of the protocol.
The test suite also has two toy implementations that illustrate the
protocol, but they provide zero security.)
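
To give a feel for the shape of such a command, here's a toy validator
in the same spirit — it provides zero security. The I/O details here
(token arriving on a file descriptor named on the command line, role via
an environment variable, authn_id reported on stdout, exit status as the
yea/nay) are assumptions for illustration only; the real protocol is the
one described in 0005.

```python
# Toy oauth_validator_command sketch. A real validator would introspect
# the token with the issuer instead of using a static table, and would
# follow the actual protocol from patch 0005.
import os
import sys

# token -> (authn_id, roles that token may assume)
FAKE_TOKENS = {
    "sekrit-token": ("alice@example.org", {"alice", "readers"}),
}

def validate(token, role):
    """Return the authn_id if the token is valid and authorizes the role."""
    entry = FAKE_TOKENS.get(token)
    if entry is None:
        return None                     # unknown token: reject
    authn_id, allowed_roles = entry
    if role not in allowed_roles:
        return None                     # token doesn't authorize this role
    return authn_id

if __name__ == "__main__" and len(sys.argv) > 1:
    fd = int(sys.argv[1])                           # fd passed by the server
    token = os.read(fd, 8192).decode().strip()
    authn_id = validate(token, os.environ.get("PGROLE", ""))
    if authn_id is None:
        sys.exit(1)                                 # nay
    print(authn_id)                                 # yea, with identity
```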

This is easily the worst part of the patch, not only because my
implementation is a bad hack on OpenPipeStream(), but because it
balances the security of the entire system on the shoulders of a DBA
who does not have time to read umpteen OAuth specifications cover to
cover. More thought and coding effort is needed here, but I didn't want
to gold-plate a bad design. I'm not sure what alternatives there are
within the rules laid out by OAUTHBEARER. And the system is _extremely_
flexible, in the way that only code that's maintained by somebody else
can be.

= Patchset Roadmap =

The seven patches can be grouped into three categories:

1. Prep

  0001 decouples the SASL code from the SCRAM implementation.
  0002 makes it possible to use common/jsonapi from the frontend.
  0003 lets the json_errdetail() result be freed, to avoid leaks.

2. OAUTHBEARER Implementation

  0004 implements the client with libiddawc.
  0005 implements server HBA support and oauth_validator_command.

3. Testing

  0006 adds a simple test extension to retrieve the authn_id.
  0007 adds the Python test suite I've been developing against.

The first three patches are, hopefully, generally useful outside of
this implementation, and I'll plan to register them in the next
commitfest. The middle two patches are the "interesting" pieces, and
I've split them into client and server for ease of understanding,
though neither is particularly useful without the other.

The last two patches grew out of a test suite that I originally built
to be able to exercise NSS corner cases at the protocol/byte level. It
was incredibly helpful during implementation of this new SASL
mechanism, since I could write the client and server independently of
each other and get high coverage of broken/malicious implementations.
It's based on pytest and Construct, and the Python 3 requirement might
turn some away, but I wanted to include it in case anyone else wanted
to hack on the code. src/test/python/README explains more.

= Thoughts/Reflections =

...in no particular order.

I picked OAuth 2.0 as my first experiment in federated auth mostly
because I was already familiar with pieces of it. I think SAML (via the
SAML20 mechanism, RFC 6595) would be a good companion to this proof of
concept, if there is general interest in federated deployments.

I don't really like the OAUTHBEARER spec, but I'm not sure there's a
better alternative. Everything is left as an exercise for the reader.
It's not particularly extensible. Standard OAuth is built for
authorization, not authentication, and from reading the RFC's history,
it feels like it was a hack to just get something working. New
standards like OpenID Connect have begun to fill in the gaps, but the
SASL mechanisms have not kept up. (The OPENID20 mechanism is, to my
understanding, unrelated/obsolete.) And support for helpful OIDC
features seems to be spotty in the real world.

The iddawc dependency for client-side OAuth was extremely helpful to
develop this proof of concept quickly, but I don't think it would be an
appropriate component to build a real feature on. It's extremely
heavyweight -- it incorporates a huge stack of dependencies, including
a logging framework and a web server, to implement features we would
probably never use -- and it's fairly difficult to debug in practice.
If a device authorization flow were the only thing that libpq needed to
support natively, I think we should just depend on a widely used HTTP
client, like libcurl or neon, and implement the minimum spec directly
against the existing test suite.

There are a huge number of other authorization flows besides Device
Authorization; most would involve libpq automatically opening a web
browser for you. I felt like that wasn't an appropriate thing for a
library to do by default, especially when one of the most important
clients is a command-line application. Perhaps there could be a hook
for applications to be able to override the builtin flow and substitute
their own.

Since bearer tokens are essentially plaintext passwords, the relevant
specs require the use of transport-level protection, and I think it'd
be wise for the client to require TLS to be in place before performing
the initial handshake or sending a token.

Not every OAuth issuer is also an OpenID Discovery provider, so it's
frustrating that OAUTHBEARER (which is purportedly an OAuth 2.0
feature) requires OIDD for real-world implementations. Perhaps we could
hack around this with a data: URI or something.

The client currently performs the OAuth login dance every single time a
connection is made, but a proper OAuth client would cache its tokens to
reuse later, and keep an eye on their expiration times. This would make
daily use a little more like that of Kerberos, but we would have to
design a way to create and secure a token cache on disk.

If you've read this far, thank you for your interest, and I hope you
enjoy playing with it!

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc7628
[2] https://github.com/babelouest/iddawc
[3] https://datatracker.ietf.org/doc/html/rfc8628
[4] https://datatracker.ietf.org/doc/html/rfc8705
[5] https://datatracker.ietf.org/doc/html/rfc6750#section-5.2


Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Michael Paquier
On Tue, Jun 08, 2021 at 04:37:46PM +0000, Jacob Champion wrote:
> 1. Prep
>
>   0001 decouples the SASL code from the SCRAM implementation.
>   0002 makes it possible to use common/jsonapi from the frontend.
>   0003 lets the json_errdetail() result be freed, to avoid leaks.
>
> The first three patches are, hopefully, generally useful outside of
> this implementation, and I'll plan to register them in the next
> commitfest. The middle two patches are the "interesting" pieces, and
> I've split them into client and server for ease of understanding,
> though neither is particularly useful without the other.

Beginning with the beginning, could you spawn two threads for the
jsonapi rework and the SASL/SCRAM business?  I agree that these look
independently useful.  Glad to see someone improving the code with
SASL and SCRAM which are too inter-dependent now.  I saw in the RFCs
dedicated to OAUTH the need for the JSON part as well.

+#  define check_stack_depth()
+#  ifdef JSONAPI_NO_LOG
+#    define json_log_and_abort(...) \
+   do { fprintf(stderr, __VA_ARGS__); exit(1); } while(0)
+#  else
In patch 0002, this is the wrong approach.  libpq will not be able to
feed on such reports, and you cannot use any of the APIs from the
palloc() family either as these just fail on OOM.  libpq should be
able to know about the error, and would fill in the error back to the
application.  This abstraction is not necessary on HEAD as
pg_verifybackup is fine with this level of reporting.  My rough guess
is that we will need to split the existing jsonapi.c into two files,
one that can be used in shared libraries and a second that handles the
errors.

+           /* TODO: SASL_EXCHANGE_FAILURE with output is forbidden in SASL */
            if (result == SASL_EXCHANGE_SUCCESS)
                sendAuthRequest(port,
                            AUTH_REQ_SASL_FIN,
                            output,
                            outputlen);
Perhaps that's an issue we need to worry about on its own?  I didn't
recall this part...
--
Michael


Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Heikki Linnakangas
On 08/06/2021 19:37, Jacob Champion wrote:
> We've been working on ways to expand the list of third-party auth
> methods that Postgres provides. Some example use cases might be "I want
> to let anyone with a Google account read this table" or "let anyone who
> belongs to this GitHub organization connect as a superuser".

Cool!

> The iddawc dependency for client-side OAuth was extremely helpful to
> develop this proof of concept quickly, but I don't think it would be an
> appropriate component to build a real feature on. It's extremely
> heavyweight -- it incorporates a huge stack of dependencies, including
> a logging framework and a web server, to implement features we would
> probably never use -- and it's fairly difficult to debug in practice.
> If a device authorization flow were the only thing that libpq needed to
> support natively, I think we should just depend on a widely used HTTP
> client, like libcurl or neon, and implement the minimum spec directly
> against the existing test suite.

You could punt and let the application implement that stuff. I'm 
imagining that the application code would look something like this:

conn = PQconnectStartParams(...);
for (;;)
{
     status = PQconnectPoll(conn)
     switch (status)
     {
         case CONNECTION_SASL_TOKEN_REQUIRED:
             /* open a browser for the user, get token */
             token = open_browser()
             PQauthResponse(token);
             break;
         ...
     }
}

It would be nice to have a simple default implementation, though, for 
psql and all the other client applications that come with PostgreSQL itself.

> If you've read this far, thank you for your interest, and I hope you
> enjoy playing with it!

A few small things caught my eye in the backend oauth_exchange function:

> +       /* Handle the client's initial message. */
> +       p = strdup(input);

this strdup() should be pstrdup().

In the same function, there are a bunch of reports like this:

>                    ereport(ERROR,
> +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> +                           errmsg("malformed OAUTHBEARER message"),
> +                           errdetail("Comma expected, but found character \"%s\".",
> +                                     sanitize_char(*p))));

I don't think the double quotes are needed here, because sanitize_char 
will return quotes if it's a single character. So it would end up 
looking like this: ... found character "'x'".

- Heikki



Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion
On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> On 08/06/2021 19:37, Jacob Champion wrote:
> > We've been working on ways to expand the list of third-party auth
> > methods that Postgres provides. Some example use cases might be "I want
> > to let anyone with a Google account read this table" or "let anyone who
> > belongs to this GitHub organization connect as a superuser".
> 
> Cool!

Glad you think so! :D

> > The iddawc dependency for client-side OAuth was extremely helpful to
> > develop this proof of concept quickly, but I don't think it would be an
> > appropriate component to build a real feature on. It's extremely
> > heavyweight -- it incorporates a huge stack of dependencies, including
> > a logging framework and a web server, to implement features we would
> > probably never use -- and it's fairly difficult to debug in practice.
> > If a device authorization flow were the only thing that libpq needed to
> > support natively, I think we should just depend on a widely used HTTP
> > client, like libcurl or neon, and implement the minimum spec directly
> > against the existing test suite.
> 
> You could punt and let the application implement that stuff. I'm 
> imagining that the application code would look something like this:
> 
> conn = PQconnectStartParams(...);
> for (;;)
> {
>      status = PQconnectPoll(conn)
>      switch (status)
>      {
>          case CONNECTION_SASL_TOKEN_REQUIRED:
>              /* open a browser for the user, get token */
>              token = open_browser()
>              PQauthResponse(token);
>              break;
>          ...
>      }
> }

I was toying with the idea of having a callback for libpq clients,
where they could take full control of the OAuth flow if they wanted to.
Doing it inline with PQconnectPoll seems like it would work too. It has
a couple of drawbacks that I can see:

- If a client isn't currently using a poll loop, they'd have to switch
to one to be able to use OAuth connections. Not a difficult change, but
considering all the other hurdles to making this work, I'm hoping to
minimize the hoop-jumping.

- A client would still have to receive a bunch of OAuth parameters from
some new libpq API in order to construct the correct URL to visit, so
the overall complexity for implementers might be higher than if we just
passed those params directly in a callback.

> It would be nice to have a simple default implementation, though, for 
> psql and all the other client applications that come with PostgreSQL itself.

I agree. I think having a bare-bones implementation in libpq itself
would make initial adoption *much* easier, and then if specific
applications wanted to have richer control over an authorization flow,
then they could implement that themselves with the aforementioned
callback.

The Device Authorization flow was the most minimal working
implementation I could find, since it doesn't require a web browser on
the system, just the ability to print a prompt to the console. But if
anyone knows of a better flow for this use case, I'm all ears.

> > If you've read this far, thank you for your interest, and I hope you
> > enjoy playing with it!
> 
> A few small things caught my eye in the backend oauth_exchange function:
> 
> > +       /* Handle the client's initial message. */
> > +       p = strdup(input);
> 
> this strdup() should be pstrdup().

Thanks, I'll fix that in the next re-roll.

> In the same function, there are a bunch of reports like this:
> 
> >                    ereport(ERROR,
> > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > +                           errmsg("malformed OAUTHBEARER message"),
> > +                           errdetail("Comma expected, but found character \"%s\".",
> > +                                     sanitize_char(*p))));
> 
> I don't think the double quotes are needed here, because sanitize_char 
> will return quotes if it's a single character. So it would end up 
> looking like this: ... found character "'x'".

I'll fix this too. Thanks!

--Jacob

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion
On Fri, 2021-06-18 at 13:07 +0900, Michael Paquier wrote:
> On Tue, Jun 08, 2021 at 04:37:46PM +0000, Jacob Champion wrote:
> > 1. Prep
> > 
> >   0001 decouples the SASL code from the SCRAM implementation.
> >   0002 makes it possible to use common/jsonapi from the frontend.
> >   0003 lets the json_errdetail() result be freed, to avoid leaks.
> > 
> > The first three patches are, hopefully, generally useful outside of
> > this implementation, and I'll plan to register them in the next
> > commitfest. The middle two patches are the "interesting" pieces, and
> > I've split them into client and server for ease of understanding,
> > though neither is particularly useful without the other.
> 
> Beginning with the beginning, could you spawn two threads for the
> jsonapi rework and the SASL/SCRAM business?

Done [1, 2]. I've copied your comments into those threads with my
responses, and I'll have them registered in commitfest shortly.

Thanks!
--Jacob

[1] https://www.postgresql.org/message-id/3d2a6f5d50e741117d6baf83eb67ebf1a8a35a11.camel%40vmware.com
[2] https://www.postgresql.org/message-id/a250d475ba1c0cc0efb7dfec8e538fcc77cdcb8e.camel%40vmware.com

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Michael Paquier
On Tue, Jun 22, 2021 at 11:26:03PM +0000, Jacob Champion wrote:
> Done [1, 2]. I've copied your comments into those threads with my
> responses, and I'll have them registered in commitfest shortly.

Thanks!
--
Michael


Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion
On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> > 
> > A few small things caught my eye in the backend oauth_exchange function:
> > 
> > > +       /* Handle the client's initial message. */
> > > +       p = strdup(input);
> > 
> > this strdup() should be pstrdup().
> 
> Thanks, I'll fix that in the next re-roll.
> 
> > In the same function, there are a bunch of reports like this:
> > 
> > >                    ereport(ERROR,
> > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > +                           errmsg("malformed OAUTHBEARER message"),
> > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > +                                     sanitize_char(*p))));
> > 
> > I don't think the double quotes are needed here, because sanitize_char 
> > will return quotes if it's a single character. So it would end up 
> > looking like this: ... found character "'x'".
> 
> I'll fix this too. Thanks!

v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.

The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.

--Jacob


Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Zhihong Yu


On Wed, Aug 25, 2021 at 11:42 AM Jacob Champion <pchampion@vmware.com> wrote:
On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> >
> > A few small things caught my eye in the backend oauth_exchange function:
> >
> > > +       /* Handle the client's initial message. */
> > > +       p = strdup(input);
> >
> > this strdup() should be pstrdup().
>
> Thanks, I'll fix that in the next re-roll.
>
> > In the same function, there are a bunch of reports like this:
> >
> > >                    ereport(ERROR,
> > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > +                           errmsg("malformed OAUTHBEARER message"),
> > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > +                                     sanitize_char(*p))));
> >
> > I don't think the double quotes are needed here, because sanitize_char
> > will return quotes if it's a single character. So it would end up
> > looking like this: ... found character "'x'".
>
> I'll fix this too. Thanks!

v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.

The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.

--Jacob
Hi,
For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :

+   /* Clean up. */
+   termJsonLexContext(&lex); 

At the end of termJsonLexContext(), empty is copied to lex. For stack based JsonLexContext, the copy seems unnecessary.
Maybe introduce a boolean parameter for termJsonLexContext() to signal that the copy can be omitted ?

+#ifdef FRONTEND
+       /* make sure initialization succeeded */
+       if (lex->strval == NULL)
+           return JSON_OUT_OF_MEMORY;

Should PQExpBufferBroken(lex->strval) be used for the check ?

Thanks

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Zhihong Yu


On Wed, Aug 25, 2021 at 3:25 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Wed, Aug 25, 2021 at 11:42 AM Jacob Champion <pchampion@vmware.com> wrote:
On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> >
> > A few small things caught my eye in the backend oauth_exchange function:
> >
> > > +       /* Handle the client's initial message. */
> > > +       p = strdup(input);
> >
> > this strdup() should be pstrdup().
>
> Thanks, I'll fix that in the next re-roll.
>
> > In the same function, there are a bunch of reports like this:
> >
> > >                    ereport(ERROR,
> > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > +                           errmsg("malformed OAUTHBEARER message"),
> > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > +                                     sanitize_char(*p))));
> >
> > I don't think the double quotes are needed here, because sanitize_char
> > will return quotes if it's a single character. So it would end up
> > looking like this: ... found character "'x'".
>
> I'll fix this too. Thanks!

v2, attached, incorporates Heikki's suggested fixes and also rebases on
top of latest HEAD, which had the SASL refactoring changes committed
last month.

The biggest change from the last patchset is 0001, an attempt at
enabling jsonapi in the frontend without the use of palloc(), based on
suggestions by Michael and Tom from last commitfest. I've also made
some improvements to the pytest suite. No major changes to the OAuth
implementation yet.

--Jacob
Hi,
For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :

+   /* Clean up. */
+   termJsonLexContext(&lex); 

At the end of termJsonLexContext(), empty is copied to lex. For stack based JsonLexContext, the copy seems unnecessary.
Maybe introduce a boolean parameter for termJsonLexContext() to signal that the copy can be omitted ?

+#ifdef FRONTEND
+       /* make sure initialization succeeded */
+       if (lex->strval == NULL)
+           return JSON_OUT_OF_MEMORY;

Should PQExpBufferBroken(lex->strval) be used for the check ?

Thanks
Hi,
For v2-0002-libpq-add-OAUTHBEARER-SASL-mechanism.patch :

+   i_init_session(&session);
+
+   if (!conn->oauth_client_id)
+   {
+       /* We can't talk to a server without a client identifier. */
+       appendPQExpBufferStr(&conn->errorMessage,
+                            libpq_gettext("no oauth_client_id is set for the connection"));
+       goto cleanup;

Can conn->oauth_client_id check be performed ahead of i_init_session() ? That way, ```goto cleanup``` can be replaced with return.

+       if (!error_code || (strcmp(error_code, "authorization_pending")
+                           && strcmp(error_code, "slow_down")))

What if, in the future, there is error code different from the above two which doesn't represent "OAuth token retrieval failed" scenario ?

For client_initial_response(),

+   token_buf = createPQExpBuffer();
+   if (!token_buf)
+       goto cleanup;

If token_buf is NULL, there doesn't seem to be anything to free. We can return directly.

Cheers

Re: [PoC] Federated Authn/z with OAUTHBEARER

From: Jacob Champion
On Wed, 2021-08-25 at 15:25 -0700, Zhihong Yu wrote:
> 
> Hi,
> For v2-0001-common-jsonapi-support-FRONTEND-clients.patch :
> 
> +   /* Clean up. */
> +   termJsonLexContext(&lex); 
> 
> At the end of termJsonLexContext(), empty is copied to lex. For stack
> based JsonLexContext, the copy seems unnecessary.
> Maybe introduce a boolean parameter for termJsonLexContext() to
> signal that the copy can be omitted ?

Do you mean heap-based? i.e. destroyJsonLexContext() does an
unnecessary copy before free? Yeah, in that case it's not super useful,
but I think I'd want some evidence that the performance hit matters
before optimizing it.

Are there any other internal APIs that take a boolean parameter like
that? If not, I think we'd probably just want to remove the copy
entirely if it's a problem.

> +#ifdef FRONTEND
> +       /* make sure initialization succeeded */
> +       if (lex->strval == NULL)
> +           return JSON_OUT_OF_MEMORY;
> 
> Should PQExpBufferBroken(lex->strval) be used for the check ?

It should be okay to continue if the strval is broken but non-NULL,
since it's about to be reset. That has the fringe benefit of allowing
the function to go as far as possible without failing, though that's
probably a pretty weak justification.

In practice, do you think that the probability of success is low enough
that we should just short-circuit and be done with it?

On Wed, 2021-08-25 at 16:24 -0700, Zhihong Yu wrote:
> 
> For v2-0002-libpq-add-OAUTHBEARER-SASL-mechanism.patch :
> 
> +   i_init_session(&session);
> +
> +   if (!conn->oauth_client_id)
> +   {
> +       /* We can't talk to a server without a client identifier. */
> +       appendPQExpBufferStr(&conn->errorMessage,
> +                            libpq_gettext("no oauth_client_id is set for the connection"));
> +       goto cleanup;
> 
> Can conn->oauth_client_id check be performed ahead
> of i_init_session() ? That way, ```goto cleanup``` can be replaced
> with return.

Yeah, I think that makes sense. FYI, this is probably one of the
functions that will be rewritten completely once iddawc is removed.

> +       if (!error_code || (strcmp(error_code, "authorization_pending")
> +                           && strcmp(error_code, "slow_down")))
> 
> What if, in the future, there is error code different from the above
> two which doesn't represent "OAuth token retrieval failed" scenario ?

We'd have to update our code; that would be a breaking change to the
Device Authorization spec. Here's what it says today [1]:

   The "authorization_pending" and "slow_down" error codes define
   particularly unique behavior, as they indicate that the OAuth client
   should continue to poll the token endpoint by repeating the token
   request (implementing the precise behavior defined above).  If the
   client receives an error response with any other error code, it MUST
   stop polling and SHOULD react accordingly, for example, by displaying
   an error to the user.
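In code form, the RFC 8628 rule quoted above boils down to something like the following sketch (illustrative names, not the patch's actual functions):

```c
#include <string.h>
#include <assert.h>

typedef enum
{
    POLL_AGAIN,         /* keep polling the token endpoint */
    POLL_AGAIN_SLOWER,  /* keep polling, but increase the interval */
    POLL_FAILED         /* stop polling and surface an error */
} PollAction;

/*
 * Decide what to do with an error response from the token endpoint, per
 * RFC 8628 sec. 3.5: only "authorization_pending" and "slow_down" let the
 * client continue; any other error code MUST stop the polling loop.
 */
static PollAction
next_poll_action(const char *error_code)
{
    if (error_code == NULL)
        return POLL_FAILED;     /* malformed response: treat as fatal */
    if (strcmp(error_code, "authorization_pending") == 0)
        return POLL_AGAIN;
    if (strcmp(error_code, "slow_down") == 0)
        return POLL_AGAIN_SLOWER;
    return POLL_FAILED;
}
```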

> For client_initial_response(),
> 
> +   token_buf = createPQExpBuffer();
> +   if (!token_buf)
> +       goto cleanup;
> 
> If token_buf is NULL, there doesn't seem to be anything to free. We
> can return directly.

That's true today, but implementations have a habit of changing. I
personally prefer not to introduce too many exit points from a function
that's already using goto. In my experience, that makes future
maintenance harder.

Thanks for the reviews! Have you been able to give the patchset a try
with an OAuth deployment?

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc8628#section-3.5

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Zhihong Yu
Date:


Hi,
bq. destroyJsonLexContext() does an unnecessary copy before free? Yeah, in that case it's not super useful,
but I think I'd want some evidence that the performance hit matters before optimizing it. 

Yes I agree.

bq. In practice, do you think that the probability of success is low enough that we should just short-circuit and be done with it?

Haven't had a chance to try your patches out yet.
I will leave this to people who are more familiar with OAuth implementation(s).

bq.  I personally prefer not to introduce too many exit points from a function that's already using goto.

Fair enough.

Cheers

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Michael Paquier
Date:
On Thu, Aug 26, 2021 at 04:13:08PM +0000, Jacob Champion wrote:
> Do you mean heap-based? i.e. destroyJsonLexContext() does an
> unnecessary copy before free? Yeah, in that case it's not super useful,
> but I think I'd want some evidence that the performance hit matters
> before optimizing it.

As an authentication code path, the impact is minimal and my take on
that would be to keep the code simple.  Now if you'd really wish to
stress that without relying on the backend, one simple way is to use
pgbench -C -n with a mostly-empty script (one meta-command) coupled
with some profiling.
--
Michael

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, 2021-08-27 at 11:32 +0900, Michael Paquier wrote:
> Now if you'd really wish to
> stress that without relying on the backend, one simple way is to use
> pgbench -C -n with a mostly-empty script (one meta-command) coupled
> with some profiling.

Ah, thanks! I'll add that to the toolbox.
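For anyone who wants to reproduce that measurement, the setup Michael describes might look roughly like this (illustrative; point it at your own server and run it under a profiler such as perf):

```shell
# A script containing a single meta-command, so each "transaction" does no
# real work and per-connection authentication cost dominates the profile.
printf '\\set dummy 0\n' > noop.sql

# -C opens a fresh connection for every transaction; -n skips vacuuming the
# standard pgbench tables (which this script never touches). Guarded so the
# sketch is harmless when no server is configured.
if command -v pgbench >/dev/null 2>&1 && [ -n "${PGHOST:-}" ]; then
    pgbench -C -n -f noop.sql -t 1000
fi
```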

--Jacob

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
Hi all,

v3 rebases this patchset over the top of Samay's pluggable auth
provider API [1], included here as patches 0001-3. The final patch in
the set ports the server implementation from a core feature to a
contrib module; to switch between the two approaches, simply leave out
that final patch.

There are still some backend changes that must be made to get this
working, as pointed out in 0009, and obviously libpq support still
requires code changes.

--Jacob

[1] https://www.postgresql.org/message-id/flat/CAJxrbyxTRn5P8J-p%2BwHLwFahK5y56PhK28VOb55jqMO05Y-DJw%40mail.gmail.com

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
samay sharma
Date:
Hi Jacob,

Thank you for porting this on top of the pluggable auth methods API. I've addressed the feedback around other backend changes in my latest patch, but the client side changes still remain. I had a few questions to understand them better.

(a) What specifically do the client side changes in the patch implement?
(b) Are the changes you made on the client side specific to OAUTH or are they about making SASL more generic? As an additional question, if someone wanted to implement something similar on top of your patch, would they still have to make client side changes?

Regards,
Samay


Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, 2022-03-22 at 14:48 -0700, samay sharma wrote:
> Thank you for porting this on top of the pluggable auth methods API.
> I've addressed the feedback around other backend changes in my latest
> patch, but the client side changes still remain. I had a few
> questions to understand them better.
> 
> (a) What specifically do the client side changes in the patch implement?

Hi Samay,

The client-side changes are an implementation of the OAuth 2.0 Device
Authorization Grant [1] in libpq. The majority of the OAuth logic is
handled by the third-party iddawc library.

The server tells the client what OIDC provider to contact, and then
libpq prompts you to log into that provider on your
smartphone/browser/etc. using a one-time code. After you give libpq
permission to act on your behalf, the Bearer token gets sent to libpq
via a direct connection, and libpq forwards it to the server so that
the server can determine whether you're allowed in.
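As background, the token forwarding at the end of that flow uses the OAUTHBEARER client initial response defined in RFC 7628, section 3.1: a GS2 header, then 0x01-delimited key/value pairs, then a final 0x01. A minimal sketch of its construction (illustrative code, not the patch's implementation):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

/*
 * Build an OAUTHBEARER client initial response per RFC 7628 sec. 3.1:
 * a GS2 header ("n,,": no channel binding, no authzid), a 0x01 separator,
 * the auth key/value pair with its own trailing 0x01, and a closing 0x01.
 * The caller frees the result.
 */
static char *
build_initial_response(const char *bearer_token, size_t *len)
{
    const char  kvsep = '\x01';
    size_t      n = strlen("n,,") + 1 + strlen("auth=Bearer ")
                    + strlen(bearer_token) + 2;
    char       *buf = malloc(n + 1);

    if (buf == NULL)
        return NULL;
    snprintf(buf, n + 1, "n,,%cauth=Bearer %s%c%c",
             kvsep, bearer_token, kvsep, kvsep);
    *len = n;
    return buf;
}
```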

> (b) Are the changes you made on the client side specific to OAUTH or
> are they about making SASL more generic?

The original patchset included changes to make SASL more generic. Many
of those changes have since been merged, and the remaining code is
mostly OAuth-specific, but there are still improvements to be made.
(And there's some JSON crud to sift through in the first couple of
patches. I'm still mad that the OAUTHBEARER spec requires clients to
parse JSON in the first place.)

> As an additional question,
> if someone wanted to implement something similar on top of your
> patch, would they still have to make client side changes?

Any new SASL mechanisms require changes to libpq at this point. You
need to implement a new pg_sasl_mech, modify pg_SASL_init() to select
the mechanism correctly, and add whatever connection string options you
need, along with the associated state in pg_conn. Patch 0004 has all
the client-side magic for OAUTHBEARER.

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc8628

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, 2022-03-04 at 19:13 +0000, Jacob Champion wrote:
> v3 rebases this patchset over the top of Samay's pluggable auth
> provider API [1], included here as patches 0001-3.

v4 rebases over the latest version of the pluggable auth patchset
(included as 0001-4). Note that there's a recent conflict as
of d4781d887; use an older commit as the base (or wait for the other
thread to be updated).

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi  Hackers,

We are trying to implement AAD (Azure AD) support in PostgreSQL, which
can be achieved with support for the OAuth method. To support AAD on
top of OAuth in a generic fashion (i.e. for all other OAuth providers
as well), we are proposing this patch. It exposes two new hooks (one
for error reporting and one for provider-specific token validation)
and passes the OAuth bearer token through to the backend. It also adds
support for OAuth's client credentials flow, in addition to the device
code flow which Jacob has proposed.

The changes for each component are summarized below.

1.     Provider-specific extension:
        Each OAuth provider implements their own token validator as an
extension. Extension registers an OAuth provider hook which is matched
to a line in the HBA file.

2.     Add support to pass on the OAuth bearer token. In this case,
obtaining the bearer token is left to the 3rd-party application or user.

        ./psql -U <username> -d 'dbname=postgres oauth_client_id=<client_id> oauth_bearer_token=<token>'

3.     HBA: An additional param ‘provider’ is added for the oauth method.
        Defining "oauth" as method + passing provider, issuer endpoint
and expected audience

        * * * * oauth   provider=<token validation extension>
issuer=.... scope=....

4.     Engine Backend:
        Support for the generic OAUTHBEARER type, requesting the
client to provide a token and passing the token to the
provider-specific extension.

5.     Engine Frontend: Two-tiered approach.
           a)      libpq transparently passes on the token received
from the 3rd-party client as-is to the backend.
           b)      libpq optionally compiled for clients which
explicitly need libpq to orchestrate OAuth communication with the
issuer (this depends heavily on the 3rd-party iddawc library, as Jacob
already pointed out; the library seems to support all the OAuth
flows).
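To make the shape of item 1 concrete, a provider extension under this proposal might register a validator roughly like the following. This is entirely hypothetical -- the struct, names, and signature illustrate the idea, not the patch's actual API:

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <assert.h>

/*
 * Hypothetical hook contract: inspect a bearer token, report the
 * authenticated identity, and decide whether the token authorizes the
 * requested role.
 */
typedef struct OAuthValidatorCallbacks
{
    const char *name;       /* matched against provider=<name> in the HBA line */
    bool      (*validate_token) (const char *token, const char *role,
                                 char *authn_id, size_t authn_id_len);
} OAuthValidatorCallbacks;

/*
 * Toy validator: accepts any token with the right prefix and reports a
 * fixed identity.  A real extension would verify the token's signature,
 * issuer, audience, expiry, and scopes against its provider.
 */
static bool
demo_validate_token(const char *token, const char *role,
                    char *authn_id, size_t authn_id_len)
{
    (void) role;            /* this toy ignores the requested role */
    if (strncmp(token, "demo-", 5) != 0)
        return false;
    snprintf(authn_id, authn_id_len, "user@example.org");
    return true;
}

static const OAuthValidatorCallbacks demo_provider = {
    .name = "demo",
    .validate_token = demo_validate_token,
};
```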

Please let us know your thoughts, as the proposed method supports
different OAuth flows with the use of provider-specific hooks. We
think the proposal would be useful for various OAuth providers.

Thanks,
Mahendrakar.


On Tue, 20 Sept 2022 at 10:18, Jacob Champion <pchampion@vmware.com> wrote:
>
> On Tue, 2021-06-22 at 23:22 +0000, Jacob Champion wrote:
> > On Fri, 2021-06-18 at 11:31 +0300, Heikki Linnakangas wrote:
> > >
> > > A few small things caught my eye in the backend oauth_exchange function:
> > >
> > > > +       /* Handle the client's initial message. */
> > > > +       p = strdup(input);
> > >
> > > this strdup() should be pstrdup().
> >
> > Thanks, I'll fix that in the next re-roll.
> >
> > > In the same function, there are a bunch of reports like this:
> > >
> > > >                    ereport(ERROR,
> > > > +                          (errcode(ERRCODE_PROTOCOL_VIOLATION),
> > > > +                           errmsg("malformed OAUTHBEARER message"),
> > > > +                           errdetail("Comma expected, but found character \"%s\".",
> > > > +                                     sanitize_char(*p))));
> > >
> > > I don't think the double quotes are needed here, because sanitize_char
> > > will return quotes if it's a single character. So it would end up
> > > looking like this: ... found character "'x'".
> >
> > I'll fix this too. Thanks!
>
> v2, attached, incorporates Heikki's suggested fixes and also rebases on
> top of latest HEAD, which had the SASL refactoring changes committed
> last month.
>
> The biggest change from the last patchset is 0001, an attempt at
> enabling jsonapi in the frontend without the use of palloc(), based on
> suggestions by Michael and Tom from last commitfest. I've also made
> some improvements to the pytest suite. No major changes to the OAuth
> implementation yet.
>
> --Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
Hi Mahendrakar, thanks for your interest and for the patch!

On Mon, Sep 19, 2022 at 10:03 PM mahendrakar s
<mahendrakarforpg@gmail.com> wrote:
> The changes for each component are summarized below.
>
> 1.     Provider-specific extension:
>         Each OAuth provider implements their own token validator as an
> extension. Extension registers an OAuth provider hook which is matched
> to a line in the HBA file.

How easy is it to write a Bearer validator using C? My limited
understanding was that most providers were publishing libraries in
higher-level languages.

Along those lines, sample validators will need to be provided, both to
help in review and to get the pytest suite green again. (And coverage
for the new code is important, too.)

> 2.     Add support to pass on the OAuth bearer token. In this
> obtaining the bearer token is left to 3rd party application or user.
>
>         ./psql -U <username> -d 'dbname=postgres oauth_client_id=<client_id> oauth_bearer_token=<token>'

This hurts, but I think people are definitely going to ask for it, given
the frightening practice of copy-pasting these (incredibly sensitive
secret) tokens all over the place... Ideally I'd like to implement
sender constraints for the Bearer token, to *prevent* copy-pasting (or,
you know, outright theft). But I'm not sure that sender constraints are
well-implemented yet for the major providers.

> 3.     HBA: An additional param ‘provider’ is added for the oauth method.
>         Defining "oauth" as method + passing provider, issuer endpoint
> and expected audience
>
>         * * * * oauth   provider=<token validation extension>
> issuer=.... scope=....

Naming aside (this conflicts with Samay's previous proposal, I think), I
have concerns about the implementation. There's this code:

> +        if (oauth_provider && oauth_provider->name)
> +        {
> +            ereport(ERROR,
> +                (errmsg("OAuth provider \"%s\" is already loaded.",
> +                    oauth_provider->name)));
> +        }

which appears to prevent loading more than one global provider. But
there's also code that deals with a provider list? (Again, it'd help to
have test code covering the new stuff.)

>            b)      libpq optionally compiled for the clients which
> explicitly need libpq to orchestrate OAuth communication with the
> issuer (it depends heavily on 3rd party library iddawc as Jacob
> already pointed out. The library seems to be supporting all the OAuth
> flows.)

Speaking of iddawc, I don't think it's a dependency we should choose to
rely on. For all the code that it has, it doesn't seem to provide
compatibility with several real-world providers.

Google, for one, chose not to follow the IETF spec it helped author, and
iddawc doesn't support its flavor of Device Authorization. At another
point, I think iddawc tried to decode Azure's Bearer tokens, which is
incorrect...

I haven't been able to check if those problems have been fixed in a
recent version, but if we're going to tie ourselves to a huge
dependency, I'd at least like to believe that said dependency is
battle-tested and solid, and personally I don't feel like iddawc is.

> -    auth_method = I_TOKEN_AUTH_METHOD_NONE;
> -    if (conn->oauth_client_secret && *conn->oauth_client_secret)
> -        auth_method = I_TOKEN_AUTH_METHOD_SECRET_BASIC;

This code got moved, but I'm not sure why? It doesn't appear to have
made a change to the logic.

> +    if (conn->oauth_client_secret && *conn->oauth_client_secret)
> +    {
> +        session_response_type = I_RESPONSE_TYPE_CLIENT_CREDENTIALS;
> +    }

Is this an Azure-specific requirement? Ideally a public client (which
psql is) shouldn't have to provide a secret to begin with, if I
understand that bit of the protocol correctly. I think Google also
required provider-specific changes in this part of the code, and
unfortunately I don't think they looked the same as yours.

We'll have to figure all that out... Standards are great; everyone has
one of their own. :)

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Sep 20, 2022 at 4:19 PM Jacob Champion <jchampion@timescale.com> wrote:
> > 2.     Add support to pass on the OAuth bearer token. In this
> > obtaining the bearer token is left to 3rd party application or user.
> >
> >         ./psql -U <username> -d 'dbname=postgres oauth_client_id=<client_id> oauth_bearer_token=<token>'
>
> This hurts, but I think people are definitely going to ask for it, given
> the frightening practice of copy-pasting these (incredibly sensitive
> secret) tokens all over the place...

After some further thought -- in this case, you already have an opaque
Bearer token (and therefore you already know, out of band, which
provider needs to be used), you're willing to copy-paste it from
whatever service you got it from, and you have an extension plugged
into Postgres on the backend that verifies this Bearer blob using some
procedure that Postgres knows nothing about.

Why do you need the OAUTHBEARER mechanism logic at that point? Isn't
that identical to a custom password scheme? It seems like that could
be handled completely by Samay's pluggable auth proposal.

--Jacob



RE: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovskiy
Date:
We can support both passing the token from an upstream client and libpq implementing the OAUTH2 protocol to obtain one.

Libpq implementing OAUTHBEARER is needed for community/3rd-party tools to have a user-friendly authentication experience:
1. For community client tools, like pg_admin, psql etc.
   Example experience: pg_admin would be able to open a popup dialog to authenticate the customer and keep a refresh token to avoid asking the user frequently.
2. For 3rd-party connectors supporting generic OAUTH with any provider. Useful for datawiz clients, like Tableau or ETL tools. Those can support both user and client OAUTH flows.

Libpq passing the token directly from an upstream client is useful in other scenarios:
1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for AAD. Those can also support more advanced provider-specific token acquisition flows.
2. Resource-tight (like IoT) clients. Those can be compiled without the optional libpq flag, not including the iddawc or other dependency.

Thanks!
Andrey.




Re: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Sep 21, 2022 at 3:10 PM Andrey Chudnovskiy
<Andrey.Chudnovskiy@microsoft.com> wrote:
> We can support both passing the token from an upstream client and libpq implementing the OAUTH2 protocol to obtain one.

Right, I agree that we could potentially do both.

> Libpq passing the token directly from an upstream client is useful in other scenarios:
> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for AAD. Those can also support more advanced provider-specific token acquisition flows.
> 2. Resource-tight (like IoT) clients. Those can be compiled without the optional libpq flag, not including the iddawc or other dependency.

What I don't understand is how the OAUTHBEARER mechanism helps you in
this case. You're short-circuiting the negotiation where the server
tells the client what provider to use and what scopes to request, and
instead you're saying "here's a secret string, just take it and
validate it with magic."

I realize the ability to pass an opaque token may be useful, but from
the server's perspective, I don't see what differentiates it from the
password auth method plus a custom authenticator plugin. Why pay for
the additional complexity of OAUTHBEARER if you're not going to use
it?

--Jacob



Re: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
First, my message from my corp email wasn't displayed in the thread;
that is what Jacob replied to. Let me post it here for context:

> We can support both passing the token from an upstream client and libpq implementing the OAUTH2 protocol to obtain one.
>
> Libpq implementing OAUTHBEARER is needed for community/3rd-party tools to have a user-friendly authentication experience:
>
> 1. For community client tools, like pg_admin, psql etc.
>    Example experience: pg_admin would be able to open a popup dialog to authenticate customers and keep refresh tokens to avoid asking the user frequently.
> 2. For 3rd-party connectors supporting generic OAUTH with any provider. Useful for datawiz clients, like Tableau or ETL tools. Those can support both user and client OAUTH flows.
>
> Libpq passing the token directly from an upstream client is useful in other scenarios:
> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for AAD. Those can also support more advanced provider-specific token acquisition flows.
> 2. Resource-tight (like IoT) clients. Those can be compiled without the optional libpq flag, not including the iddawc or other dependency.

-----------------------------------------------------------------------------------------------------
On this:

> What I don't understand is how the OAUTHBEARER mechanism helps you in
> this case. You're short-circuiting the negotiation where the server
> tells the client what provider to use and what scopes to request, and
> instead you're saying "here's a secret string, just take it and
> validate it with magic."
>
> I realize the ability to pass an opaque token may be useful, but from
> the server's perspective, I don't see what differentiates it from the
> password auth method plus a custom authenticator plugin. Why pay for
> the additional complexity of OAUTHBEARER if you're not going to use
> it?

Yes, passing a token as a new auth method won't make much sense in
isolation. However:
1. Since OAUTHBEARER is supported in the ecosystem, passing a token as
a way to authenticate with OAUTHBEARER is more consistent (IMO) than
passing it as a password.
2. Validation on the backend side doesn't depend on whether the token
is obtained by libpq or transparently passed by the upstream client.
3. A single OAUTH auth method on the server side for both scenarios
would allow both enterprise clients with their own token acquisition
and community clients using libpq flows to connect as the same PG
users/roles.




Re: [EXTERNAL] Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 9/21/22 21:55, Andrey Chudnovsky wrote:
> First, My message from corp email wasn't displayed in the thread,

I see it on the public archives [1]. Your client is choosing some pretty
confusing quoting tactics, though, which you may want to adjust. :D

I have what I'll call some "skeptical curiosity" here -- you don't need
to defend your use cases to me by any means, but I'd love to understand
more about them.

> Yes, passing a token as a new auth method won't make much sense in
> isolation. However:
> 1. Since OAUTHBEARER is supported in the ecosystem, passing a token as
> a way to authenticate with OAUTHBEARER is more consistent (IMO) than
> passing it as a password.

Agreed. It's probably not a very strong argument for the new mechanism,
though, especially if you're not using the most expensive code inside it.

> 2. Validation on the backend side doesn't depend on whether the token
> is obtained by libpq or transparently passed by the upstream client.

Sure.

> 3. A single OAUTH auth method on the server side for both scenarios
> would allow both enterprise clients with their own Token acquisition
> and community clients using libpq flows to connect as the same PG
> users/roles.

Okay, this is a stronger argument. With that in mind, I want to revisit
your examples and maybe provide some counterproposals:

>> Libpq passing the token directly from an upstream client is useful in other scenarios:
>> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for AAD. Those can also support more advanced provider-specific token acquisition flows.

I can see that providing a token directly would help you work around
limitations in libpq's "standard" OAuth flows, whether we use iddawc or
not. And it's cheap in terms of implementation. But I have a feeling it
would fall apart rapidly with error cases, where the server is giving
libpq information via the OAUTHBEARER mechanism, but libpq can only
communicate to your wrapper through human-readable error messages on stderr.

This seems like clear motivation for client-side SASL plugins (which
were also discussed on Samay's proposal thread). That's a lot more
expensive to implement in libpq, but if it were hypothetically
available, wouldn't you rather your provider-specific code be able to
speak OAUTHBEARER directly with the server?

>> 2. Resource-tight (like IoT) clients. Those can be compiled without the optional libpq flag, not including the iddawc or other dependency.

I want to dig into this much more; resource-constrained systems are near
and dear to me. I can see two cases here:

Case 1: The device is an IoT client that wants to connect on its own
behalf. Why would you want to use OAuth in that case? And how would the
IoT device get its Bearer token to begin with? I'm much more used to
architectures that provision high-entropy secrets for this, whether
they're incredibly long passwords per device (in which case,
channel-bound SCRAM should be a fairly strong choice?) or client certs
(which can be better decentralized, but make for a lot of bookkeeping).

If the answer to that is, "we want an IoT client to be able to connect
using the same role as a person", then I think that illustrates a clear
need for SASL negotiation. That would let the IoT client choose
SCRAM-*-PLUS or EXTERNAL, and the person at the keyboard can choose
OAUTHBEARER. Then we have incredible flexibility, because you don't have
to engineer one mechanism to handle them all.

Case 2: The constrained device is being used as a jump point. So there's
an actual person at a keyboard, trying to get into a backend server
(maybe behind a firewall layer, etc.), and the middlebox is either not
web-connected or is incredibly tiny for some reason. That might be a
good use case for a copy-pasted Bearer token, but is there actual demand
for that use case? What motivation would you (or your end user) have for
choosing a fairly heavy, web-centric authentication method in such a
constrained environment?

Are there other resource-constrained use cases I've missed?

Thanks,
--Jacob

[1]
https://www.postgresql.org/message-id/MN0PR21MB31694BAC193ECE1807FD45358F4F9%40MN0PR21MB3169.namprd21.prod.outlook.com




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Mar 25, 2022 at 5:00 PM Jacob Champion <pchampion@vmware.com> wrote:
> v4 rebases over the latest version of the pluggable auth patchset
> (included as 0001-4). Note that there's a recent conflict as
> of d4781d887; use an older commit as the base (or wait for the other
> thread to be updated).

Here's a newly rebased v5. (They're all zipped now, which I probably
should have done a while back, sorry.)

- As before, 0001-4 are the pluggable auth set; they've now diverged
from the official version over on the other thread [1].
- I'm not sure that 0005 is still completely coherent after the
rebase, given the recent changes to jsonapi.c. But for now, the tests
are green, and that should be enough to keep the conversation going.
- 0008 will hopefully be obsoleted when the SYSTEM_USER proposal [2] lands.

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/CAJxrbyxgFzfqby%2BVRCkeAhJnwVZE50%2BZLPx0JT2TDg9LbZtkCg%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/7e692b8c-0b11-45db-1cad-3afc5b57409f@amazon.com

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
>>> Libpq passing a token directly from an upstream client is useful in other scenarios:
>>> 1. Enterprise clients, built with .Net / Java and using provider-specific authentication libraries, like MSAL for
>>> AAD. Those can also support more advanced provider-specific token acquisition flows.

> I can see that providing a token directly would help you work around
> limitations in libpq's "standard" OAuth flows, whether we use iddawc or
> not. And it's cheap in terms of implementation. But I have a feeling it
> would fall apart rapidly with error cases, where the server is giving
> libpq information via the OAUTHBEARER mechanism, but libpq can only
> communicate to your wrapper through human-readable error messages on stderr.

For providing the token directly, that would be primarily used for
scenarios where the same party controls both the server and the
client-side wrapper.
I.e. the client knows how to get a token for a particular principal
and doesn't need any additional information other than human-readable
messages.
Please clarify the scenarios where you see this falling apart.

I can provide an example from the cloud world. We (Azure) as well as
other providers offer ways to obtain OAUTH tokens for
service-to-service communication at the IaaS / PaaS level.
On Azure, the "Managed Identity" feature integrated into the Compute VM allows a
client to make a local HTTP call to get a token. The VM itself manages the
certificate lifecycle, as well as implements the corresponding OAUTH
flow.
This capability is used by both our 1st party PAAS offerings, as well
as 3rd party services deploying on VMs or managed K8S clusters.
Here, the client doesn't need libpq assistance in obtaining the token.
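As a concrete sketch of that local call, the snippet below builds the kind of request a Managed Identity-aware client issues to Azure's instance metadata service. The endpoint and header follow Azure's documented IMDS pattern, but treat the details (API version, resource URI) as illustrative; a real client would perform the HTTP GET and read `access_token` from the JSON response.

```python
from urllib.parse import urlencode

# Link-local, non-routable address served by the Azure VM fabric.
IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"

def build_imds_token_request(resource, api_version="2018-02-01"):
    """Build the local IMDS request used to obtain a bearer token
    for the VM's managed identity."""
    query = urlencode({"api-version": api_version, "resource": resource})
    url = f"{IMDS_TOKEN_ENDPOINT}?{query}"
    # The Metadata header guards against forwarded (SSRF-style) requests.
    headers = {"Metadata": "true"}
    return url, headers

# Hypothetical resource URI for a managed Postgres service.
url, headers = build_imds_token_request("https://postgres.example.com")
```

The point for libpq is that the client obtains the token entirely out of band, so all libpq would need is a way to accept it.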

> This seems like clear motivation for client-side SASL plugins (which
> were also discussed on Samay's proposal thread). That's a lot more
> expensive to implement in libpq, but if it were hypothetically
> available, wouldn't you rather your provider-specific code be able to
> speak OAUTHBEARER directly with the server?

I generally agree that pluggable auth layers in libpq could be
beneficial. However, as you pointed out in Samay's thread, that would
require a new distribution model for libpq / clients to optionally
include provider-specific logic.

My optimistic plan here would be to implement several core OAUTH flows
in libpq core which would be generic enough to support major
enterprise OAUTH providers:
1. Client Credentials flow (Client_id + Client_secret) for backend applications.
2. Authorization Code Flow with PKCE and/or Device code flow for GUI
applications.

(2) above would require a protocol between libpq and upstream clients
to exchange several messages.
Your patch includes a way for libpq to deliver a message to the client
about the next authentication steps, so we planned to build on top of
that.
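For reference, the device code flow mentioned in (2) boils down to two HTTP requests, sketched below as pure request builders. The parameter names follow RFC 8628; the endpoint URLs a provider exposes (discovered or configured) are assumed, not shown.

```python
from urllib.parse import urlencode

def build_device_authorization_request(client_id, scope):
    # Step 1: POST this body to the provider's device authorization
    # endpoint; the response carries device_code, user_code, and a
    # verification URI to show to the user.
    return urlencode({"client_id": client_id, "scope": scope})

def build_device_token_request(client_id, device_code):
    # Step 2: poll the token endpoint with this body until the user
    # has approved the request in a browser.
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
        "device_code": device_code,
        "client_id": client_id,
    })

body = build_device_token_request("f02c6361-0635", "GmRhmhcxhwAzkoEqiMEg")
```

The two pieces of metadata libpq and the upstream client would need to exchange (user code and verification URI) come out of step 1.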

A little about the scenarios we are looking at.
What we're trying to achieve here is an easy integration path for
multiple players in the ecosystem:
- Managed PaaS Postgres providers (both us and multi-cloud solutions)
- SaaS providers deploying Postgres on IaaS/PaaS providers' clouds
- Tools: pgAdmin, psql, and others
- BI, ETL, federation, and other scenarios where Postgres is used as
the data source

If we can offer a provider-agnostic solution for the Backend <=> libpq <=>
Upstream client path, we can have all the players above build support for
OAUTH credentials managed by the cloud provider of their choice.

For us, that would mean:
- Better administrator experience with pgAdmin / psql handling of the
AAD (Azure Active Directory) authentication flows.
- A path for integration solutions using Postgres to build AAD
authentication into their management experience.
- The ability to use the AAD identity provider for any Postgres deployments
other than our 1st-party PaaS offering.
- The ability to offer GitHub as the identity provider for the PaaS Postgres offering.

Other players in the ecosystem above would be able to get the same benefits.

Does that make sense and possible without provider specific libpq plugin?

-------------------------
On resource-constrained scenarios:
> I want to dig into this much more; resource-constrained systems are near
> and dear to me. I can see two cases here:

I just referred to the ability to compile libpq without extra
dependencies to save some kilobytes.
I'm not sure OAUTH is widely used in those cases. It involves overhead
anyway, and requires the device to talk to an additional party (the OAUTH
provider).
Cert authentication is likely easier.
If needed, such a device can get libpq with full OAUTH support and use client
code. But I didn't think about this scenario.

On Fri, Sep 23, 2022 at 3:39 PM Jacob Champion <jchampion@timescale.com> wrote:



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Sep 26, 2022 at 6:39 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> For providing the token directly, that would be primarily used for
> scenarios where the same party controls both the server and the
> client-side wrapper.
> I.e. the client knows how to get a token for a particular principal
> and doesn't need any additional information other than human-readable
> messages.
> Please clarify the scenarios where you see this falling apart.

The most concrete example I can see is with the OAUTHBEARER error
response. If you want to eventually handle differing scopes per role,
or different error statuses (which the proof-of-concept currently
hardcodes as `invalid_token`), then the client can't assume it knows
what the server is going to say there. I think that's true even if you
control both sides and are hardcoding the provider.

How should we communicate those pieces to a custom client when it's
passing a token directly? The easiest way I can see is for the custom
client to speak the OAUTHBEARER protocol directly (e.g. SASL plugin).
If you had to parse the libpq error message, I don't think that'd be
particularly maintainable.
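The OAUTHBEARER exchange such a plugin would speak is small. A rough sketch, following RFC 7628's framing (the error fields shown are the ones a server may include; which ones Postgres would actually send is exactly what's under discussion):

```python
import json

def oauthbearer_client_initial_response(token, authzid=""):
    """Build the RFC 7628 initial client response. \x01 is the
    key/value separator mandated by the OAUTHBEARER framing."""
    gs2 = f"n,a={authzid}," if authzid else "n,,"
    return f"{gs2}\x01auth=Bearer {token}\x01\x01".encode()

def parse_oauthbearer_error(server_challenge):
    """On failure the server sends a JSON document; 'status' and
    'scope' are the fields a client would branch on."""
    doc = json.loads(server_challenge)
    return doc.get("status"), doc.get("scope")

status, scope = parse_oauthbearer_error(
    b'{"status":"invalid_token","scope":"openid"}')
```

A plugin that speaks this directly gets the status and scope as structured data, instead of scraping them out of a libpq error string.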

> I can provide an example from the cloud world. We (Azure) as well as
> other providers offer ways to obtain OAUTH tokens for
> service-to-service communication at the IaaS / PaaS level.
> On Azure, the "Managed Identity" feature integrated into the Compute VM allows a
> client to make a local HTTP call to get a token. The VM itself manages the
> certificate lifecycle, as well as implements the corresponding OAUTH
> flow.
> This capability is used by both our 1st party PAAS offerings, as well
> as 3rd party services deploying on VMs or managed K8S clusters.
> Here, the client doesn't need libpq assistance in obtaining the token.

Cool. To me that's the strongest argument yet for directly providing
tokens to libpq.

> My optimistic plan here would be to implement several core OAUTH flows
> in libpq core which would be generic enough to support major
> enterprise OAUTH providers:
> 1. Client Credentials flow (Client_id + Client_secret) for backend applications.
> 2. Authorization Code Flow with PKCE and/or Device code flow for GUI
> applications.

As long as it's clear to DBAs when to use which flow (because existing
documentation for that is hit-and-miss), I think it's reasonable to
eventually support multiple flows. Personally my preference would be
to start with one or two core flows, and expand outward once we're
sure that we do those perfectly. Otherwise the explosion of knobs and
buttons might be overwhelming, both to users and devs.

Related to the question of flows is the client implementation library.
I've mentioned that I don't think iddawc is production-ready. As far
as I'm aware, there is only one certified OpenID relying party written
in C, and that's... an Apache server plugin. That leaves us either
choosing an untested library, scouring the web for a "tested" library
(and hoping we're right in our assessment), or implementing our own
(which is going to tamp down enthusiasm for supporting many flows,
though that has its own set of benefits). If you know of any reliable
implementations with a C API, please let me know.

> (2) above would require a protocol between libpq and upstream clients
> to exchange several messages.
> Your patch includes a way for libpq to deliver a message to the client
> about the next authentication steps, so we planned to build on top of
> that.

Specifically it delivers that message to an end user. If you want a
generic machine client to be able to use that, then we'll need to talk
about how.

> A little about the scenarios we are looking at.
> What we're trying to achieve here is an easy integration path for
> multiple players in the ecosystem:
> - Managed PaaS Postgres providers (both us and multi-cloud solutions)
> - SaaS providers deploying Postgres on IaaS/PaaS providers' clouds
> - Tools: pgAdmin, psql, and others
> - BI, ETL, federation, and other scenarios where Postgres is used as
> the data source
>
> If we can offer a provider-agnostic solution for the Backend <=> libpq <=>
> Upstream client path, we can have all the players above build support for
> OAUTH credentials managed by the cloud provider of their choice.

Well... I don't quite understand why we'd go to the trouble of
providing a provider-agnostic communication solution only to have
everyone write their own provider-specific client support. Unless
you're saying Microsoft would provide an officially blessed plugin for
the *server* side only, and Google would provide one of their own, and
so on.

The server side authorization is the only place where I think it makes
sense to specialize by default. libpq should remain agnostic, with the
understanding that we'll need to make hard decisions when a major
provider decides not to follow a spec.

> For us, that would mean:
> - Better administrator experience with pgAdmin / psql handling of the
> AAD (Azure Active Directory) authentication flows.
> - A path for integration solutions using Postgres to build AAD
> authentication into their management experience.
> - The ability to use the AAD identity provider for any Postgres deployments
> other than our 1st-party PaaS offering.
> - The ability to offer GitHub as the identity provider for the PaaS Postgres offering.

GitHub is unfortunately a bit tricky, unless they've started
supporting OpenID recently?

> Other players in the ecosystem above would be able to get the same benefits.
>
> Does that make sense and possible without provider specific libpq plugin?

If the players involved implement the flows and follow the specs, yes.
That's a big "if", unfortunately. I think GitHub and Google are two
major players who are currently doing things their own way.

> I just referred to the ability to compile libpq without extra
> dependencies to save some kilobytes.
> I'm not sure OAUTH is widely used in those cases. It involves overhead
> anyway, and requires the device to talk to an additional party (the OAUTH
> provider).
> Cert authentication is likely easier.
> If needed, such a device can get libpq with full OAUTH support and use client
> code. But I didn't think about this scenario.

Makes sense. Thanks!

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> The most concrete example I can see is with the OAUTHBEARER error
> response. If you want to eventually handle differing scopes per role,
> or different error statuses (which the proof-of-concept currently
> hardcodes as `invalid_token`), then the client can't assume it knows
> what the server is going to say there. I think that's true even if you
> control both sides and are hardcoding the provider.

Ok, I see the point. It's related to the topic of communication
between libpq and the upstream client.


> How should we communicate those pieces to a custom client when it's
> passing a token directly? The easiest way I can see is for the custom
> client to speak the OAUTHBEARER protocol directly (e.g. SASL plugin).
> If you had to parse the libpq error message, I don't think that'd be
> particularly maintainable.

I agree that parsing the message is not a sustainable way.
Could you provide more details on the SASL plugin approach you propose?

Specifically, is this basically a set of extension hooks for the client
side, with the client needing to be compiled with plugins for the set of
providers it needs?


> Well... I don't quite understand why we'd go to the trouble of
> providing a provider-agnostic communication solution only to have
> everyone write their own provider-specific client support. Unless
> you're saying Microsoft would provide an officially blessed plugin for
> the *server* side only, and Google would provide one of their own, and
> so on.

Yes, via extensions. Identity providers can open-source extensions to
use their auth services outside of first-party PaaS offerings,
for 3rd-party Postgres PaaS or on-premises deployments.


> The server side authorization is the only place where I think it makes
> sense to specialize by default. libpq should remain agnostic, with the
> understanding that we'll need to make hard decisions when a major
> provider decides not to follow a spec.

Completely agree with an agnostic libpq, though this needs validation with
several major providers to know whether it's possible.


> Specifically it delivers that message to an end user. If you want a
> generic machine client to be able to use that, then we'll need to talk
> about how.

Yes, that's what needs to be decided.
In both Device code and Authorization code scenarios, libpq and the
client would need to exchange a couple of pieces of metadata.
Plus, after success, the client should be able to access a refresh token for further use.

Can we implement a generic protocol for this between libpq and the clients?

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Sep 30, 2022 at 7:47 AM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> > How should we communicate those pieces to a custom client when it's
> > passing a token directly? The easiest way I can see is for the custom
> > client to speak the OAUTHBEARER protocol directly (e.g. SASL plugin).
> > If you had to parse the libpq error message, I don't think that'd be
> > particularly maintainable.
>
> I agree that parsing the message is not a sustainable way.
> Could you provide more details on the SASL plugin approach you propose?
>
> Specifically, is this basically a set of extension hooks for the client
> side, with the client needing to be compiled with plugins for the set of
> providers it needs?

That's a good question. I can see two broad approaches, with maybe
some ability to combine them into a hybrid:

1. If there turns out to be serious interest in having libpq itself
handle OAuth natively (with all of the web-facing code that implies,
and all of the questions still left to answer), then we might be able
to provide a "token hook" in the same way that we currently provide a
passphrase hook for OpenSSL keys. By default, libpq would use its
internal machinery to take the provider details, navigate its builtin
flow, and return the Bearer token. If you wanted to override that
behavior as a client, you could replace the builtin flow with your
own, by registering a set of callbacks.

2. Alternatively, OAuth support could be provided via a mechanism
plugin for some third-party SASL library (GNU libgsasl, Cyrus
libsasl2). We could provide an OAuth plugin in contrib that handles
the default flow. Other providers could publish their alternative
plugins to completely replace the OAUTHBEARER mechanism handling.

Approach (2) would make for some duplicated effort since every
provider has to write code to speak the OAUTHBEARER protocol. It might
simplify provider-specific distribution, since (at least for Cyrus) I
think you could build a single plugin that supports both the client
and server side. But it would be a lot easier to unknowingly (or
knowingly) break the spec, since you'd control both the client and
server sides. There would be less incentive to interoperate.

Finally, we could potentially take pieces from both, by having an
official OAuth mechanism plugin that provides a client-side hook to
override the flow. I have no idea if the benefits would offset the
costs of a plugin-for-a-plugin style architecture. And providers would
still be free to ignore it and just provide a full mechanism plugin
anyway.
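To make approach (1) concrete, here is a minimal sketch of the token-hook shape, in the style of the existing OpenSSL passphrase hook. All names here are invented for illustration; none of this is an existing libpq API.

```python
# Hypothetical sketch of a client-side token hook; every name is invented.
_token_hook = None

def register_token_hook(hook):
    """Client code installs a callback mapping provider details
    (issuer, scope) to a bearer token, overriding the builtin flow."""
    global _token_hook
    _token_hook = hook

def acquire_bearer_token(issuer, scope):
    """What the connection code would call during OAUTHBEARER."""
    if _token_hook is not None:
        return _token_hook(issuer, scope)
    # Otherwise libpq's builtin flow (device code, etc.) would run here.
    raise NotImplementedError("builtin flow not shown")

# A provider-specific client (e.g. one using Managed Identity) plugs in
# its own acquisition logic:
register_token_hook(lambda issuer, scope: "token-from-managed-identity")
token = acquire_bearer_token("https://issuer.example.com", "openid")
```

The key property is that the hook receives the server-supplied provider details and returns only a token, so the OAUTHBEARER framing itself stays inside libpq.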

> > Well... I don't quite understand why we'd go to the trouble of
> > providing a provider-agnostic communication solution only to have
> > everyone write their own provider-specific client support. Unless
> > you're saying Microsoft would provide an officially blessed plugin for
> > the *server* side only, and Google would provide one of their own, and
> > so on.
>
> Yes, via extensions. Identity providers can open-source extensions to
> use their auth services outside of first-party PaaS offerings,
> for 3rd-party Postgres PaaS or on-premises deployments.

Sounds reasonable.

> > The server side authorization is the only place where I think it makes
> > sense to specialize by default. libpq should remain agnostic, with the
> > understanding that we'll need to make hard decisions when a major
> > provider decides not to follow a spec.
>
> Completely agree with an agnostic libpq, though this needs validation with
> several major providers to know whether it's possible.

Agreed.

> > Specifically it delivers that message to an end user. If you want a
> > generic machine client to be able to use that, then we'll need to talk
> > about how.
>
> Yes, that's what needs to be decided.
> In both Device code and Authorization code scenarios, libpq and the
> client would need to exchange a couple of pieces of metadata.
> Plus, after success, the client should be able to access a refresh token for further use.
>
> Can we implement a generic protocol for this between libpq and the clients?

I think we can probably prototype a callback hook for approach (1)
pretty quickly. (2) is a lot more work and investigation, but it's
work that I'm interested in doing (when I get the time). I think there
are other very good reasons to consider a third-party SASL library,
and some good lessons to be learned, even if the community decides not
to go down that road.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> I think we can probably prototype a callback hook for approach (1)
> pretty quickly. (2) is a lot more work and investigation, but it's
> work that I'm interested in doing (when I get the time). I think there
> are other very good reasons to consider a third-party SASL library,
> and some good lessons to be learned, even if the community decides not
> to go down that road.

Makes sense. We will work on (1) and check whether there are any
blockers for a shared solution to support GitHub and Google.

On Fri, Sep 30, 2022 at 1:45 PM Jacob Champion <jchampion@timescale.com> wrote:



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi,


We validated libpq handling OAuth natively, with different flows and
different OIDC-certified providers.

Flows: Device Code, Client Credentials, and Refresh Token.
Providers: Microsoft, Google, and Okta.
We also validated with the OAuth provider GitHub.

We propose using OpenID Connect (OIDC) as the protocol, instead of
plain OAuth, because it provides:
- A discovery mechanism to bridge provider differences and supply metadata.
- A stricter protocol and certification process, to reliably identify
which providers can be supported.
- A focus on authentication, while the main purpose of OAUTH is to
authorize applications on behalf of the user.

GitHub is not OIDC-certified, so it won't be supported with this proposal.
However, it may be supported in the future through the ability for the
extension to provide custom discovery document content.

OpenID Connect defines a well-known discovery mechanism for the
provider configuration URI. It allows libpq to fetch metadata about the
provider (i.e. endpoints, supported grants, response types, etc.).
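The discovery step can be sketched as follows. The issuer URL and metadata values are illustrative; a real client would fetch the JSON document over HTTPS from the well-known path.

```python
import json

def discovery_url(issuer):
    # OIDC Discovery: provider metadata lives at a well-known path
    # directly under the issuer URL.
    return issuer.rstrip("/") + "/.well-known/openid-configuration"

def extract_endpoints(metadata_json):
    """Pull out the fields libpq would need to drive its flows."""
    doc = json.loads(metadata_json)
    return {
        "device": doc.get("device_authorization_endpoint"),
        "token": doc.get("token_endpoint"),
        "grants": doc.get("grant_types_supported", []),
    }

# Sample metadata, as a certified provider might publish it:
sample = '''{
  "token_endpoint": "https://issuer.example.com/oauth2/token",
  "device_authorization_endpoint": "https://issuer.example.com/oauth2/devicecode",
  "grant_types_supported": ["authorization_code",
                            "urn:ietf:params:oauth:grant-type:device_code"]
}'''
eps = extract_endpoints(sample)
```

Checking `grant_types_supported` against the requested flow_type is also how libpq could fail early when a provider doesn't support the configured flow.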

In the attached patch (based on the V2 patch in the thread; it does not
contain Samay's changes):
- The provider can configure the issuer URL and scope through the options hook.
- The server passes an open discovery URL and scope on to libpq.
- Libpq handles the OAuth flow based on the flow_type sent in the
connection string [1].
- Added callbacks to notify client tools with a structure if the OAuth flow
requires user interaction.
- The PG backend uses hooks to validate the bearer token.

Note that the Authorization Code flow with PKCE for GUI clients is not
implemented yet.

Proposed next steps:
- Broaden discussion to reach agreement on the approach.
- Implement libpq changes without iddawc
- Prototype GUI flow with pgAdmin

Thanks,
Mahendrakar.

[1]:
connection string for refresh token flow:
./psql -U <user> -d 'dbname=postgres oauth_client_id=<client_id>
oauth_flow_type=<flowtype>  oauth_refresh_token=<refresh token>'

On Mon, 3 Oct 2022 at 23:34, Andrey Chudnovsky <achudnovskij@gmail.com> wrote:

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 11/23/22 01:58, mahendrakar s wrote:
> We validated libpq handling OAuth natively with different flows
> with different OIDC certified providers.
> 
> Flows: Device Code, Client Credentials and Refresh Token.
> Providers: Microsoft, Google and Okta.

Great, thank you!

> Also validated with OAuth provider Github.

(How did you get discovery working? I tried this and had to give up
eventually.)

> We propose using OpenID Connect (OIDC) as the protocol, instead of
> OAuth, as it is:
> - Discovery mechanism to bridge the differences and provide metadata.
> - Stricter protocol and certification process to reliably identify
> which providers can be supported.
> - OIDC is designed for authentication, while the main purpose of OAUTH is to
> authorize applications on behalf of the user.

How does this differ from the previous proposal? The OAUTHBEARER SASL
mechanism already relies on OIDC for discovery. (I think that decision
is confusing from an architectural and naming standpoint, but I don't
think they really had an alternative...)

> Github is not OIDC certified, so won’t be supported with this proposal.
> However, it may be supported in the future through the ability for the
> extension to provide custom discovery document content.

Right.

> OpenID configuration has a well-known discovery mechanism
> for the provider configuration URI which is
> defined in OpenID Connect. It allows libpq to fetch
> metadata about provider (i.e endpoints, supported grants, response types, etc).

Sure, but this is already how the original PoC works. The test suite
implements an OIDC provider, for instance. Is there something different
to this that I'm missing?

> In the attached patch (based on V2 patch in the thread and does not
> contain Samay's changes):
> - Provider can configure issuer url and scope through the options hook.
> - Server passes on an open discovery url and scope to libpq.
> - Libpq handles OAuth flow based on the flow_type sent in the
> connection string [1].
> - Added callbacks to notify a structure to client tools if OAuth flow
> requires user interaction.
> - Pg backend uses hooks to validate bearer token.

Thank you for the sample!

> Note that authorization code flow with PKCE for GUI clients is not
> implemented yet.
> 
> Proposed next steps:
> - Broaden discussion to reach agreement on the approach.

High-level thoughts on this particular patch (I assume you're not
looking for low-level implementation comments yet):

0) The original hook proposal upthread, I thought, was about allowing
libpq's flow implementation to be switched out by the application. I
don't see that approach taken here. It's fine if that turned out to be a
bad idea, of course, but this patch doesn't seem to match what we were
talking about.

1) I'm really concerned about the sudden explosion of flows. We went
from one flow (Device Authorization) to six. It's going to be hard
enough to validate that *one* flow is useful and can be securely
deployed by end users; I don't think we're going to be able to maintain
six, especially in combination with my statement that iddawc is not an
appropriate dependency for us.

I'd much rather give applications the ability to use their own OAuth
code, and then maintain within libpq only the flows that are broadly
useful. This ties back to (0) above.

2) Breaking the refresh token into its own pseudoflow is, I think,
passing the buck onto the user for something that's incredibly security
sensitive. The refresh token is powerful; I don't really want it to be
printed anywhere, let alone copy-pasted by the user. Imagine the
phishing opportunities.

If we want to support refresh tokens, I believe we should be developing
a plan to cache and secure them within the client. They should be used
as an accelerator for other flows, not as their own flow.

3) I don't like the departure from the OAUTHBEARER mechanism that's
presented here. For one, since I can't see a sample plugin that makes
use of the "flow type" magic numbers that have been added, I don't
really understand why the extension to the mechanism is necessary.

For two, if we think OAUTHBEARER is insufficient, the people who wrote
it would probably like to hear about it. Claiming support for a spec,
and then implementing an extension without review from the people who
wrote the spec, is not something I'm personally interested in doing.

4) The test suite is still broken, so it's difficult to see these things
in practice for review purposes.

> - Implement libpq changes without iddawc

This in particular will be much easier with a functioning test suite,
and with a smaller number of flows.

> - Prototype GUI flow with pgAdmin

Cool!

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> How does this differ from the previous proposal? The OAUTHBEARER SASL
> mechanism already relies on OIDC for discovery. (I think that decision
> is confusing from an architectural and naming standpoint, but I don't
> think they really had an alternative...)
Mostly terminology questions here. OAUTHBEARER is the SASL mechanism
spec for authenticating with OAuth 2.0 tokens. While any OAuth 2.0
provider can generally work, we propose to state explicitly that only
OIDC providers can be supported, since we need the discovery document.
GitHub won't be supportable under that requirement. The original patch
relied on discovery as well, so this is no change in behavior, just
confirmation that we require OIDC compliance.

> 0) The original hook proposal upthread, I thought, was about allowing
> libpq's flow implementation to be switched out by the application. I
> don't see that approach taken here. It's fine if that turned out to be a
> bad idea, of course, but this patch doesn't seem to match what we were
> talking about.
We still plan to allow the client to pass in the token itself, which is
a generic way for it to implement its own OAuth flows.

> 1) I'm really concerned about the sudden explosion of flows. We went
> from one flow (Device Authorization) to six. It's going to be hard
> enough to validate that *one* flow is useful and can be securely
> deployed by end users; I don't think we're going to be able to maintain
> six, especially in combination with my statement that iddawc is not an
> appropriate dependency for us.

> I'd much rather give applications the ability to use their own OAuth
> code, and then maintain within libpq only the flows that are broadly
> useful. This ties back to (0) above.
We consider the following set of flows to be the minimum required:
- Client Credentials: for service-to-service scenarios.
- Authorization Code with PKCE: for rich clients, including pgAdmin.
- Device Code: for psql (and possibly other non-GUI clients).
- Refresh Token (separate discussion below).
This is pretty much the list described at
https://oauth.net/2/grant-types/ and in the OAuth 2.0 specs.
Client Credentials is very simple, and so is Refresh Token.
If you prefer to pick one of the richer flows, Authorization Code is
probably much more widely used for GUI scenarios. Plus it's easier to
implement, since the interaction goes through a series of callbacks; no
polling is required.
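For reference, the grant types above correspond to distinct
`grant_type` values at the token endpoint. A rough sketch of the
request bodies follows (the function names and all identifiers are
illustrative placeholders, not part of any patch):

```python
# Sketch: token-endpoint POST bodies for the grant types listed above.
# All client IDs, secrets, and codes here are made-up placeholders.

def client_credentials_request(client_id, client_secret, scope):
    # Service-to-service: the client authenticates as itself, no user.
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }

def device_code_request(client_id, device_code):
    # Device Authorization (RFC 8628): polled while the user enters the
    # code shown by psql at the verification URI.
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
        "client_id": client_id,
        "device_code": device_code,
    }

def refresh_token_request(client_id, refresh_token):
    # Exchanges a previously issued refresh token for a new access token.
    return {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "refresh_token": refresh_token,
    }
```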

> 2) Breaking the refresh token into its own pseudoflow is, I think,
> passing the buck onto the user for something that's incredibly security
> sensitive. The refresh token is powerful; I don't really want it to be
> printed anywhere, let alone copy-pasted by the user. Imagine the
> phishing opportunities.

> If we want to support refresh tokens, I believe we should be developing
> a plan to cache and secure them within the client. They should be used
> as an accelerator for other flows, not as their own flow.
It's considered a separate "grant_type" in the specs / APIs:
https://openid.net/specs/openid-connect-core-1_0.html#RefreshTokens

For the clients, it would mean storing the token and using it to
authenticate. On the question of sensitivity: secure credential stores
differ per platform, and there are plenty of cloud offerings for this.
pgAdmin, for example, has its own way to secure credentials so that it
can avoid asking users for passwords every time the app is opened.
I believe we should delegate refresh token management to the clients.

>3) I don't like the departure from the OAUTHBEARER mechanism that's
> presented here. For one, since I can't see a sample plugin that makes
> use of the "flow type" magic numbers that have been added, I don't
> really understand why the extension to the mechanism is necessary.
I don't think it's much of a departure, but rather a separation of
responsibilities between libpq and upstream clients. Since libpq can be
used by many different apps, those clients will need different types of
flows/grants, which have to be provided to libpq at connection
initialization or some other point.
We will change the parameter to "grant_type", though, and use a string
value to stay closer to the spec. What do you think is the best way for
the client to signal which OAuth flow should be used?

On Wed, Nov 23, 2022 at 12:05 PM Jacob Champion <jchampion@timescale.com> wrote:



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi Jacob,

I validated GitHub by skipping the discovery mechanism and letting the
provider extension pass on the endpoints directly. This was just for
validation purposes. If GitHub needs to be supported, we would need a
way for the extension to supply the discovery document content.
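For a provider without OIDC discovery, the extension would essentially
have to synthesize the discovery metadata itself. As a sketch, the
minimal document for GitHub might look like this (field names follow
OIDC Discovery; the URLs are GitHub's documented OAuth endpoints, and
nothing here reflects an actual patch):

```python
# Sketch: discovery metadata an extension could hand to libpq for a
# provider, like GitHub, that doesn't publish a discovery document.
github_discovery = {
    "issuer": "https://github.com",
    "authorization_endpoint": "https://github.com/login/oauth/authorize",
    "token_endpoint": "https://github.com/login/oauth/access_token",
    "device_authorization_endpoint": "https://github.com/login/device/code",
    "grant_types_supported": [
        "authorization_code",
        "urn:ietf:params:oauth:grant-type:device_code",
    ],
}
```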


Thanks,
Mahendrakar.

On Thu, 24 Nov 2022 at 09:16, Andrey Chudnovsky <achudnovskij@gmail.com> wrote:



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 11/23/22 19:45, Andrey Chudnovsky wrote:
> Mostly terminology questions here. OAUTHBEARER SASL appears to be the
> spec about using OAUTH2 tokens for Authentication.
> While any OAUTH2 can generally work, we propose to specifically
> highlight that only OIDC providers can be supported, as we need the
> discovery document.

*If* you're using in-band discovery, yes. But I thought your use case
was explicitly tailored to out-of-band token retrieval:

> The client knows how to get a token for a particular principal
> and doesn't need any additional information other than human readable
> messages.

In that case, isn't OAuth sufficient? There's definitely a need to
document the distinction, but I don't think we have to require OIDC as
long as the client application makes up for the missing information.
(OAUTHBEARER makes the openid-configuration error member optional,
presumably for this reason.)
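To illustrate: the server's OAUTHBEARER error result is a small JSON
document, and a client with out-of-band knowledge of its issuer can
simply ignore a missing discovery member. A sketch (the function name is
mine, not libpq's):

```python
# Sketch: parsing the OAUTHBEARER failure message (RFC 7628). The
# "openid-configuration" member is OPTIONAL, so a client with
# out-of-band issuer knowledge can proceed without it.
import json

def parse_oauthbearer_error(payload):
    doc = json.loads(payload)
    return {
        "status": doc.get("status"),          # e.g. "invalid_token"
        "scope": doc.get("scope"),            # scope the server expects
        "discovery": doc.get("openid-configuration"),  # may be absent
    }

# A server that omits discovery info, as allowed by the RFC:
err = parse_oauthbearer_error('{"status": "invalid_token", "scope": "openid"}')
# err["discovery"] is None; the client must know the issuer another way.
```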

>> 0) The original hook proposal upthread, I thought, was about allowing
>> libpq's flow implementation to be switched out by the application. I
>> don't see that approach taken here. It's fine if that turned out to be a
>> bad idea, of course, but this patch doesn't seem to match what we were
>> talking about.
> We still plan to allow the client to pass the token. Which is a
> generic way to implement its own OAUTH flows.

Okay. But why push down the implementation into the server?

To illustrate what I mean, here's the architecture of my proposed patchset:

  +-------+                                          +----------+
  |       | -------------- Empty Token ------------> |          |
  | libpq | <----- Error Result (w/ Discovery ) ---- |          |
  |       |                                          |          |
  | +--------+                     +--------------+  |          |
  | | iddawc | <--- [ Flow ] ----> | Issuer/      |  | Postgres |
  | |        | <-- Access Token -- | Authz Server |  |          |
  | +--------+                     +--------------+  |   +-----------+
  |       |                                          |   |           |
  |       | -------------- Access Token -----------> | > | Validator |
  |       | <---- Authorization Success/Failure ---- | < |           |
  |       |                                          |   +-----------+
  +-------+                                          +----------+

In this implementation, there's only one black box: the validator, which
is responsible for taking an access token from an untrusted client,
verifying that it was issued correctly for the Postgres service, and
either 1) determining whether the bearer is authorized to access the
database, or 2) determining the authenticated ID of the bearer so that
the HBA can decide whether they're authorized. (Or both.)
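The validator's contract can be sketched as a single function. This stub
is purely illustrative of the two outputs described above; a real
validator would verify the token cryptographically or via introspection
against the issuer, and none of these names come from the patchset:

```python
# Sketch: the server-side validator black box. fake_verify() stands in
# for real issuer-specific token verification.
def fake_verify(token):
    # Illustrative only: pretend every non-empty token belongs to alice.
    if not token:
        return None
    return {"sub": "alice@example.org", "roles": ["reader"]}

def validate_token(token, requested_role):
    claims = fake_verify(token)
    if claims is None:
        return {"authorized": False, "authn_id": None}
    return {
        # (2) authenticated ID, for the HBA's user mapping to decide on
        "authn_id": claims["sub"],
        # (1) direct authorization decision based on the token's claims
        "authorized": requested_role in claims.get("roles", []),
    }
```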

This approach is limited by the flows that we explicitly enable within
libpq and its OAuth implementation library. You mentioned that you
wanted to support other flows, including clients with out-of-band
knowledge, and I suggested:

> If you wanted to override [iddawc's]
> behavior as a client, you could replace the builtin flow with your
> own, by registering a set of callbacks.

In other words, the hooks would replace iddawc in the above diagram.
In my mind, something like this:

     +-------+                                       +----------+
  +------+   | ----------- Empty Token ------------> | Postgres |
  |      | < | <---------- Error Result ------------ |          |
  | Hook |   |                                       |   +-----------+
  |      |   |                                       |   |           |
  +------+ > | ------------ Access Token ----------> | > | Validator |
     |       | <--- Authorization Success/Failure -- | < |           |
     | libpq |                                       |   +-----------+
     +-------+                                       +----------+

Now there's a second black box -- the client hook -- which takes an
OAUTHBEARER error result (which may or may not have OIDC discovery
information) and returns the access token. How it does this is
unspecified -- it'll probably use some OAuth 2.0 flow, but maybe not.
Maybe it sends the user to a web browser; maybe it uses some of the
magic provider-specific libraries you mentioned upthread. It might have
a refresh token cached so it doesn't have to involve the user at all.

Crucially, though, the two black boxes remain independent of each other.
They have well-defined inputs and outputs (the client hook could be
roughly described as "implement get_auth_token()"). Their correctness
can be independently verified against published OAuth specs and/or
provider documentation. And the client application still makes a single
call to PQconnect*().
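Roughly, the hook dispatch inside libpq could look like the following
sketch. The names (`builtin_flow`, `token_hook`, `connect`) are mine for
illustration, not an actual libpq API:

```python
# Sketch: a registered client hook fully replaces libpq's builtin flow;
# either way, the output is just a bearer token.
def builtin_flow(discovery):
    # Stand-in for libpq's internal flow (e.g. Device Authorization).
    return "token-from-builtin-flow"

def connect(error_result, token_hook=None):
    # The hook receives the error result's optional discovery info and
    # returns an access token; libpq never cares how it was obtained.
    get_token = token_hook or builtin_flow
    token = get_token(error_result.get("openid-configuration"))
    return {"Authorization": "Bearer " + token}

# An application overriding the flow (say, with a cached refresh token):
hdrs = connect({"status": "invalid_token"},
               token_hook=lambda disc: "token-from-app-cache")
```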

Compare this to the architecture proposed by your patch:

  Client App
  +----------------------+
  |             +-------+                                +----------+
  |             | libpq |                                | Postgres |
  | PQconnect > |       |                                |   +-------+
  |          +------+   | ------- Flow Type (!) -------> | > |       |
  |     +- < | Hook | < | <------- Error Result -------- | < |       |
  | [ get    +------+   |                                |   |       |
  |   token ]   |       |                                |   |       |
  |     |       |       |                                |   | Hooks |
  |     v       |       |                                |   |       |
  | PQconnect > | ----> | ------ Access Token ---------> | > |       |
  |             |       | <--- Authz Success/Failure --- | < |       |
  |             +-------+                                |   +-------+
  +----------------------+                               +----------+

Rather than decouple things, I think this proposal drives a spike
through the client app, libpq, and the server. Please correct me if I've
misunderstood pieces of the patch, but the following is my view of it:

What used to be a validator hook on the server side now actively
participates in the client-side flow for some reason. (I still don't
understand what the server is supposed to do with that knowledge.
Changing your authz requirements based on the flow the client wants to
use seems like a good way to introduce bugs.)

The client-side hook is now coupled to the application logic: you have
to know to expect an error from the first PQconnect*() call, then check
whatever magic your hook has done for you to be able to set up the
second call to PQconnect*() with the correctly scoped bearer token. So
if you want to switch between the internal libpq OAuth implementation
and your own hook, you have to rewrite your app logic.

On top of all that, the "flow type code" being sent is a custom
extension to OAUTHBEARER that appears to be incompatible with the RFC's
discovery exchange (which is done by sending an empty auth token during
the first round trip).
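
For concreteness, the RFC's exchange looks roughly like this. (A Python
sketch for illustration only -- the patch itself is C, and this is not
its code.)

```python
import json

def oauthbearer_initial_response(token=None, authzid=""):
    """Build the client's initial OAUTHBEARER message (RFC 7628, sec. 3.1).

    With no token, this is the "empty auth" probe used for discovery."""
    gs2 = "n," + ("a=" + authzid if authzid else "") + ","
    auth = "auth=Bearer %s" % token if token else "auth="
    return (gs2 + "\x01" + auth + "\x01\x01").encode("ascii")

def server_discovery_error(config_url):
    """Server failure response carrying discovery info (RFC 7628, sec. 3.2.2),
    instead of any custom flow-type code."""
    return json.dumps({
        "status": "invalid_token",
        "openid-configuration": config_url,
    }).encode("ascii")
```

Note that the server's answer is a standard JSON status document; no
extension to the mechanism is needed for the client to discover the
issuer configuration.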

> We consider the following set of flows to be minimum required:
> - Client Credentials - For Service to Service scenarios.

Okay, that's simple enough that I think it could probably be maintained
inside libpq with minimal cost. At the same time, is it complicated
enough that you need libpq to do it for you?
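
For context, the whole Client Credentials grant (RFC 6749, section 4.4)
is a single form-encoded POST to the token endpoint. A rough sketch,
with a hypothetical endpoint and credentials:

```python
from urllib.parse import urlencode
from urllib.request import Request

def client_credentials_request(token_endpoint, client_id, client_secret, scope):
    """Build the one POST that the Client Credentials grant consists of."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode("ascii")
    # A Request with a data payload is sent as POST.
    return Request(token_endpoint, data=body, headers={
        "Content-Type": "application/x-www-form-urlencoded",
    })
```

The response is a JSON body containing the access token; there is no
user interaction at all, which is why this flow is plausibly small
enough to live inside libpq.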

Maybe once we get the hooks ironed out, it'll be more obvious what the
tradeoff is...

> If you prefer to pick one of the richer flows, Authorization code for
> GUI scenarios is probably much more widely used.
> Plus it's easier to implement too, as interaction goes through a
> series of callbacks. No polling required.

I don't think flows requiring the invocation of web browsers and custom
URL handlers are a clear fit for libpq. For a first draft, at least, I
think that use case should be pushed upward into the client application
via a custom hook.

>> If we want to support refresh tokens, I believe we should be developing
>> a plan to cache and secure them within the client. They should be used
>> as an accelerator for other flows, not as their own flow.
> It's considered a separate "grant_type" in the specs / APIs.
> https://openid.net/specs/openid-connect-core-1_0.html#RefreshTokens

Yes, but that doesn't mean we have to expose it to users via a
connection option. You don't get a refresh token out of the blue; you
get it by going through some other flow, and then you use it in
preference to going through that flow again later.
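
In sketch form (illustrative Python; the two callables stand in for a
token-endpoint call with grant_type=refresh_token and for a full
interactive flow, and the cache shape is hypothetical):

```python
def get_access_token(cache, try_refresh, run_full_flow):
    """Use a cached refresh token as an accelerator; fall back to a full flow."""
    refresh = cache.get("refresh_token")
    if refresh is not None:
        result = try_refresh(refresh)          # grant_type=refresh_token
        if result is not None:
            # Providers may rotate the refresh token on use.
            cache["refresh_token"] = result.get("refresh_token", refresh)
            return result["access_token"]
    result = run_full_flow()                   # e.g. a device authorization flow
    cache["refresh_token"] = result.get("refresh_token")
    return result["access_token"]
```

The user never sees the refresh token; it only short-circuits the next
round of whatever flow originally produced it.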

> For the clients, it would be storing the token and using it to authenticate.
> On the question of sensitivity, secure credentials stores are
> different for each platform, with a lot of cloud offerings for this.
> pgAdmin, for example, has its own way to secure credentials to avoid
> asking users for passwords every time the app is opened.
> I believe we should delegate the refresh token management to the clients.

Delegating to client apps would be fine (and implicitly handled by a
token hook, because the client app would receive the refresh token
directly rather than going through libpq). Delegating to end users, not
so much. Printing a refresh token to stderr as proposed here is, I
think, making things unnecessarily difficult (and/or dangerous) for users.

>> 3) I don't like the departure from the OAUTHBEARER mechanism that's
>> presented here. For one, since I can't see a sample plugin that makes
>> use of the "flow type" magic numbers that have been added, I don't
>> really understand why the extension to the mechanism is necessary.
> I don't think it's much of a departure, but rather a separation of
> responsibilities between libpq and upstream clients.

Given the proposed architectures above, 1) I think this is further
coupling the components, not separating them; and 2) I can't agree that
an incompatible discovery mechanism is "not much of a departure". If
OAUTHBEARER's functionality isn't good enough for some reason, let's
talk about why.

> As libpq can be used in different apps, the client would need
> different types of flows/grants.
> I.e. those need to be provided to libpq at connection initialization
> or some other point.

Why would libpq (or the server!) need to know those things at all, if
they're not going to implement the flow?

> We will change to "grant_type" though and use string to be closer to the spec.
> What do you think is the best way for the client to signal which OAUTH
> flow should be used?

libpq should not need to know the grant type in use if the client is
bypassing its internal implementation entirely.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 11/24/22 00:20, mahendrakar s wrote:
> I had validated Github by skipping the discovery mechanism and letting
> the provider extension pass on the endpoints. This is just for
> validation purposes.
> If it needs to be supported, then need a way to send the discovery
> document from extension.

Yeah. I had originally bounced around the idea that we could send a
data:// URL, but I think that opens up problems.

You're supposed to be able to link the issuer URI with the URI you got
the configuration from, and if they're different, you bail out. If a
server makes up its own OpenID configuration, we'd have to bypass that
safety check, and decide what the risks and mitigations are... Not sure
it's worth it.
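
The safety check in question (from OpenID Connect Discovery) can be
sketched like this; note the trailing-slash normalization here is a
simplification, and the spec actually requires an exact match:

```python
def wellknown_url(issuer):
    """Where OIDC Discovery says the configuration document must live."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"

def issuer_is_consistent(issuer, config):
    """Bail out unless the document's "issuer" matches where we fetched it
    from; a server inventing its own configuration fails this check."""
    return config.get("issuer", "").rstrip("/") == issuer.rstrip("/")
```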

Especially if you could just lobby GitHub to, say, provide an OpenID
config. (Maybe there's a security-related reason they don't.)

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Jacob,
Thanks for your feedback.
I think we can focus on the roles and responsibilities of the components first.
Details of the patch can be elaborated. For example, the "flow type
code" is a mistake on our side; we will use the term "grant_type",
which is defined by the OIDC spec. The same goes for the details of
refresh_token usage.

> Rather than decouple things, I think this proposal drives a spike
> through the client app, libpq, and the server. Please correct me if I've
> misunderstood pieces of the patch, but the following is my view of it:

> What used to be a validator hook on the server side now actively
> participates in the client-side flow for some reason. (I still don't
> understand what the server is supposed to do with that knowledge.
> Changing your authz requirements based on the flow the client wants to
> use seems like a good way to introduce bugs.)

> The client-side hook is now coupled to the application logic: you have
> to know to expect an error from the first PQconnect*() call, then check
> whatever magic your hook has done for you to be able to set up the
> second call to PQconnect*() with the correctly scoped bearer token. So
> if you want to switch between the internal libpq OAuth implementation
> and your own hook, you have to rewrite your app logic.

Basically, yes. We propose increasing the server-side hook's
responsibility: from just validating the token, to also returning the
provider root URL and required audience (and possibly providing more
metadata in the future).
This is, in our opinion, aligned with the SASL protocol, where the
server side is responsible for telling the client the auth
requirements based on the role requested in the startup packet.

Our understanding is that in the original patch that information came
purely from hba, and we propose extension being able to control that
metadata.
As we see extension as being owned by the identity provider, compared
to HBA which is owned by the server administrator or cloud provider.

This change of the roles is based on the vision of 4 independent actor
types in the ecosystem:
1. Identity Providers (Okta, Google, Microsoft, other OIDC providers).
   - Publish open source extensions for PostgreSQL.
   - Don't have to own the server deployments, and must ensure their
extensions can work in any environment. This is where we think
additional hook responsibility helps.
2. Server Owners / PAAS providers (On premise admins, Cloud providers,
multi-cloud PAAS providers).
   - Install extensions and configure HBA to allow clients to
authenticate with the identity providers of their choice.
3. Client Application Developers (Data Wis, integration tools,
PgAdmin, monitoring tools, etc.)
   - Independent from specific Identity providers or server providers.
Write one code for all identity providers.
   - Rely on application deployment owners to configure which OIDC
provider to use across client and server setups.
4. Application Deployment Owners (End customers setting up applications)
   - The only actor actually aware of which identity provider to use.
Configures the stack based on the Identity and PostgreSQL deployments
they have.

The critical piece of the vision is that the applications in (3.)
above are agnostic of the identity providers. Those applications rely
on properly configured servers and rich driver logic (libpq,
com.postgresql, npgsql) to pop up auth windows or do
service-to-service authentication with any provider. In our view, that
would significantly democratize the deployment of OAuth authentication
in the community.

In order to allow this separation, we propose:
1. HBA + Extension is the single source of truth of Provider root URL
+ Required Audience for each role. If some backfill for missing OIDC
discovery is needed, the provider-specific extension would be
providing it.
2. Client Application knows which grant_type to use in which scenario.
But can be coded without knowledge of a specific provider. So can't
provide discovery details.
3. Driver (libpq, others) - coordinate the authentication flow based
on client grant_type and identity provider metadata to allow client
applications to use any flow with any provider in a unified way.

Yes, this would require a somewhat more complicated flow between
components than in your original patch. And yes, more complexity comes
with more opportunity for bugs.
However, I see the PG server and libpq as the places which can absorb
more complexity, for the purpose of making work easier for community
participants and simplifying adoption.

Does this make sense to you?





Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Dec 5, 2022 at 4:15 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> I think we can focus on the roles and responsibilities of the components first.
> Details of the patch can be elaborated. For example, the "flow type
> code" is a mistake on our side; we will use the term "grant_type",
> which is defined by the OIDC spec. The same goes for the details of
> refresh_token usage.

(For the record, whether we call it "flow type" or "grant type"
doesn't address my concern.)

> Basically, yes. We propose increasing the server-side hook's
> responsibility: from just validating the token, to also returning the
> provider root URL and required audience (and possibly providing more
> metadata in the future).

I think it's okay to have the extension and HBA collaborate to provide
discovery information. Your proposal goes further than that, though,
and makes the server aware of the chosen client flow. That appears to
be an architectural violation: why does an OAuth resource server need
to know the client flow at all?

> This is, in our opinion, aligned with the SASL protocol, where the
> server side is responsible for telling the client the auth
> requirements based on the role requested in the startup packet.

You've proposed an alternative SASL mechanism. There's nothing wrong
with that, per se, but I think it should be clear why we've chosen
something nonstandard.

> Our understanding is that in the original patch that information came
> purely from hba, and we propose extension being able to control that
> metadata.
> As we see extension as being owned by the identity provider, compared
> to HBA which is owned by the server administrator or cloud provider.

That seems reasonable, considering how tightly coupled the Issuer and
the token validation process are.

> 2. Server Owners / PAAS providers (On premise admins, Cloud providers,
> multi-cloud PAAS providers).
>    - Install extensions and configure HBA to allow clients to
> authenticate with the identity providers of their choice.

(For a future conversation: they need to set up authorization, too,
with custom scopes or some other magic. It's not enough to check who
the token belongs to; even if Postgres is just using the verified
email from OpenID as an authenticator, you have to also know that the
user authorized the token -- and therefore the client -- to access
Postgres on their behalf.)
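
(One possible shape for that extra check, sketched in Python with a
hypothetical custom scope name -- nothing here is from the patch:

```python
def token_authorizes_postgres(claims, required_scope="postgres:connect"):
    """Require a provider-defined custom scope, not just a verified identity.

    The "postgres:connect" scope name is hypothetical; each deployment
    would register its own scope with the identity provider."""
    granted = claims.get("scope", "").split()
    return required_scope in granted
```

A token that merely proves "this is alice@example.com" would fail this
check unless alice also consented to the client accessing Postgres.)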

> 3. Client Application Developers (Data Wis, integration tools,
> PgAdmin, monitoring tools, etc.)
>    - Independent from specific Identity providers or server providers.
> Write one code for all identity providers.

Ideally, yes, but that only works if all identity providers implement
the same flows in compatible ways. We're already seeing instances
where that's not the case and we'll necessarily have to deal with that
up front.

>    - Rely on application deployment owners to configure which OIDC
> provider to use across client and server setups.
> 4. Application Deployment Owners (End customers setting up applications)
>    - The only actor actually aware of which identity provider to use.
> Configures the stack based on the Identity and PostgreSQL deployments
> they have.

(I have doubts that the roles will be as decoupled in practice as you
have described them, but I'd rather defer that for now.)

> The critical piece of the vision is that the applications in (3.)
> above are agnostic of the identity providers. Those applications rely
> on properly configured servers and rich driver logic (libpq,
> com.postgresql, npgsql) to pop up auth windows or do
> service-to-service authentication with any provider. In our view, that
> would significantly democratize the deployment of OAuth authentication
> in the community.

That seems to be restating the goal of OAuth and OIDC. Can you explain
how the incompatible change allows you to accomplish this better than
standard implementations?

> In order to allow this separation, we propose:
> 1. HBA + Extension is the single source of truth of Provider root URL
> + Required Audience for each role. If some backfill for missing OIDC
> discovery is needed, the provider-specific extension would be
> providing it.
> 2. Client Application knows which grant_type to use in which scenario.
> But can be coded without knowledge of a specific provider. So can't
> provide discovery details.
> 3. Driver (libpq, others) - coordinate the authentication flow based
> on client grant_type and identity provider metadata to allow client
> applications to use any flow with any provider in a unified way.
>
> Yes, this would require a somewhat more complicated flow between
> components than in your original patch.

Why? I claim that standard OAUTHBEARER can handle all of that. What
does your proposed architecture (the third diagram) enable that my
proposed hook (the second diagram) doesn't?

> And yes, more complexity comes
> with more opportunity for bugs.
> However, I see the PG server and libpq as the places which can absorb
> more complexity, for the purpose of making work easier for community
> participants and simplifying adoption.
>
> Does this make sense to you?

Some of it, but it hasn't really addressed the questions from my last mail.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> I think it's okay to have the extension and HBA collaborate to provide
> discovery information. Your proposal goes further than that, though,
> and makes the server aware of the chosen client flow. That appears to
> be an architectural violation: why does an OAuth resource server need
> to know the client flow at all?

OK. It may have been left over from intermediate iterations. We did
consider making the extension drive the flow for a specific
grant_type, but decided against that idea, for the same reason you
point to.
Is it correct that your main concern about the use of grant_type was
that it's propagated to the server? If so, then yes, we will remove
sending it to the server.

> Ideally, yes, but that only works if all identity providers implement
> the same flows in compatible ways. We're already seeing instances
> where that's not the case and we'll necessarily have to deal with that
> up front.

Yes; based on our analysis, the OIDC spec is detailed enough that
providers implementing it can be supported with generic code in libpq
/ the client.
GitHub specifically won't fit there, though. Microsoft Azure AD,
Google, and Okta (including Auth0) will.
Theoretically, discovery documents could be returned from the
(provider-specific) server-side extension, though we didn't plan to
prioritize that.

> That seems to be restating the goal of OAuth and OIDC. Can you explain
> how the incompatible change allows you to accomplish this better than
> standard implementations?

Do you refer to passing grant_type to the server? Which we will get
rid of in the next iteration. Or other incompatible changes as well?

> Why? I claim that standard OAUTHBEARER can handle all of that. What
> does your proposed architecture (the third diagram) enable that my
> proposed hook (the second diagram) doesn't?

The hook proposed in the 2nd diagram effectively delegates all OAuth
flow implementations to the client.
We propose that libpq take care of pulling the OpenID discovery
document and coordinating the flow -- effectively Diagram 1, plus more
flows, plus the server hook providing the root URL/audience.
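
That coordination role might be sketched like this (illustrative
Python only; the metadata field names follow OIDC Discovery / RFC
8414, and the dispatch logic is an assumption, not patch code):

```python
DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"

def pick_endpoint(metadata, grant_type):
    """Choose the provider endpoint to drive, given the client's grant_type
    and the provider metadata that libpq fetched via OIDC discovery."""
    supported = metadata.get("grant_types_supported")
    if supported and grant_type not in supported:
        raise ValueError("provider does not support " + grant_type)
    if grant_type == "authorization_code":
        return metadata["authorization_endpoint"]
    if grant_type == DEVICE_GRANT:
        return metadata["device_authorization_endpoint"]
    # client_credentials and refresh_token go straight to the token endpoint.
    return metadata["token_endpoint"]
```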

We created diagrams with all the components for 3 flows:
1. Authorization code grant (Clients with Browser access):
  +----------------------+                                         +----------+
  |             +-------+                                          |          |
  | PQconnect   |       |                                          |          |
  | [auth_code] |       |                                          |   +-----------+
  |          -> |       | -------------- Empty Token ------------> | > |           |
  |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
  |             |       |                                          |   |  Hook     |
  |             |       |                                          |   +-----------+
  |             |       |                        +--------------+  |          |
  |             |       | -------[GET]---------> | OIDC         |  | Postgres |
  |          +------+   | <--Provider Metadata-- | Discovery    |  |          |
  |     +- < | Hook | < |                        +--------------+  |          |
  |     |    +------+   |                                          |          |
  |     v       |       |                                          |          |
  |  [get auth  |       |                                          |          |
  |    code]    |       |                                          |          |
  |<user action>|       |                                          |          |
  |     |       |       |                                          |          |
  |     +       |       |                                          |          |
  | PQconnect > | +--------+                     +--------------+  |          |
  |             | | iddawc | <-- [ Auth code ]-> | Issuer/      |  |          |
  |             | |        | <-- Access Token -- | Authz Server |  |          |
  |             | +--------+                     +--------------+  |          |
  |             |       |                                          |   +-----------+
  |             |       | -------------- Access Token -----------> | > | Validator |
  |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
  |          +------+   |                                          |   +-----------+
  |      +-< | Hook |   |                                          |          |
  |      v   +------+   |                                          |          |
  |[store       +-------+                                          |          |
  |  refresh_token]                                                +----------+
  +----------------------+

2. Device code grant
  +----------------------+                                         +----------+
  |             +-------+                                          |          |
  | PQconnect   |       |                                          |          |
  |[device_code]|       |                                          |   +-----------+
  |          -> |       | -------------- Empty Token ------------> | > |           |
  |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
  |             |       |                                          |   |  Hook     |
  |             |       |                                          |   +-----------+
  |             |       |                        +--------------+  |          |
  |             |       | -------[GET]---------> | OIDC         |  | Postgres |
  |          +------+   | <--Provider Metadata-- | Discovery    |  |          |
  |     +- < | Hook | < |                        +--------------+  |          |
  |     |    +------+   |                                          |          |
  |     v       |       |                                          |          |
  |  [device    | +---------+                     +--------------+ |          |
  |    code]    | | iddawc  |                     | Issuer/      | |          |
  |<user action>| |         | --[ Device code ]-> | Authz Server | |          |
  |             | |<polling>| --[ Device code ]-> |              | |          |
  |             | |         | --[ Device code ]-> |              | |          |
  |             | |         |                     |              | |          |
  |             | |         | <-- Access Token -- |              | |          |
  |             | +---------+                     +--------------+ |          |
  |             |       |                                          |   +-----------+
  |             |       | -------------- Access Token -----------> | > | Validator |
  |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
  |          +------+   |                                          |   +-----------+
  |      +-< | Hook |   |                                          |          |
  |      v   +------+   |                                          |          |
  |[store       +-------+                                          |          |
  |  refresh_token]                                                +----------+
  +----------------------+
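
The <polling> step in the device-code diagram follows RFC 8628,
section 3.5. Roughly (illustrative Python; "poll" stands in for the
token-endpoint POST carrying the device code):

```python
import time

def poll_for_token(poll, interval=5, max_attempts=10, sleep=time.sleep):
    """Poll the token endpoint until the user approves (RFC 8628, sec. 3.5)."""
    for _ in range(max_attempts):
        resp = poll()  # POST grant_type=...:device_code to the token endpoint
        if "access_token" in resp:
            return resp
        error = resp.get("error")
        if error == "slow_down":
            interval += 5        # RFC 8628: add 5 seconds and keep polling
        elif error != "authorization_pending":
            raise RuntimeError(error or "unexpected token response")
        sleep(interval)
    raise TimeoutError("device authorization timed out")
```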

3. Non-interactive flows (Client Secret / Refresh_Token)
  +----------------------+                                         +----------+
  |             +-------+                                          |          |
  | PQconnect   |       |                                          |          |
  | [grant_type]|       |                                          |          |
  |          -> |       |                                          |   +-----------+
  |             |       | -------------- Empty Token ------------> | > |           |
  |             | libpq | <----- Error(w\ Root URL + Audience ) -- | < | Pre-Auth  |
  |             |       |                                          |   |  Hook     |
  |             |       |                                          |   +-----------+
  |             |       |                        +--------------+  |          |
  |             |       | -------[GET]---------> | OIDC         |  | Postgres |
  |             |       | <--Provider Metadata-- | Discovery    |  |          |
  |             |       |                        +--------------+  |          |
  |             |       |                                          |          |
  |             | +--------+                     +--------------+  |          |
  |             | | iddawc | <-- [ Secret ]----> | Issuer/      |  |          |
  |             | |        | <-- Access Token -- | Authz Server |  |          |
  |             | +--------+                     +--------------+  |          |
  |             |       |                                          |   +-----------+
  |             |       | -------------- Access Token -----------> | > | Validator |
  |             |       | <---- Authorization Success/Failure ---- | < |   Hook    |
  |             |       |                                          |   +-----------+
  |             +-------+                                          +----------+
  +----------------------+

I think the most confusing thing in our latest patch was that
flow_type was passed to the server.
We are not proposing that going forward.

> (For a future conversation: they need to set up authorization, too,
> with custom scopes or some other magic. It's not enough to check who
> the token belongs to; even if Postgres is just using the verified
> email from OpenID as an authenticator, you have to also know that the
> user authorized the token -- and therefore the client -- to access
> Postgres on their behalf.)

My understanding is that metadata in the tokens is provider-specific,
so the server-side hook would be the right place to handle that.
Plus, I can envision that for some providers it makes sense to make a
remote call to pull some information.

The way we implement Azure AD auth today in our PaaS PostgreSQL offering:
- The server administrator uses special extension functions to create
Azure AD-enabled PostgreSQL roles.
- The PostgreSQL extension maps roles to unique identity IDs (UIDs) in
the directory.
- Connection flow: if the token is valid and the role => UID mapping
matches, we authenticate as the role.
- Then native PostgreSQL role-based access control takes care of
privileges.

This is the same for both user and system-to-system authorization,
though I assume different providers may treat user and system
identities differently; their extensions would handle that.
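
In sketch form (Python for illustration only; the "oid" claim name,
the map shape, and the role name below are hypothetical, not the
actual extension code):

```python
def authorize_role(role, token_claims, role_uid_map):
    """Authenticate as `role` only if the token's directory UID matches the
    UID the extension has stored for that role."""
    expected = role_uid_map.get(role)
    return expected is not None and token_claims.get("oid") == expected
```

Everything past this point is then ordinary PostgreSQL role-based
access control.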

Thanks!
Andrey.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
That being said, Diagram 2 would look like this with our proposal:
  +----------------------+                                   +---------------+
  |             +-------+|                                   |   Postgres    |
  | PQconnect ->|       ||                                   | +-----------+ |
  |             |       ||------------ Empty Token ---------->| | Pre-Auth  | |
  |             | libpq ||<---- Error(w\ Root URL + Audience)| |   Hook    | |
  |          +------+   ||                                   | +-----------+ |
  |     +- < | Hook |   ||                                   |               |
  |     |    +------+   ||                                   |               |
  |     v       |       ||                                   |               |
  |  [get token]|       ||                                   |               |
  |     |       |       ||                                   |               |
  |     +       |       ||                                   | +-----------+ |
  | PQconnect > |       ||------------ Access Token -------->| | Validator | |
  |             |       ||<---- Authorization Success/Failure| |   Hook    | |
  |             +-------+|                                   | +-----------+ |
  +----------------------+                                   +---------------+


With this approach, the application takes care of all token acquisition
logic, while the server-side hook participates in the pre-authentication
reply.

That is definitely a required scenario for the long term, and the
easiest to implement in the client core. And if we can do at least that
flow in PG16, it will be a strong foundation for supporting more
specific grants in libpq going forward.
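For concreteness, the two-PQconnect pattern in the diagram could be
sketched roughly as follows. This is a hypothetical Python pseudo-client:
the connection API, error fields, and token values are stand-ins for
whatever libpq would actually expose, not an existing interface.

```python
# Hypothetical sketch of the two-PQconnect flow above; none of these names
# exist in libpq -- they only stand in for the proposed behavior.

class OAuthDiscoveryError(Exception):
    """First connection attempt fails; the server's Pre-Auth hook reply
    carries the provider root URL and required audience."""
    def __init__(self, root_url, audience):
        super().__init__("OAuth pre-auth reply")
        self.root_url = root_url
        self.audience = audience

def pq_connect(host, token=None):
    # Stand-in for PQconnect: with an empty token, the server responds
    # with provider metadata instead of authenticating.
    if token is None:
        raise OAuthDiscoveryError("https://issuer.example.com", "postgres")
    return {"host": host, "authenticated": True}

def acquire_token(root_url, audience):
    # Application-owned logic: run whichever OAuth flow fits the app
    # (browser, device code, client secret) against root_url/audience.
    return "access-token-for-" + audience

def connect_with_oauth(host):
    try:
        return pq_connect(host)                  # 1st PQconnect: empty token
    except OAuthDiscoveryError as err:
        token = acquire_token(err.root_url, err.audience)
        return pq_connect(host, token=token)     # 2nd PQconnect: access token

conn = connect_with_oauth("example.org")
```

The point of the sketch is only the control flow: the first attempt
surfaces the pre-auth metadata, and the application decides how to turn
that into a token before reconnecting.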

Does the diagram above look good to you? We can then start cleaning up
the patch to get that in first.

Thanks!
Andrey.


On Wed, Dec 7, 2022 at 3:22 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
>
> > I think it's okay to have the extension and HBA collaborate to provide
> > discovery information. Your proposal goes further than that, though,
> > and makes the server aware of the chosen client flow. That appears to
> > be an architectural violation: why does an OAuth resource server need
> > to know the client flow at all?
>
> Ok. It may have been left over from intermediate iterations. We did
> consider making the extension drive the flow for a specific grant_type,
> but decided against that idea, for the same reason you point to.
> Is it correct that your main concern about the use of grant_type was
> that it's propagated to the server? Then yes, we will remove sending it
> to the server.
>
> > Ideally, yes, but that only works if all identity providers implement
> > the same flows in compatible ways. We're already seeing instances
> > where that's not the case and we'll necessarily have to deal with that
> > up front.
>
> Yes, based on our analysis, the OIDC spec is detailed enough that
> providers implementing it can be supported with generic code in
> libpq / the client.
> GitHub specifically won't fit there, though. Microsoft Azure AD,
> Google, and Okta (including Auth0) will.
> Theoretically, provider-specific discovery documents could be returned
> from the (server-side) extension, though we didn't plan to prioritize
> that.
>
> > That seems to be restating the goal of OAuth and OIDC. Can you explain
> > how the incompatible change allows you to accomplish this better than
> > standard implementations?
>
> Do you refer to passing grant_type to the server? Which we will get
> rid of in the next iteration. Or other incompatible changes as well?
>
> > Why? I claim that standard OAUTHBEARER can handle all of that. What
> > does your proposed architecture (the third diagram) enable that my
> > proposed hook (the second diagram) doesn't?
>
> The hook proposed in the 2nd diagram effectively delegates all OAuth
> flow implementations to the client.
> We propose that libpq take care of pulling OpenID discovery and
> coordination, which is effectively Diagram 1 + more flows + a server
> hook providing the root URL/audience.
>
> Created the diagrams with all components for 3 flows:
> 1. Authorization code grant (Clients with Browser access):
>   +----------------------+                                   +---------------+
>   |             +-------+|                                   |   Postgres    |
>   | PQconnect   |       ||                                   |               |
>   | [auth_code] |       ||                                   | +-----------+ |
>   |          -> |       ||------------ Empty Token ---------->| | Pre-Auth  | |
>   |             | libpq ||<---- Error(w\ Root URL + Audience)| |   Hook    | |
>   |             |       ||                   +--------------+| +-----------+ |
>   |             |       ||------[GET]------->| OIDC         ||               |
>   |          +------+   ||<-Provider Metadata| Discovery    ||               |
>   |     +- < | Hook | < ||                   +--------------+|               |
>   |     |    +------+   ||                                   |               |
>   |     v       |       ||                                   |               |
>   |  [get auth  |       ||                                   |               |
>   |    code]    |       ||                                   |               |
>   |<user action>|       ||                                   |               |
>   |     |       |       ||                                   |               |
>   |     +       |       ||                                   |               |
>   | PQconnect > |       ||+------+           +--------------+|               |
>   |             |       |||iddawc|-[ code ]->| Issuer/      ||               |
>   |             |       |||      |<- token --| Authz Server ||               |
>   |             |       ||+------+           +--------------+| +-----------+ |
>   |             |       ||------------ Access Token -------->| | Validator | |
>   |             |       ||<---- Authorization Success/Failure| |   Hook    | |
>   |          +------+   ||                                   | +-----------+ |
>   |      +-< | Hook |   ||                                   |               |
>   |      v   +------+   ||                                   |               |
>   |             +-------+|                                   |               |
>   |[store refresh_token] |                                   |               |
>   +----------------------+                                   +---------------+
>
> 2. Device code grant
>   +----------------------+                                   +---------------+
>   |             +-------+|                                   |   Postgres    |
>   | PQconnect   |       ||                                   |               |
>   |[device_code]|       ||                                   | +-----------+ |
>   |          -> |       ||------------ Empty Token ---------->| | Pre-Auth  | |
>   |             | libpq ||<---- Error(w\ Root URL + Audience)| |   Hook    | |
>   |             |       ||                   +--------------+| +-----------+ |
>   |             |       ||------[GET]------->| OIDC         ||               |
>   |          +------+   ||<-Provider Metadata| Discovery    ||               |
>   |     +- < | Hook | < ||                   +--------------+|               |
>   |     |    +------+   ||                                   |               |
>   |     v       |       ||                                   |               |
>   |  [device    |       ||+---------+        +--------------+|               |
>   |    code]    |       ||| iddawc  |-[dev]->| Issuer/      ||               |
>   |<user action>|       |||<polling>|-[dev]->| Authz Server ||               |
>   |             |       |||         |<-token-|              ||               |
>   |             |       ||+---------+        +--------------+| +-----------+ |
>   |             |       ||------------ Access Token -------->| | Validator | |
>   |             |       ||<---- Authorization Success/Failure| |   Hook    | |
>   |          +------+   ||                                   | +-----------+ |
>   |      +-< | Hook |   ||                                   |               |
>   |      v   +------+   ||                                   |               |
>   |             +-------+|                                   |               |
>   |[store refresh_token] |                                   |               |
>   +----------------------+                                   +---------------+
>
> 3. Non-interactive flows (Client Secret / Refresh_Token)
>   +----------------------+                                   +---------------+
>   |             +-------+|                                   |   Postgres    |
>   | PQconnect   |       ||                                   |               |
>   | [grant_type]|       ||                                   | +-----------+ |
>   |          -> |       ||------------ Empty Token ---------->| | Pre-Auth  | |
>   |             | libpq ||<---- Error(w\ Root URL + Audience)| |   Hook    | |
>   |             |       ||                   +--------------+| +-----------+ |
>   |             |       ||------[GET]------->| OIDC         ||               |
>   |             |       ||<-Provider Metadata| Discovery    ||               |
>   |             |       ||                   +--------------+|               |
>   |             |       ||+------+           +--------------+|               |
>   |             |       |||iddawc|-[secret]->| Issuer/      ||               |
>   |             |       |||      |<- token --| Authz Server ||               |
>   |             |       ||+------+           +--------------+| +-----------+ |
>   |             |       ||------------ Access Token -------->| | Validator | |
>   |             |       ||<---- Authorization Success/Failure| |   Hook    | |
>   |             +-------+|                                   | +-----------+ |
>   +----------------------+                                   +---------------+
>
> I think the most confusing part of our latest patch was that flow_type
> was passed to the server.
> We are not proposing that going forward.
>
> > (For a future conversation: they need to set up authorization, too,
> > with custom scopes or some other magic. It's not enough to check who
> > the token belongs to; even if Postgres is just using the verified
> > email from OpenID as an authenticator, you have to also know that the
> > user authorized the token -- and therefore the client -- to access
> > Postgres on their behalf.)
>
> My understanding is that metadata in the tokens is provider specific,
> so the server-side hook would be the right place to handle that.
> Plus, I can envision that for some providers it would make sense to
> make a remote call to pull some information.
>
> The way we implement Azure AD auth today in our PaaS PostgreSQL offering:
> - The server administrator uses special extension functions to create
> Azure AD-enabled PostgreSQL roles.
> - The PostgreSQL extension maps roles to unique identity IDs (UIDs) in
> the directory.
> - Connection flow: if the token is valid and the role => UID mapping
> matches, we authenticate as the role.
> - Then native PostgreSQL role-based access control takes care of
> privileges.
>
> This is the same for both user- and system-to-system authorization,
> though I assume different providers may treat user and system
> identities differently. So their extension would handle that.
>
> Thanks!
> Andrey.
>
> On Wed, Dec 7, 2022 at 11:06 AM Jacob Champion <jchampion@timescale.com> wrote:
> >
> > On Mon, Dec 5, 2022 at 4:15 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> > > I think we can focus on the roles and responsibilities of the components first.
> > > Details of the patch can be elaborated. Like "flow type code" is a
> > > mistake on our side, and we will use the term "grant_type" which is
> > > defined by OIDC spec. As well as details of usage of refresh_token.
> >
> > (For the record, whether we call it "flow type" or "grant type"
> > doesn't address my concern.)
> >
> > > Basically Yes. We propose an increase of the server side hook responsibility.
> > > From just validating the token, to also return the provider root URL
> > > and required audience. And possibly provide more metadata in the
> > > future.
> >
> > I think it's okay to have the extension and HBA collaborate to provide
> > discovery information. Your proposal goes further than that, though,
> > and makes the server aware of the chosen client flow. That appears to
> > be an architectural violation: why does an OAuth resource server need
> > to know the client flow at all?
> >
> > > Which is in our opinion aligned with SASL protocol, where the server
> > > side is responsible for telling the client auth requirements based on
> > > the requested role in the startup packet.
> >
> > You've proposed an alternative SASL mechanism. There's nothing wrong
> > with that, per se, but I think it should be clear why we've chosen
> > something nonstandard.
> >
> > > Our understanding is that in the original patch that information came
> > > purely from hba, and we propose extension being able to control that
> > > metadata.
> > > As we see extension as being owned by the identity provider, compared
> > > to HBA which is owned by the server administrator or cloud provider.
> >
> > That seems reasonable, considering how tightly coupled the Issuer and
> > the token validation process are.
> >
> > > 2. Server Owners / PAAS providers (On premise admins, Cloud providers,
> > > multi-cloud PAAS providers).
> > >    - Install extensions and configure HBA to allow clients to
> > > authenticate with the identity providers of their choice.
> >
> > (For a future conversation: they need to set up authorization, too,
> > with custom scopes or some other magic. It's not enough to check who
> > the token belongs to; even if Postgres is just using the verified
> > email from OpenID as an authenticator, you have to also know that the
> > user authorized the token -- and therefore the client -- to access
> > Postgres on their behalf.)
> >
> > > 3. Client Application Developers (Data Viz, integration tools,
> > > PgAdmin, monitoring tools, etc.)
> > >    - Independent from specific Identity providers or server providers.
> > > Write one code for all identity providers.
> >
> > Ideally, yes, but that only works if all identity providers implement
> > the same flows in compatible ways. We're already seeing instances
> > where that's not the case and we'll necessarily have to deal with that
> > up front.
> >
> > >    - Rely on application deployment owners to configure which OIDC
> > > provider to use across client and server setups.
> > > 4. Application Deployment Owners (End customers setting up applications)
> > >    - The only actor actually aware of which identity provider to use.
> > > Configures the stack based on the Identity and PostgreSQL deployments
> > > they have.
> >
> > (I have doubts that the roles will be as decoupled in practice as you
> > have described them, but I'd rather defer that for now.)
> >
> > > The critical piece of the vision is (3.) above: applications
> > > agnostic of the identity providers. Those applications rely on
> > > properly configured servers and rich driver logic (libpq,
> > > com.postgresql, npgsql) to let the application pop up auth
> > > windows or do service-to-service authentication with any provider. In
> > > our view that would significantly democratize the deployment of OAUTH
> > > authentication in the community.
> >
> > That seems to be restating the goal of OAuth and OIDC. Can you explain
> > how the incompatible change allows you to accomplish this better than
> > standard implementations?
> >
> > > In order to allow this separation, we propose:
> > > 1. HBA + Extension is the single source of truth of Provider root URL
> > > + Required Audience for each role. If some backfill for missing OIDC
> > > discovery is needed, the provider-specific extension would be
> > > providing it.
> > > 2. Client Application knows which grant_type to use in which scenario.
> > > But can be coded without knowledge of a specific provider. So can't
> > > provide discovery details.
> > > 3. Driver (libpq, others) - coordinate the authentication flow based
> > > on client grant_type and identity provider metadata to allow client
> > > applications to use any flow with any provider in a unified way.
> > >
> > > Yes, this would require a little more complicated flow between
> > > components than in your original patch.
> >
> > Why? I claim that standard OAUTHBEARER can handle all of that. What
> > does your proposed architecture (the third diagram) enable that my
> > proposed hook (the second diagram) doesn't?
> >
> > > And yes, more complexity comes
> > > with more opportunity to make bugs.
> > > However, I see the PG server and libpq as the places which can have
> > > more complexity, for the purpose of making work easier for community
> > > participants and simplifying adoption.
> > >
> > > Does this make sense to you?
> >
> > Some of it, but it hasn't really addressed the questions from my last mail.
> >
> > Thanks,
> > --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Dec 7, 2022 at 3:22 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> 
>> I think it's okay to have the extension and HBA collaborate to
>> provide discovery information. Your proposal goes further than
>> that, though, and makes the server aware of the chosen client flow.
>> That appears to be an architectural violation: why does an OAuth
>> resource server need to know the client flow at all?
> 
> Ok. It may have left there from intermediate iterations. We did 
> consider making extension drive the flow for specific grant_type,
> but decided against that idea. For the same reason you point to. Is
> it correct that your main concern about use of grant_type was that 
> it's propagated to the server? Then yes, we will remove sending it
> to the server.

Okay. Yes, that was my primary concern.

>> Ideally, yes, but that only works if all identity providers
>> implement the same flows in compatible ways. We're already seeing
>> instances where that's not the case and we'll necessarily have to
>> deal with that up front.
> 
> Yes, based on our analysis OIDC spec is detailed enough, that 
> providers implementing that one, can be supported with generic code
> in libpq / client. Github specifically won't fit there though.
> Microsoft Azure AD, Google, Okta (including Auth0) will. 
> Theoretically discovery documents can be returned from the extension 
> (server-side) which is provider specific. Though we didn't plan to 
> prioritize that.

As another example, Google's device authorization grant is incompatible
with the spec (which they co-authored). I want to say I had problems
with Azure AD not following that spec either, but I don't remember
exactly what they were. I wouldn't be surprised to find more tiny
departures once we get deeper into implementation.
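To make that kind of deviation concrete: RFC 8628 names the device
authorization response field "verification_uri", while Google's
implementation, as far as I can tell, returns "verification_url"
instead. Client code that wants to support both ends up normalizing,
something like this sketch:

```python
# Sketch: normalizing a device authorization response. RFC 8628 specifies
# "verification_uri"; some providers (Google, as far as I can tell) send
# "verification_url" instead, so a portable client accepts both.

def normalize_device_response(resp: dict) -> dict:
    uri = resp.get("verification_uri") or resp.get("verification_url")
    if uri is None:
        raise ValueError("no verification URI in device authorization response")
    return {
        "device_code": resp["device_code"],
        "user_code": resp["user_code"],
        "verification_uri": uri,
        "interval": resp.get("interval", 5),  # RFC 8628 default: 5 seconds
    }
```

Each of these little shims is cheap on its own; the cost is that they
accumulate per provider.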

>> That seems to be restating the goal of OAuth and OIDC. Can you
>> explain how the incompatible change allows you to accomplish this
>> better than standard implementations?
> 
> Do you refer to passing grant_type to the server? Which we will get 
> rid of in the next iteration. Or other incompatible changes as well?

Just the grant type, yeah.

>> Why? I claim that standard OAUTHBEARER can handle all of that.
>> What does your proposed architecture (the third diagram) enable
>> that my proposed hook (the second diagram) doesn't?
> 
> The hook proposed on the 2nd diagram effectively delegates all Oauth 
> flows implementations to the client. We propose libpq takes care of
> pulling OpenId discovery and coordination. Which is effectively
> Diagram 1 + more flows + server hook providing root url/audience.
> 
> Created the diagrams with all components for 3 flows: [snip]

(I'll skip ahead to your later mail on this.)

>> (For a future conversation: they need to set up authorization,
>> too, with custom scopes or some other magic. It's not enough to
>> check who the token belongs to; even if Postgres is just using the
>> verified email from OpenID as an authenticator, you have to also
>> know that the user authorized the token -- and therefore the client
>> -- to access Postgres on their behalf.)
> 
> My understanding is that metadata in the tokens is provider
> specific, so server side hook would be the right place to handle
> that. Plus I can envision for some providers it can make sense to
> make a remote call to pull some information.

The server hook is the right place to check the scopes, yes, but I think
the DBA should be able to specify what those scopes are to begin with.
The provider of the extension shouldn't be expected by the architecture
to hardcode those decisions, even if Azure AD chooses to short-circuit
that choice and provide magic instead.
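A sketch of what I mean, with illustrative names only: the validator
receives the token's claims, and the required scopes come from DBA
configuration (say, an HBA option) rather than from the extension
itself.

```python
# Illustrative sketch: the required scopes are DBA-supplied configuration,
# not extension-hardcoded. Scope formatting follows RFC 6749 conventions.

def token_authorizes_connection(token_claims: dict, required_scopes: set) -> bool:
    # RFC 6749 transmits granted scopes as one space-delimited string.
    granted = set(token_claims.get("scope", "").split())
    # Knowing who the token belongs to is not enough; the user must have
    # authorized the client for every scope the DBA configured.
    return required_scopes <= granted
```

The check itself stays in the hook; only the policy input moves into the
DBA's hands.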

On 12/7/22 20:25, Andrey Chudnovsky wrote:
> That being said, the Diagram 2 would look like this with our proposal:
> [snip]
> 
> With the application taking care of all Token acquisition logic. While
> the server-side hook is participating in the pre-authentication reply.
> 
> That is definitely a required scenario for the long term and the
> easiest to implement in the client core.
> And if we can do at least that flow in PG16 it will be a strong
> foundation to provide more support for specific grants in libpq going
> forward.

Agreed.

> Does the diagram above look good to you? We can then start cleaning up
> the patch to get that in first.

I maintain that the hook doesn't need to hand back artifacts to the
client for a second PQconnect call. It can just use those artifacts to
obtain the access token and hand that right back to libpq. (I think any
requirement that clients be rewritten to call PQconnect twice will
probably be a sticking point for adoption of an OAuth patch.)
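Roughly the shape I have in mind, as a sketch with hypothetical names
(the stub class stands in for the backend half of the RFC 7628
exchange):

```python
# Sketch: libpq finishes the whole OAUTHBEARER exchange inside a single
# PQconnect, invoking an application-provided hook mid-exchange. All of
# these names are hypothetical stand-ins, not a real API.

class StubServer:
    def initial_error(self):
        # Per RFC 7628, a server rejecting the first (token-less) client
        # message replies with a JSON status that can carry discovery info.
        return {"status": "invalid_token",
                "openid-configuration":
                    "https://issuer.example.com/.well-known/openid-configuration"}

    def validate(self, auth_header):
        return auth_header.startswith("Bearer ")

def oauthbearer_connect(server, get_token_hook):
    discovery = server.initial_error()          # round trip 1: empty token
    token = get_token_hook(discovery)           # hook acquires the token
    return server.validate("Bearer " + token)   # round trip 2: same connection

ok = oauthbearer_connect(StubServer(), lambda disc: "opaque-or-jwt-token")
```

The application still owns token acquisition; it just happens inside the
one connection attempt instead of between two of them.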

That said, now that your proposal is also compatible with OAUTHBEARER, I
can pony up some code to hopefully prove my point. (I don't know if I'll
be able to do that by the holidays though.)

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> The server hook is the right place to check the scopes, yes, but I think
> the DBA should be able to specify what those scopes are to begin with.
> The provider of the extension shouldn't be expected by the architecture
> to hardcode those decisions, even if Azure AD chooses to short-circuit
> that choice and provide magic instead.

Hardcoding is definitely not expected, but identity-provider-specific
customization, I think, should be allowed.
I can offer a couple of advanced use cases from the world of cloud
deployments which require per-role management:
- Multi-tenant deployments, where the root provider URL differs between
roles, based on which tenant they come from.
- Federation to multiple providers, with solutions like Amazon Cognito
offering a layer of abstraction over several transparently supported
providers.

If your concern is the extension not honoring DBA-configured values:
would server-side logic that prefers the HBA value over the
extension-provided one resolve this concern?
We are definitely biased towards cloud deployment scenarios, where
direct access to .hba files is usually not offered at all.
Let's find the middle ground here.
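To make the precedence concrete, a sketch with illustrative field names,
where the extension only backfills whatever the DBA left unset:

```python
# Sketch of HBA-over-extension precedence; field names are illustrative.

def resolve_preauth_metadata(hba: dict, extension: dict) -> dict:
    # For each field the server advertises (issuer root URL, audience, ...),
    # a DBA-configured HBA value always wins; the extension only fills gaps.
    return {key: hba[key] if hba.get(key) is not None else extension.get(key)
            for key in ("issuer", "audience")}
```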

A separate reason for creating this pre-authentication hook is further
extensibility to support more metadata.
Specifically, when we add support for OAuth flows to libpq, server-side
extensions can help bridge the gap between an identity provider's
implementation and the OAuth/OIDC specs.
For example, that could allow a GitHub extension to provide an OIDC
discovery document.
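For illustration: GitHub publishes no OIDC discovery document of its
own, but a bridge extension could synthesize the minimal fields a driver
needs. The keys below are the standard OIDC/RFC 8414 discovery metadata
names and the URLs are GitHub's documented OAuth endpoints; treat the
exact field set as a sketch.

```json
{
  "issuer": "https://github.com",
  "authorization_endpoint": "https://github.com/login/oauth/authorize",
  "token_endpoint": "https://github.com/login/oauth/access_token",
  "device_authorization_endpoint": "https://github.com/login/device/code"
}
```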

I definitely see identity providers as institutional actors here which
can be given some power through the extension hooks to customize the
behavior within the framework.

> I maintain that the hook doesn't need to hand back artifacts to the
> client for a second PQconnect call. It can just use those artifacts to
> obtain the access token and hand that right back to libpq. (I think any
> requirement that clients be rewritten to call PQconnect twice will
> probably be a sticking point for adoption of an OAuth patch.)

Obtaining a token is an asynchronous process with a human in the loop.
I'm not sure that expecting a hook function to return a token
synchronously is the best option here.
Could that be an optional return value of the hook, for cases when a
token can be obtained synchronously?
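Something like the following sketch of the hook contract (hypothetical
names throughout): the hook may hand back a token immediately, or report
that an interactive flow is still in progress so the driver can resume
later.

```python
# Sketch of an optional-synchronous hook contract; all names hypothetical.

from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class HookStatus(Enum):
    TOKEN_READY = auto()   # token obtained synchronously (cached token,
                           # client secret, or refresh_token grants)
    IN_PROGRESS = auto()   # human in the loop; no token yet

@dataclass
class HookResult:
    status: HookStatus
    token: Optional[str] = None

def client_secret_hook(discovery: dict) -> HookResult:
    # Non-interactive grant: a token is available right away.
    return HookResult(HookStatus.TOKEN_READY, token="token-from-client-secret")

def device_flow_hook(discovery: dict) -> HookResult:
    # Interactive grant: start the device flow and yield control; the
    # application finishes it out of band and retries.
    return HookResult(HookStatus.IN_PROGRESS)
```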

On Thu, Dec 8, 2022 at 4:41 PM Jacob Champion <jchampion@timescale.com> wrote:



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Dec 12, 2022 at 9:06 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> If your concern is extension not honoring the DBA configured values:
> Would a server-side logic to prefer HBA value over extension-provided
> resolve this concern?

Yeah. It also seals the role of the extension here as "optional".

> We are definitely biased towards the cloud deployment scenarios, where
> direct access to .hba files is usually not offered at all.
> Let's find the middle ground here.

Sure. I don't want to make this difficult in cloud scenarios --
obviously I'd like for Timescale Cloud to be able to make use of this
too. But if we make this easy for a lone DBA (who doesn't have any
institutional power with the providers) to use correctly and securely,
then it should follow that the providers who _do_ have power and
resources will have an easy time of it as well. The reverse isn't
necessarily true. So I'm definitely planning to focus on the DBA case
first.

> A separate reason for creating this pre-authentication hook is further
> extensibility to support more metadata.
> Specifically when we add support for OAUTH flows to libpq, server-side
> extensions can help bridge the gap between the identity provider
> implementation and OAUTH/OIDC specs.
> For example, that could allow the Github extension to provide an OIDC
> discovery document.
>
> I definitely see identity providers as institutional actors here which
> can be given some power through the extension hooks to customize the
> behavior within the framework.

We'll probably have to make some compromises in this area, but I think
they should be carefully considered exceptions and not a core feature
of the mechanism. The gaps you point out are just fragmentation, and
adding custom extensions to deal with it leads to further
fragmentation instead of providing pressure on providers to just
implement the specs. Worst case, we open up new exciting security
flaws, and then no one can analyze them independently because no one
other than the provider knows how the two sides work together anymore.

Don't get me wrong; it would be naive to proceed as if the OAUTHBEARER
spec were perfect, because it's clearly not. But if we need to make
extensions to it, we can participate in IETF discussions and make our
case publicly for review, rather than enshrining MS/GitHub/Google/etc.
versions of the RFC and enabling that proliferation as a Postgres core
feature.

> Obtaining a token is an asynchronous process with a human in the loop.
> Not sure if expecting a hook function to return a token synchronously
> is the best option here.
> Can that be an optional return value of the hook in cases when a token
> can be obtained synchronously?

I don't think the hook is generally going to be able to return a token
synchronously, and I expect the final design to be async-first. As far
as I know, this will need to be solved for the builtin flows as well
(you don't want a synchronous HTTP call to block your PQconnectPoll
architecture), so the hook should be able to make use of whatever
solution we land on for that.

This is hand-wavy, and I don't expect it to be easy to solve. I just
don't think we have to solve it twice.

Have a good end to the year!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi All,

Changes added to Jacob's patch (v2) as per the discussion in the thread.

The changes allow the client to send the OAuth bearer token through the psql connection string.

Example:
psql  -U user@example.com -d 'dbname=postgres oauth_bearer_token=abc'

To configure OAuth, the pg_hba.conf line looks like:
local   all             all                                     oauth   provider=oauth_provider issuer="https://example.com" scope="openid email"

We also added a hook to libpq to pass on the metadata about the issuer.

Thanks,
Mahendrakar.


On Sat, 17 Dec 2022 at 04:48, Jacob Champion <jchampion@timescale.com> wrote:
>
> On Mon, Dec 12, 2022 at 9:06 PM Andrey Chudnovsky
> <achudnovskij@gmail.com> wrote:
> > If your concern is extension not honoring the DBA configured values:
> > Would a server-side logic to prefer HBA value over extension-provided
> > resolve this concern?
>
> Yeah. It also seals the role of the extension here as "optional".
>
> > We are definitely biased towards the cloud deployment scenarios, where
> > direct access to .hba files is usually not offered at all.
> > Let's find the middle ground here.
>
> Sure. I don't want to make this difficult in cloud scenarios --
> obviously I'd like for Timescale Cloud to be able to make use of this
> too. But if we make this easy for a lone DBA (who doesn't have any
> institutional power with the providers) to use correctly and securely,
> then it should follow that the providers who _do_ have power and
> resources will have an easy time of it as well. The reverse isn't
> necessarily true. So I'm definitely planning to focus on the DBA case
> first.
>
> > A separate reason for creating this pre-authentication hook is further
> > extensibility to support more metadata.
> > Specifically when we add support for OAUTH flows to libpq, server-side
> > extensions can help bridge the gap between the identity provider
> > implementation and OAUTH/OIDC specs.
> > For example, that could allow the Github extension to provide an OIDC
> > discovery document.
> >
> > I definitely see identity providers as institutional actors here which
> > can be given some power through the extension hooks to customize the
> > behavior within the framework.
>
> We'll probably have to make some compromises in this area, but I think
> they should be carefully considered exceptions and not a core feature
> of the mechanism. The gaps you point out are just fragmentation, and
> adding custom extensions to deal with it leads to further
> fragmentation instead of providing pressure on providers to just
> implement the specs. Worst case, we open up new exciting security
> flaws, and then no one can analyze them independently because no one
> other than the provider knows how the two sides work together anymore.
>
> Don't get me wrong; it would be naive to proceed as if the OAUTHBEARER
> spec were perfect, because it's clearly not. But if we need to make
> extensions to it, we can participate in IETF discussions and make our
> case publicly for review, rather than enshrining MS/GitHub/Google/etc.
> versions of the RFC and enabling that proliferation as a Postgres core
> feature.
>
> > Obtaining a token is an asynchronous process with a human in the loop.
> > Not sure if expecting a hook function to return a token synchronously
> > is the best option here.
> > Can that be an optional return value of the hook in cases when a token
> > can be obtained synchronously?
>
> I don't think the hook is generally going to be able to return a token
> synchronously, and I expect the final design to be async-first. As far
> as I know, this will need to be solved for the builtin flows as well
> (you don't want a synchronous HTTP call to block your PQconnectPoll
> architecture), so the hook should be able to make use of whatever
> solution we land on for that.
>
> This is hand-wavy, and I don't expect it to be easy to solve. I just
> don't think we have to solve it twice.
>
> Have a good end to the year!
> --Jacob
Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
More information on the latest patch.

1. We aligned the implementation with the bare-bones SASL mechanism for
OAuth described here - https://www.rfc-editor.org/rfc/rfc7628
The flow can be explained by the diagram below:

  +----------------------+                                 +----------+
  |             +-------+                                  | Postgres |
  | PQconnect ->|       |                                  |          |
  |             |       |                                  |   +-----------+
  |             |       | ---------- Empty Token---------> | > |           |
  |             | libpq | <-- Error(Discovery + Scope ) -- | < | Pre-Auth  |
  |          +------+   |                                  |   |  Hook     |
  |     +- < | Hook |   |                                  |   +-----------+
  |     |    +------+   |                                  |          |
  |     v       |       |                                  |          |
  |  [get token]|       |                                  |          |
  |     |       |       |                                  |          |
  |     +       |       |                                  |   +-----------+
  | PQconnect > |       | --------- Access Token --------> | > | Validator |
  |             |       | <---------- Auth Result -------- | < |   Hook    |
  |             |       |                                  |   +-----------+
  |             +-------+                                  |          |
  +----------------------+                                 +----------+

2. Removed Device Code implementation in libpq. Several reasons:
   - Reduce scope and focus on the protocol first.
   - Device code implementation uses the iddawc dependency. Taking this
dependency is a controversial step which requires broader discussion.
   - Device code implementation without iddawc would significantly
increase the scope of the patch, as libpq needs to poll the token
endpoint, set up different API calls, etc.
   - That flow should canonically only be used by clients which can't
invoke a browser. If it were the only flow implemented, it could end up
being used in contexts where the OAuth protocol doesn't expect it.

3. Temporarily removed test suite. We are actively working on aligning
the tests with the latest changes. Will add a patch with tests soon.

We will renumber the "V3" prefix to be the next version after the
previous iterations.

Thanks!
Andrey.

On Thu, Jan 12, 2023 at 11:08 AM mahendrakar s
<mahendrakarforpg@gmail.com> wrote:
>
> Hi All,
>
> Changes added to Jacob's patch(v2) as per the discussion in the thread.
>
> The changes allow the customer to send the OAUTH BEARER token through psql connection string.
>
> Example:
> psql  -U user@example.com -d 'dbname=postgres oauth_bearer_token=abc'
>
> To configure OAUTH, the pg_hba.conf line look like:
> local   all             all                                     oauth   provider=oauth_provider issuer="https://example.com" scope="openid email"
>
> We also added hook to libpq to pass on the metadata about the issuer.
>
> Thanks,
> Mahendrakar.
>
>
> On Sat, 17 Dec 2022 at 04:48, Jacob Champion <jchampion@timescale.com> wrote:
> >
> > On Mon, Dec 12, 2022 at 9:06 PM Andrey Chudnovsky
> > <achudnovskij@gmail.com> wrote:
> > > If your concern is extension not honoring the DBA configured values:
> > > Would a server-side logic to prefer HBA value over extension-provided
> > > resolve this concern?
> >
> > Yeah. It also seals the role of the extension here as "optional".
> >
> > > We are definitely biased towards the cloud deployment scenarios, where
> > > direct access to .hba files is usually not offered at all.
> > > Let's find the middle ground here.
> >
> > Sure. I don't want to make this difficult in cloud scenarios --
> > obviously I'd like for Timescale Cloud to be able to make use of this
> > too. But if we make this easy for a lone DBA (who doesn't have any
> > institutional power with the providers) to use correctly and securely,
> > then it should follow that the providers who _do_ have power and
> > resources will have an easy time of it as well. The reverse isn't
> > necessarily true. So I'm definitely planning to focus on the DBA case
> > first.
> >
> > > A separate reason for creating this pre-authentication hook is further
> > > extensibility to support more metadata.
> > > Specifically when we add support for OAUTH flows to libpq, server-side
> > > extensions can help bridge the gap between the identity provider
> > > implementation and OAUTH/OIDC specs.
> > > For example, that could allow the Github extension to provide an OIDC
> > > discovery document.
> > >
> > > I definitely see identity providers as institutional actors here which
> > > can be given some power through the extension hooks to customize the
> > > behavior within the framework.
> >
> > We'll probably have to make some compromises in this area, but I think
> > they should be carefully considered exceptions and not a core feature
> > of the mechanism. The gaps you point out are just fragmentation, and
> > adding custom extensions to deal with it leads to further
> > fragmentation instead of providing pressure on providers to just
> > implement the specs. Worst case, we open up new exciting security
> > flaws, and then no one can analyze them independently because no one
> > other than the provider knows how the two sides work together anymore.
> >
> > Don't get me wrong; it would be naive to proceed as if the OAUTHBEARER
> > spec were perfect, because it's clearly not. But if we need to make
> > extensions to it, we can participate in IETF discussions and make our
> > case publicly for review, rather than enshrining MS/GitHub/Google/etc.
> > versions of the RFC and enabling that proliferation as a Postgres core
> > feature.
> >
> > > Obtaining a token is an asynchronous process with a human in the loop.
> > > Not sure if expecting a hook function to return a token synchronously
> > > is the best option here.
> > > Can that be an optional return value of the hook in cases when a token
> > > can be obtained synchronously?
> >
> > I don't think the hook is generally going to be able to return a token
> > synchronously, and I expect the final design to be async-first. As far
> > as I know, this will need to be solved for the builtin flows as well
> > (you don't want a synchronous HTTP call to block your PQconnectPoll
> > architecture), so the hook should be able to make use of whatever
> > solution we land on for that.
> >
> > This is hand-wavy, and I don't expect it to be easy to solve. I just
> > don't think we have to solve it twice.
> >
> > Have a good end to the year!
> > --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Sun, Jan 15, 2023 at 12:03 PM Andrey Chudnovsky
<achudnovskij@gmail.com> wrote:
> 2. Removed Device Code implementation in libpq. Several reasons:
>    - Reduce scope and focus on the protocol first.
>    - Device code implementation uses iddawc dependency. Taking this
> dependency is a controversial step which requires broader discussion.
>    - Device code implementation without iddaws would significantly
> increase the scope of the patch, as libpq needs to poll the token
> endpoint, setup different API calls, e.t.c.
>    - That flow should canonically only be used for clients which can't
> invoke browsers. If it is the only flow to be implemented, it can be
> used in the context when it's not expected by the OAUTH protocol.

I'm not understanding the concern in the final point -- providers
generally require you to opt into device authorization, at least as far
as I can tell. So if you decide that it's not appropriate for your use
case... don't enable it. (And I haven't seen any claims that opting into
device authorization weakens the other flows in any way. So if we're
going to implement a flow in libpq, I still think device authorization
is the best choice, since it works on headless machines as well as those
with browsers.)

All of this points at a bigger question to the community: if we choose
not to provide a flow implementation in libpq, is adding OAUTHBEARER
worth the additional maintenance cost?

My personal vote would be "no". I think the hook-only approach proposed
here would ensure that only larger providers would implement it in
practice, and in that case I'd rather spend cycles on generic SASL.

> 3. Temporarily removed test suite. We are actively working on aligning
> the tests with the latest changes. Will add a patch with tests soon.

Okay. Case in point, the following change to the patch appears to be
invalid JSON:

> +   appendStringInfo(&buf,
> +       "{ "
> +           "\"status\": \"invalid_token\", "
> +           "\"openid-configuration\": \"%s\","
> +           "\"scope\": \"%s\" ",
> +           "\"issuer\": \"%s\" ",
> +       "}",

Additionally, the "issuer" field added here is not part of the RFC. I've
written my thoughts about unofficial extensions upthread but haven't
received a response, so I'm going to start being more strident: Please,
for the sake of reviewers, call out changes you've made to the spec, and
why they're justified.

The patches seem to be out of order now (and the documentation in the
commit messages has been removed).

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> All of this points at a bigger question to the community: if we choose
> not to provide a flow implementation in libpq, is adding OAUTHBEARER
> worth the additional maintenance cost?

> My personal vote would be "no". I think the hook-only approach proposed
> here would ensure that only larger providers would implement it in
> practice

Flow implementations in libpq are definitely a long term plan, and I
agree that it would democratise the adoption.
In the previous posts in this conversation I outlined the ones I think
we should support.

However, I don't see why it's strictly necessary to couple those.
As long as the SASL exchange for OAUTHBEARER mechanism is supported by
the protocol, the Client side can evolve at its own pace.

At the same time, the current implementation allows clients to start
building provider-agnostic OAuth support, using iddawc or the OAuth
client implementations on their respective platforms.
So I wouldn't refer to "larger providers", but rather "more motivated
clients" here. Which definitely overlaps, but keeps the system open.

> I'm not understanding the concern in the final point -- providers
> generally require you to opt into device authorization, at least as far
> as I can tell. So if you decide that it's not appropriate for your use
> case... don't enable it. (And I haven't seen any claims that opting into
> device authorization weakens the other flows in any way. So if we're
> going to implement a flow in libpq, I still think device authorization
> is the best choice, since it works on headless machines as well as those
> with browsers.)

I agree that device code is the best first choice if we absolutely
have to pick one. Though I don't think we have to.

While device flow can be used for all kinds of user-facing
applications, it's specifically designed for input-constrained
scenarios. As clearly stated in the Abstract here -
https://www.rfc-editor.org/rfc/rfc8628
The authorization code with PKCE flow is recommended by the RFCs and
major providers for cases when it's feasible.
The long term goal is to provide both, though I don't see why the
backbone protocol implementation first wouldn't add value.

Another point: user authentication is one side of the whole story, and
the other critical one is system-to-system authentication, where we
have Client Credentials and Certificates.
With the latter it is much harder to get a generic implementation, as
provider-specific tokens need to be signed.

Adding the other reasoning, I think libpq support for specific flows
can get in the further iterations, after the protocol support.

> in that case I'd rather spend cycles on generic SASL.

I see two approaches to generic SASL:
(a). Generic SASL is a framework used in the protocol, with the
mechanisms implemented on top and exposed to the DBAs as auth types to
configure in hba.
This is the direction we're going here, which is well aligned with the
existing hba-based auth configuration.
(b). Generic SASL exposed to developers on the server and client
sides to extend. It seems to be a much longer shot.
The specific points of large ambiguity are libpq distribution model
(which you pointed to) and potential pluggability of insecure
mechanisms.

I do see (a) as a sweet spot with a lot of value for various
participants with much less ambiguity.

> Additionally, the "issuer" field added here is not part of the RFC. I've
> written my thoughts about unofficial extensions upthread but haven't
> received a response, so I'm going to start being more strident: Please,
> for the sake of reviewers, call out changes you've made to the spec, and
> why they're justified.

Thanks for your feedback on this. We had this discussion as well, and
added the field as a convenience for the client to identify the provider.
I don't see a reason why an issuer would be absolutely necessary, and we
take your point that sticking to the RFCs is a safer choice.

> The patches seem to be out of order now (and the documentation in the
> commit messages has been removed).

Feedback taken. Work in progress.

On Tue, Jan 17, 2023 at 2:44 PM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Sun, Jan 15, 2023 at 12:03 PM Andrey Chudnovsky
> <achudnovskij@gmail.com> wrote:
> > 2. Removed Device Code implementation in libpq. Several reasons:
> >    - Reduce scope and focus on the protocol first.
> >    - Device code implementation uses iddawc dependency. Taking this
> > dependency is a controversial step which requires broader discussion.
> >    - Device code implementation without iddaws would significantly
> > increase the scope of the patch, as libpq needs to poll the token
> > endpoint, setup different API calls, e.t.c.
> >    - That flow should canonically only be used for clients which can't
> > invoke browsers. If it is the only flow to be implemented, it can be
> > used in the context when it's not expected by the OAUTH protocol.
>
> I'm not understanding the concern in the final point -- providers
> generally require you to opt into device authorization, at least as far
> as I can tell. So if you decide that it's not appropriate for your use
> case... don't enable it. (And I haven't seen any claims that opting into
> device authorization weakens the other flows in any way. So if we're
> going to implement a flow in libpq, I still think device authorization
> is the best choice, since it works on headless machines as well as those
> with browsers.)
>
> All of this points at a bigger question to the community: if we choose
> not to provide a flow implementation in libpq, is adding OAUTHBEARER
> worth the additional maintenance cost?
>
> My personal vote would be "no". I think the hook-only approach proposed
> here would ensure that only larger providers would implement it in
> practice, and in that case I'd rather spend cycles on generic SASL.
>
> > 3. Temporarily removed test suite. We are actively working on aligning
> > the tests with the latest changes. Will add a patch with tests soon.
>
> Okay. Case in point, the following change to the patch appears to be
> invalid JSON:
>
> > +   appendStringInfo(&buf,
> > +       "{ "
> > +           "\"status\": \"invalid_token\", "
> > +           "\"openid-configuration\": \"%s\","
> > +           "\"scope\": \"%s\" ",
> > +           "\"issuer\": \"%s\" ",
> > +       "}",
>
> Additionally, the "issuer" field added here is not part of the RFC. I've
> written my thoughts about unofficial extensions upthread but haven't
> received a response, so I'm going to start being more strident: Please,
> for the sake of reviewers, call out changes you've made to the spec, and
> why they're justified.
>
> The patches seem to be out of order now (and the documentation in the
> commit messages has been removed).
>
> Thanks,
> --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
mahendrakar s
Date:
Hi All,

The "issuer" field has been removed to align with the RFC
implementation - https://www.rfc-editor.org/rfc/rfc7628.
This patch "v6" is a single patch to support the OAuth bearer token
through the psql connection string.
The flow below is supported. Added the documentation in the commit messages.

  +----------------------+                                 +----------+
  |             +-------+                                  | Postgres |
  | PQconnect ->|       |                                  |          |
  |             |       |                                  |   +-----------+
  |             |       | ---------- Empty Token---------> | > |           |
  |             | libpq | <-- Error(Discovery + Scope ) -- | < | Pre-Auth  |
  |          +------+   |                                  |   |  Hook     |
  |     +- < | Hook |   |                                  |   +-----------+
  |     |    +------+   |                                  |          |
  |     v       |       |                                  |          |
  |  [get token]|       |                                  |          |
  |     |       |       |                                  |          |
  |     +       |       |                                  |   +-----------+
  | PQconnect > |       | --------- Access Token --------> | > | Validator |
  |             |       | <---------- Auth Result -------- | < |   Hook    |
  |             |       |                                  |   +-----------+
  |             +-------+                                  |          |
  +----------------------+                                 +----------+

Please note that we are working on modifying/adding new tests (from
Jacob's Patch) with the latest changes. Will add a patch with tests
soon.

Thanks,
Mahendrakar.

On Wed, 18 Jan 2023 at 07:24, Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
>
> > All of this points at a bigger question to the community: if we choose
> > not to provide a flow implementation in libpq, is adding OAUTHBEARER
> > worth the additional maintenance cost?
>
> > My personal vote would be "no". I think the hook-only approach proposed
> > here would ensure that only larger providers would implement it in
> > practice
>
> Flow implementations in libpq are definitely a long term plan, and I
> agree that it would democratise the adoption.
> In the previous posts in this conversation I outlined the ones I think
> we should support.
>
> However, I don't see why it's strictly necessary to couple those.
> As long as the SASL exchange for OAUTHBEARER mechanism is supported by
> the protocol, the Client side can evolve at its own pace.
>
> At the same time, the current implementation allows clients to start
> building provider-agnostic OAUTH support. By using iddawc or OAUTH
> client implementations in the respective platforms.
> So I wouldn't refer to "larger providers", but rather "more motivated
> clients" here. Which definitely overlaps, but keeps the system open.
>
> > I'm not understanding the concern in the final point -- providers
> > generally require you to opt into device authorization, at least as far
> > as I can tell. So if you decide that it's not appropriate for your use
> > case... don't enable it. (And I haven't seen any claims that opting into
> > device authorization weakens the other flows in any way. So if we're
> > going to implement a flow in libpq, I still think device authorization
> > is the best choice, since it works on headless machines as well as those
> > with browsers.)
> I agree with the statement that Device code is the best first choice
> if we absolutely have to pick one.
> Though I don't think we have to.
>
> While device flow can be used for all kinds of user-facing
> applications, it's specifically designed for input-constrained
> scenarios. As clearly stated in the Abstract here -
> https://www.rfc-editor.org/rfc/rfc8628
> The authorization code with pkce flow is recommended by the RFSc and
> major providers for cases when it's feasible.
> The long term goal is to provide both, though I don't see why the
> backbone protocol implementation first wouldn't add value.
>
> Another point is user authentication is one side of the whole story
> and the other critical one is system-to-system authentication. Where
> we have Client Credentials and Certificates.
> With the latter it is much harder to get generically implemented, as
> provider-specific tokens need to be signed.
>
> Adding the other reasoning, I think libpq support for specific flows
> can get in the further iterations, after the protocol support.
>
> > in that case I'd rather spend cycles on generic SASL.
> I see 2 approaches to generic SASL:
> (a). Generic SASL is a framework used in the protocol, with the
> mechanisms implemented on top and exposed to the DBAs as auth types to
> configure in hba.
> This is the direction we're going here, which is well aligned with the
> existing hba-based auth configuration.
> (b). Generic SASL exposed to developers on the server- and client-
> side to extend on. It seems to be a much longer shot.
> The specific points of large ambiguity are libpq distribution model
> (which you pointed to) and potential pluggability of insecure
> mechanisms.
>
> I do see (a) as a sweet spot with a lot of value for various
> participants with much less ambiguity.
>
> > Additionally, the "issuer" field added here is not part of the RFC. I've
> > written my thoughts about unofficial extensions upthread but haven't
> > received a response, so I'm going to start being more strident: Please,
> > for the sake of reviewers, call out changes you've made to the spec, and
> > why they're justified.
> Thanks for your feedback on this. We had this discussion as well, and
> added that as a convenience for the client to identify the provider.
> I don't see a reason why an issuer would be absolutely necessary, so
> we will get your point that sticking to RFCs is a safer choice.
>
> > The patches seem to be out of order now (and the documentation in the
> > commit messages has been removed).
> Feedback taken. Work in progress.
>
> On Tue, Jan 17, 2023 at 2:44 PM Jacob Champion <jchampion@timescale.com> wrote:
> >
> > On Sun, Jan 15, 2023 at 12:03 PM Andrey Chudnovsky
> > <achudnovskij@gmail.com> wrote:
> > > 2. Removed Device Code implementation in libpq. Several reasons:
> > >    - Reduce scope and focus on the protocol first.
> > >    - Device code implementation uses iddawc dependency. Taking this
> > > dependency is a controversial step which requires broader discussion.
> > >    - Device code implementation without iddaws would significantly
> > > increase the scope of the patch, as libpq needs to poll the token
> > > endpoint, setup different API calls, e.t.c.
> > >    - That flow should canonically only be used for clients which can't
> > > invoke browsers. If it is the only flow to be implemented, it can be
> > > used in the context when it's not expected by the OAUTH protocol.
> >
> > I'm not understanding the concern in the final point -- providers
> > generally require you to opt into device authorization, at least as far
> > as I can tell. So if you decide that it's not appropriate for your use
> > case... don't enable it. (And I haven't seen any claims that opting into
> > device authorization weakens the other flows in any way. So if we're
> > going to implement a flow in libpq, I still think device authorization
> > is the best choice, since it works on headless machines as well as those
> > with browsers.)
> >
> > All of this points at a bigger question to the community: if we choose
> > not to provide a flow implementation in libpq, is adding OAUTHBEARER
> > worth the additional maintenance cost?
> >
> > My personal vote would be "no". I think the hook-only approach proposed
> > here would ensure that only larger providers would implement it in
> > practice, and in that case I'd rather spend cycles on generic SASL.
> >
> > > 3. Temporarily removed test suite. We are actively working on aligning
> > > the tests with the latest changes. Will add a patch with tests soon.
> >
> > Okay. Case in point, the following change to the patch appears to be
> > invalid JSON:
> >
> > > +   appendStringInfo(&buf,
> > > +       "{ "
> > > +           "\"status\": \"invalid_token\", "
> > > +           "\"openid-configuration\": \"%s\","
> > > +           "\"scope\": \"%s\" ",
> > > +           "\"issuer\": \"%s\" ",
> > > +       "}",
> >
> > Additionally, the "issuer" field added here is not part of the RFC. I've
> > written my thoughts about unofficial extensions upthread but haven't
> > received a response, so I'm going to start being more strident: Please,
> > for the sake of reviewers, call out changes you've made to the spec, and
> > why they're justified.
> >
> > The patches seem to be out of order now (and the documentation in the
> > commit messages has been removed).
> >
> > Thanks,
> > --Jacob
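(As an aside for reviewers: the RFC 7628 error body that the quoted snippet appears to be aiming for carries just those three fields, and serializing it rather than hand-assembling the string avoids the misplaced commas. A sketch with placeholder values -- the URL and scope below are illustrative, not from the patch:)

```python
import json

# Build the RFC 7628 discovery-error body with a serializer instead of
# string concatenation; the result is guaranteed to be well-formed JSON.
body = json.dumps({
    "status": "invalid_token",
    "openid-configuration":
        "https://issuer.example/.well-known/openid-configuration",
    "scope": "openid",
})

parsed = json.loads(body)  # round-trips, so the output is valid JSON
```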

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Stephen Frost
Date:
Greetings,

* mahendrakar s (mahendrakarforpg@gmail.com) wrote:
> The "issuer" field has been removed to align with the RFC
> implementation - https://www.rfc-editor.org/rfc/rfc7628.
> This "v6" patch is a single patch supporting the OAUTHBEARER token
> through the psql connection string.
> The flow below is supported; documentation has been added to the
> commit messages.
>
>   +----------------------+                                 +----------+
>   |             +-------+                                  | Postgres |
>   | PQconnect ->|       |                                  |          |
>   |             |       |                                  |   +-----------+
>   |             |       | ---------- Empty Token---------> | > |           |
>   |             | libpq | <-- Error(Discovery + Scope ) -- | < | Pre-Auth  |
>   |          +------+   |                                  |   |  Hook     |
>   |     +- < | Hook |   |                                  |   +-----------+
>   |     |    +------+   |                                  |          |
>   |     v       |       |                                  |          |
>   |  [get token]|       |                                  |          |
>   |     |       |       |                                  |          |
>   |     +       |       |                                  |   +-----------+
>   | PQconnect > |       | --------- Access Token --------> | > | Validator |
>   |             |       | <---------- Auth Result -------- | < |   Hook    |
>   |             |       |                                  |   +-----------+
>   |             +-------+                                  |          |
>   +----------------------+                                 +----------+
>
> Please note that we are working on modifying/adding new tests (from
> Jacob's Patch) with the latest changes. Will add a patch with tests
> soon.

Having skimmed back through this thread again, I still feel that the
direction that was originally being taken (actually support something in
libpq and the backend, be it with libiddawc or something else or even
our own code, and not just throw hooks in various places) makes a lot
more sense and is a lot closer to how Kerberos and client-side certs and
even LDAP auth work today.  That also seems like a much better answer
for our users when it comes to new authentication methods than having
extensions and making libpq developers have to write their own custom
code, not to mention that we'd still need to implement something in psql
to provide such a hook if we are to have psql actually usefully exercise
this, no?

In the Kerberos test suite we have today, we actually bring up a proper
Kerberos server, set things up, and then test end-to-end installing a
keytab for the server, getting a TGT, getting a service ticket, testing
authentication and encryption, etc.  Looking around, it seems like the
equivalent would perhaps be to use Glewlwyd and libiddawc or libcurl and
our own code to really be able to test this and show that it works and
that we're doing it correctly, and to let us know if we break something.

Thanks,

Stephen

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Feb 20, 2023 at 2:35 PM Stephen Frost <sfrost@snowman.net> wrote:
> Having skimmed back through this thread again, I still feel that the
> direction that was originally being taken (actually support something in
> libpq and the backend, be it with libiddawc or something else or even
> our own code, and not just throw hooks in various places) makes a lot
> more sense and is a lot closer to how Kerberos and client-side certs and
> even LDAP auth work today.

Cool, that helps focus the effort. Thanks!

> That also seems like a much better answer
> for our users when it comes to new authentication methods than having
> extensions and making libpq developers have to write their own custom
> code, not to mention that we'd still need to implement something in psql
> to provide such a hook if we are to have psql actually usefully exercise
> this, no?

I don't mind letting clients implement their own flows... as long as
it's optional. So even if we did use a hook in the end, I agree that
we've got to exercise it ourselves.

> In the Kerberos test suite we have today, we actually bring up a proper
> Kerberos server, set things up, and then test end-to-end installing a
> keytab for the server, getting a TGT, getting a service ticket, testing
> authentication and encryption, etc.  Looking around, it seems like the
> equivalent would perhaps be to use Glewlwyd and libiddawc or libcurl and
> our own code to really be able to test this and show that it works and
> that we're doing it correctly, and to let us know if we break something.

The original patchset includes a test server in Python -- a major
advantage being that you can test the client and server independently
of each other, since the implementation is so asymmetric. Additionally
testing against something like Glewlwyd would be a great way to stack
coverage. (If we *only* test against a packaged server, though, it'll
be harder to test our stuff in the presence of malfunctions and other
corner cases.)

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Thanks for the feedback,

> Having skimmed back through this thread again, I still feel that the
> direction that was originally being taken (actually support something in
> libpq and the backend, be it with libiddawc or something else or even
> our own code, and not just throw hooks in various places) makes a lot
> more sense and is a lot closer to how Kerberos and client-side certs and
> even LDAP auth work today.  That also seems like a much better answer
> for our users when it comes to new authentication methods than having
> extensions and making libpq developers have to write their own custom
> code, not to mention that we'd still need to implement something in psql
> to provide such a hook if we are to have psql actually usefully exercise
> this, no?

A libpq implementation is the long-term plan. However, our intention is
to start with the protocol implementation, which gives us something to
build on top of.

While the device code flow is the right solution for psql, having it as
the only flow creates an incentive to use it in cases where it's not
intended.
A reasonably good implementation should support all of the following:
(1.) authorization code with PKCE (for GUI applications)
(2.) device code (for console user logins)
(3.) client secret
(4.) some support for the client certificate flow

(1.) and (4.) require more work to implement, though they are necessary
to encourage the most secure grant types.
As we don't yet have those pieces, we're proposing to start with the
protocol, which the ecosystem can use to build token flow
implementations, and then to add libpq support for individual grant
types.

We originally looked at starting with a bare-bones protocol
implementation for PG16 and adding libpq support in PG17.
That plan won't happen, but splitting the work into separate stages
would still make more sense in my opinion.

Several questions to follow up:
(a.) Would you support committing the protocol first, or do you see a
libpq implementation of the grant flows as a prerequisite for
considering the auth type?
(b.) As of today, the server core does not validate that the token is
actually a valid JWT; instead it relies on extensions to do the
validation.
Do you think the server core should do basic validation before passing
the token to extensions, to prevent the auth type from being used for
anything other than OAuth flows?
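For what it's worth, the basic validation described in (b.) could be as cheap as a structural check. A sketch in Python (illustrative only: OAUTHBEARER tokens aren't required to be JWTs, and real signature verification would stay with the validator extension):

```python
import base64
import json

def looks_like_jwt_from(token: str, expected_issuer: str) -> bool:
    """Cheap structural check: three base64url segments, a JOSE header,
    and an 'iss' claim matching the expected issuer. This does NOT
    verify the signature; that remains the validator's job."""
    parts = token.split(".")
    if len(parts) != 3:
        return False

    def decode(seg: str) -> bytes:
        # Re-add the padding that JWTs strip off.
        return base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4))

    try:
        header = json.loads(decode(parts[0]))
        claims = json.loads(decode(parts[1]))
    except (ValueError, TypeError):
        return False
    return "alg" in header and claims.get("iss") == expected_issuer
```

Anything beyond this shape-and-issuer check would still need the provider-specific validator.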

Tests are planned for the commit-ready implementation.

Thanks!
Andrey.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Stephen Frost
Date:
Greetings,

* Jacob Champion (jchampion@timescale.com) wrote:
> On Mon, Feb 20, 2023 at 2:35 PM Stephen Frost <sfrost@snowman.net> wrote:
> > Having skimmed back through this thread again, I still feel that the
> > direction that was originally being taken (actually support something in
> > libpq and the backend, be it with libiddawc or something else or even
> > our own code, and not just throw hooks in various places) makes a lot
> > more sense and is a lot closer to how Kerberos and client-side certs and
> > even LDAP auth work today.
>
> Cool, that helps focus the effort. Thanks!

Great, glad to hear that.

> > That also seems like a much better answer
> > for our users when it comes to new authentication methods than having
> > extensions and making libpq developers have to write their own custom
> > code, not to mention that we'd still need to implement something in psql
> > to provide such a hook if we are to have psql actually usefully exercise
> > this, no?
>
> I don't mind letting clients implement their own flows... as long as
> it's optional. So even if we did use a hook in the end, I agree that
> we've got to exercise it ourselves.

This really doesn't feel like a great area to try and do hooks or
similar in, not the least because that approach has been tried and tried
again (PAM, GSSAPI, SASL would all be examples..) and frankly none of
them has turned out great (which is why we can't just tell people "well,
install the pam_oauth2 and watch everything work!") and this strikes me
as trying to do that yet again but worse as it's not even a dedicated
project trying to solve the problem but more like a side project.  SCRAM
was good, we've come a long way thanks to that, this feels like it
should be more in line with that rather than trying to invent yet
another new "generic" set of hooks/APIs that will just cause DBAs and
our users headaches trying to make work.

> > In the Kerberos test suite we have today, we actually bring up a proper
> > Kerberos server, set things up, and then test end-to-end installing a
> > keytab for the server, getting a TGT, getting a service ticket, testing
> > authentication and encryption, etc.  Looking around, it seems like the
> > equivalent would perhaps be to use Glewlwyd and libiddawc or libcurl and
> > our own code to really be able to test this and show that it works and
> > that we're doing it correctly, and to let us know if we break something.
>
> The original patchset includes a test server in Python -- a major
> advantage being that you can test the client and server independently
> of each other, since the implementation is so asymmetric. Additionally
> testing against something like Glewlwyd would be a great way to stack
> coverage. (If we *only* test against a packaged server, though, it'll
> be harder to test our stuff in the presence of malfunctions and other
> corner cases.)

Oh, that's even better -- I agree entirely that having test code that can
be instructed to return specific errors so that we can test that our
code responds properly is great (and is why pgbackrest has things like
a stub'd out libpq, fake s3, GCS, and Azure servers, and more) and would
certainly want to keep that, even if we also build out a test that uses
a real server to provide integration testing with not-our-code too.

Thanks!

Stephen

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> This really doesn't feel like a great area to try and do hooks or
> similar in, not the least because that approach has been tried and tried
> again (PAM, GSSAPI, SASL would all be examples..) and frankly none of
> them has turned out great (which is why we can't just tell people "well,
> install the pam_oauth2 and watch everything work!") and this strikes me
> as trying to do that yet again but worse as it's not even a dedicated
> project trying to solve the problem but more like a side project.

In this case it's not intended to be an open-ended hook, but rather an
implementation of a specific RFC (RFC 7628), which defines the
client-server communication for the authentication flow.
The RFC itself leaves a lot of flexibility in specific parts of the
implementation, which do require hooks:
(1.) A server-side hook to validate the token, which is specific to the
OAuth provider.
(2.) A client-side hook to request that the client obtain the token.

On (1.), we would need a hook for the OAuth provider extension to do
validation. We could, though, do a basic check that the credential is
indeed a JWT signed by the requested issuer.

(2.) is specifically where we can provide a layer in libpq to simplify
the integration, i.e. implement some OAuth flows.
Though we would need some flexibility for clients to bring their own
token: for example, there are cases where the credential used to obtain
the token is stored in a separate secure location, and the token is
returned from a separate service or pushed from a more secure
environment.

> another new "generic" set of hooks/APIs that will just cause DBAs and
> our users headaches trying to make work.
As I mentioned above, it's an RFC implementation rather than our invention.
As for DBAs and users: built-in libpq implementations that allow psql
and pgAdmin to connect seamlessly should cover those needs, while
extensibility would keep the ecosystem open for OAuth providers, SaaS
developers, PaaS providers, and other institutional players.

Thanks!
Andrey.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Stephen Frost
Date:
Greetings,

* Andrey Chudnovsky (achudnovskij@gmail.com) wrote:
> > This really doesn't feel like a great area to try and do hooks or
> > similar in, not the least because that approach has been tried and tried
> > again (PAM, GSSAPI, SASL would all be examples..) and frankly none of
> > them has turned out great (which is why we can't just tell people "well,
> > install the pam_oauth2 and watch everything work!") and this strikes me
> > as trying to do that yet again but worse as it's not even a dedicated
> > project trying to solve the problem but more like a side project.
>
> In this case it's not intended to be an open-ended hook, but rather an
> implementation of a specific RFC (RFC 7628), which defines the
> client-server communication for the authentication flow.
> The RFC itself leaves a lot of flexibility in specific parts of
> the implementation, which do require hooks:

Color me skeptical on an RFC that requires hooks.

> (1.) A server-side hook to validate the token, which is specific to the
> OAuth provider.
> (2.) A client-side hook to request that the client obtain the token.

Perhaps I'm missing it... but weren't these handled with what the
original patch that Jacob had was doing?

> On (1.), we would need a hook for the OAuth provider extension to do
> validation. We could, though, do a basic check that the credential is
> indeed a JWT signed by the requested issuer.
>
> (2.) is specifically where we can provide a layer in libpq to simplify
> the integration, i.e. implement some OAuth flows.
> Though we would need some flexibility for clients to bring their own
> token: for example, there are cases where the credential used to obtain
> the token is stored in a separate secure location, and the token is
> returned from a separate service or pushed from a more secure
> environment.

In those cases... we could, if we wanted, simply implement the code to
actually pull the token, no?  We don't *have* to have a hook here for
this, we could just make it work.

> > another new "generic" set of hooks/APIs that will just cause DBAs and
> > our users headaches trying to make work.
> As I mentioned above, it's an RFC implementation rather than our invention.

While I only took a quick look, I didn't see anything in that RFC that
explicitly says that hooks or a plugin or a library or such is required
to meet the RFC.  Sure, there are places which say that the
implementation is specific to a particular server or client but that's
not the same thing.

> As for DBAs and users: built-in libpq implementations that allow psql
> and pgAdmin to connect seamlessly should cover those needs, while
> extensibility would keep the ecosystem open for OAuth providers, SaaS
> developers, PaaS providers, and other institutional players.

Each to end up writing their own code to do largely the same thing
without the benefit of the larger community to be able to review and
ensure that it's done properly?

That doesn't sound like a great approach to me.

Thanks,

Stephen

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Sep 23, 2022 at 3:39 PM Jacob Champion <jchampion@timescale.com> wrote:
> Here's a newly rebased v5. (They're all zipped now, which I probably
> should have done a while back, sorry.)

To keep this current, v7 is rebased over latest, without the pluggable
authentication patches. This doesn't yet address the architectural
feedback that was discussed previously, so if you're primarily
interested in that, you can safely ignore this version of the
patchset.

The key changes here include
- Meson support, for both the build and the pytest suite
- Cirrus support (and unsurprisingly, Mac and Windows builds fail due
to the Linux-oriented draft code)
- A small tweak to support iddawc down to 0.9.8 (shipped with e.g.
Debian Bullseye)
- Removal of the authn_id test extension in favor of SYSTEM_USER

The meson+pytest support was big enough that I split it into its own
patch. It's not very polished yet, but it mostly works, and when
running tests via Meson it'll now spin up a test server for you. My
virtualenv approach apparently interacts poorly with the multiarch
Cirrus setup (64-bit tests pass, 32-bit tests fail).

Moving forward, the first thing I plan to tackle is asynchronous
operation, so that polling clients can still operate sanely. If I can
find a good solution there, the conversations about possible extension
points should get a lot easier.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 4/27/23 10:35, Jacob Champion wrote:
> Moving forward, the first thing I plan to tackle is asynchronous
> operation, so that polling clients can still operate sanely. If I can
> find a good solution there, the conversations about possible extension
> points should get a lot easier.

Attached is patchset v8, now with concurrency and 300% more cURL! And
many more questions to answer.

This is a full reimplementation of the client-side OAuth flow. It's an
async-first engine built on top of cURL's multi handles. All pending
operations are multiplexed into a single epoll set (the "altsock"),
which is exposed through PQsocket() for the duration of the OAuth flow.
Clients return to the flow on their next call to PQconnectPoll().
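To illustrate the altsock technique (a sketch, not the patch's code; Linux-only, since like the patch it relies on epoll): all pending descriptors are registered in one epoll set, and the epoll fd itself is the single selectable handle handed back to the client.

```python
import select
import socket

# Stand-ins for the multiple sockets cURL may have in flight at once.
sock_a, peer_a = socket.socketpair()
sock_b, peer_b = socket.socketpair()

# Multiplex them into one epoll set; the epoll fd itself is the
# "altsock" that would be exposed through PQsocket().
ep = select.epoll()
ep.register(sock_a.fileno(), select.EPOLLIN)
ep.register(sock_b.fileno(), select.EPOLLIN)
altsock = ep.fileno()

# The client watches a single descriptor, as in a normal polling loop...
readable, _, _ = select.select([altsock], [], [], 0)

# ...and that one descriptor lights up when any underlying socket
# becomes readable.
peer_b.send(b"x")
readable_after, _, _ = select.select([altsock], [], [], 1)
```

The appeal is that the client's existing select/poll loop on PQsocket() keeps working unmodified during the OAuth flow.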

Andrey and Mahendrakar: you'll probably be interested in the
conn->async_auth() callback, conn->altsock, and the pg_fe_run_oauth_flow
entry point. This is intended to be the foundation for alternative flows.

I've kept the blocking iddawc implementation for comparison, but if
you're running the tests against it, be aware that the asynchronous
tests will, predictably, hang. Skip them with `py.test -k 'not
asynchronous'`.

= The Good =

- PQconnectPoll() is no longer indefinitely blocked on a single
connection's OAuth handshake. (iddawc doesn't appear to have any
asynchronous primitives in its API, unless I've missed something crucial.)

- We now have a swappable entry point. Alternative flows could be
implemented by applications without forcing clients to redesign their
polling loops (PQconnect* should just work as expected).

- We have full control over corner cases in our default flow. Debugging
failures is much nicer, with explanations of exactly what has gone wrong
and where, compared to iddawc's "I_ERROR" messages.

- cURL is not a lightweight library by any means, but we're no longer
bundling things like web servers that we're not going to use.

= The Bad =

- Unsurprisingly, there's a lot more code now that we're implementing
the flow ourselves. The client patch has tripled in size, and we'd be on
the hook for implementing and staying current with the RFCs.

- The client implementation is currently epoll-/Linux-specific. I think
kqueue shouldn't be too much trouble for the BSDs, but it's even more
code to maintain.

- Some clients in the wild (psycopg2/psycopg) suppress all notifications
during PQconnectPoll(). To accommodate them, I no longer use the
noticeHooks for communicating the user code, but that means we have to
come up with some other way to let applications override the printing to
stderr. Something like the OpenSSL decryption callback, maybe?

= The Ugly =

- Unless someone is aware of some amazing Winsock magic, I'm pretty sure
the multiplexed-socket approach is dead in the water on Windows. I think
the strategy there probably has to be a background thread plus a fake
"self-pipe" (loopback socket) for polling... which may be controversial?

- We have to figure out how to initialize cURL in a thread-safe manner.
Newer versions of libcurl and OpenSSL improve upon this situation, but I
don't think there's a way to check at compile time whether the
initialization strategy is safe or not (and even at runtime, I think
there may be a chicken-and-egg problem with the API, where it's not safe
to check for thread-safe initialization until after you've safely
initialized).

= Next Steps =

There are so many TODOs in the cURL implementation: it's been a while
since I've done any libcurl programming, it all needs to be hardened,
and I need to comb through the relevant specs again. But I don't want to
gold-plate it if this overall approach is unacceptable. So, questions
for the gallery:

1) Would starting up a background thread (pooled or not) be acceptable
on Windows? Alternatively, does anyone know enough Winsock deep magic to
combine multiple pending events into one (selectable!) socket?

2) If a background thread is acceptable on one platform, does it make
more sense to use one on every platform and just have synchronous code
everywhere? Or should we use a threadless async implementation when we can?

3) Is the current conn->async_auth() entry point sufficient for an
application to implement the Microsoft flows discussed upthread?

4) Would we want to try to require a new enough cURL/OpenSSL to avoid
thread safety problems during initialization, or do we need to introduce
some API equivalent to PQinitOpenSSL?

5) Does this maintenance tradeoff (full control over the client vs. a
large amount of RFC-governed code) seem like it could be okay?

Thanks,
--Jacob
Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniele Varrazzo
Date:
On Sat, 20 May 2023 at 00:01, Jacob Champion <jchampion@timescale.com> wrote:

> - Some clients in the wild (psycopg2/psycopg) suppress all notifications
> during PQconnectPoll().

If there is anything we can improve in psycopg please reach out.

-- Daniele



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, May 23, 2023 at 4:22 AM Daniele Varrazzo
<daniele.varrazzo@gmail.com> wrote:
> On Sat, 20 May 2023 at 00:01, Jacob Champion <jchampion@timescale.com> wrote:
> > - Some clients in the wild (psycopg2/psycopg) suppress all notifications
> > during PQconnectPoll().
>
> If there is anything we can improve in psycopg please reach out.

Will do, thank you! But in this case, I think there's nothing to
improve in psycopg -- in fact, it highlighted the problem with my
initial design, and now I think the notice processor will never be an
appropriate avenue for communication of the user code.

The biggest issue is that there's a chicken-and-egg situation: if
you're using the synchronous PQconnect* API, you can't override the
notice hooks while the handshake is in progress, because you don't
have a connection handle yet. The second problem is that there are a
bunch of parameters coming back from the server (user code,
verification URI, expiration time) that the application may choose to
display or use, and communicating those pieces in a (probably already
translated) flat text string is a pretty hostile API.

So I think we'll probably need to provide a global handler API,
similar to the passphrase hook we currently provide, that can receive
these pieces separately and assemble them however the application
desires. The hard part will be to avoid painting ourselves into a
corner, because this particular information is specific to the device
authorization flow, and if we ever want to add other flows into libpq,
we'll probably not want to add even more hooks.
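As a sketch of what such a handler might receive (the names here are hypothetical, not a proposal for the actual libpq API), the point is that the device-authorization fields arrive separately rather than as a pre-translated flat string:

```python
import sys
from dataclasses import dataclass
from typing import Callable

@dataclass
class DevicePrompt:
    verification_uri: str  # where the user should go
    user_code: str         # what they should enter there
    expires_in: int        # seconds until the code goes stale

def default_prompt(prompt: DevicePrompt) -> None:
    # Fallback behavior: print to stderr, roughly what libpq would do
    # if the application installs nothing.
    print(f"Visit {prompt.verification_uri} and enter the code: "
          f"{prompt.user_code}", file=sys.stderr)

# An application installs its own handler to present the pieces however
# it likes (GUI dialog, QR code, ...), analogous to the passphrase hook.
prompt_handler: Callable[[DevicePrompt], None] = default_prompt
```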

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On 5/19/23 15:01, Jacob Champion wrote:
> But I don't want to
> gold-plate it if this overall approach is unacceptable. So, questions
> for the gallery:
> 
> 1) Would starting up a background thread (pooled or not) be acceptable
> on Windows? Alternatively, does anyone know enough Winsock deep magic to
> combine multiple pending events into one (selectable!) socket?
> 
> 2) If a background thread is acceptable on one platform, does it make
> more sense to use one on every platform and just have synchronous code
> everywhere? Or should we use a threadless async implementation when we can?
> 
> 3) Is the current conn->async_auth() entry point sufficient for an
> application to implement the Microsoft flows discussed upthread?
> 
> 4) Would we want to try to require a new enough cURL/OpenSSL to avoid
> thread safety problems during initialization, or do we need to introduce
> some API equivalent to PQinitOpenSSL?
> 
> 5) Does this maintenance tradeoff (full control over the client vs. a
> large amount of RFC-governed code) seem like it could be okay?

There was additional interest at PGCon, so I've registered this in the
commitfest.

Potential reviewers should be aware that the current implementation
requires Linux (or, more specifically, epoll), as the cfbot shows. But
if you have any opinions on the above questions, those will help me
tackle the other platforms. :D

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Sat, May 20, 2023 at 10:01 AM Jacob Champion <jchampion@timescale.com> wrote:
> - The client implementation is currently epoll-/Linux-specific. I think
> kqueue shouldn't be too much trouble for the BSDs, but it's even more
> code to maintain.

I guess you also need a fallback that uses plain old POSIX poll()?  I
see you're not just using epoll but also timerfd.  Could that be
converted to plain old timeout bookkeeping?  That should be enough to
get every other Unix and *possibly* also Windows to work with the same
code path.

> - Unless someone is aware of some amazing Winsock magic, I'm pretty sure
> the multiplexed-socket approach is dead in the water on Windows. I think
> the strategy there probably has to be a background thread plus a fake
> "self-pipe" (loopback socket) for polling... which may be controversial?

I am not a Windows user or hacker, but there are certainly several
ways to multiplex sockets.  First there is the WSAEventSelect() +
WaitForMultipleObjects() approach that latch.c uses.  It has the
advantage that it allows socket readiness to be multiplexed with
various other things that use Windows "events".  But if you don't need
that, ie you *only* need readiness-based wakeup for a bunch of sockets
and no other kinds of fd or object, you can use winsock's plain old
select() or its fairly faithful poll() clone called WSAPoll().  It
looks a bit like that'd be true here if you could kill the timerfd?

It's a shame to write modern code using select(), but you can find
lots of shouting all over the internet about WSAPoll()'s defects, most
famously the cURL guys[1] whose blog is widely cited, so people still
do it.  Possibly some good news on that front:  by my reading of the
docs, it looks like that problem was fixed in Windows 10 2004[2] which
itself is by now EOL, so all systems should have the fix?  I suspect
that means that, finally, you could probably just use the same poll()
code path for Unix (when epoll is not available) *and* Windows these
days, making porting a lot easier.  But I've never tried it, so I
don't know what other problems there might be.  Another thing people
complain about is the lack of socketpair() or similar in winsock which
means you unfortunately can't easily make anonymous
select/poll-compatible local sockets, but that doesn't seem to be
needed here.

[1] https://daniel.haxx.se/blog/2012/10/10/wsapoll-is-broken/
[2] https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsapoll



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Jun 30, 2023 at 9:29 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Sat, May 20, 2023 at 10:01 AM Jacob Champion <jchampion@timescale.com> wrote:
> > - The client implementation is currently epoll-/Linux-specific. I think
> > kqueue shouldn't be too much trouble for the BSDs, but it's even more
> > code to maintain.
>
> I guess you also need a fallback that uses plain old POSIX poll()?

The use of the epoll API here is to combine several sockets into one,
not to actually call epoll_wait() itself. kqueue descriptors should
let us do the same, IIUC.
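For lurkers, the combining trick looks roughly like this (illustrative sketch; make_mux/mux_readable are invented names). The epoll descriptor itself is what gets handed up through PQsocket(), and from the client's perspective it's an ordinary select()able fd:

```c
#include <sys/epoll.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Combine several descriptors into one: the epoll fd itself is handed
 * up to the client and becomes readable whenever any registered fd has
 * pending events -- libpq never calls epoll_wait() on it.
 */
static int
make_mux(const int *fds, int nfds)
{
    int         mux = epoll_create1(0);

    if (mux < 0)
        return -1;
    for (int i = 0; i < nfds; i++)
    {
        struct epoll_event ev = {.events = EPOLLIN, .data.fd = fds[i]};

        if (epoll_ctl(mux, EPOLL_CTL_ADD, fds[i], &ev) < 0)
        {
            close(mux);
            return -1;
        }
    }
    return mux;                 /* select()able by the client */
}

/* Client side: is the mux (and therefore some inner fd) readable? */
static int
mux_readable(int mux, int timeout_ms)
{
    fd_set      rfds;
    struct timeval tv = {timeout_ms / 1000, (timeout_ms % 1000) * 1000};

    FD_ZERO(&rfds);
    FD_SET(mux, &rfds);
    return select(mux + 1, &rfds, NULL, NULL, &tv);
}
```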

> I see you're not just using epoll but also timerfd.  Could that be
> converted to plain old timeout bookkeeping?  That should be enough to
> get every other Unix and *possibly* also Windows to work with the same
> code path.

I might be misunderstanding your suggestion, but I think our internal
bookkeeping is orthogonal to that. The use of timerfd here allows us
to forward libcurl's timeout requirements up to the top-level
PQsocket(). As an example, libcurl is free to tell us to call it again
in ten milliseconds, and we have to make sure a nonblocking client
calls us again after that elapses; otherwise they might hang waiting
for data that's not coming.

> > - Unless someone is aware of some amazing Winsock magic, I'm pretty sure
> > the multiplexed-socket approach is dead in the water on Windows. I think
> > the strategy there probably has to be a background thread plus a fake
> > "self-pipe" (loopback socket) for polling... which may be controversial?
>
> I am not a Windows user or hacker, but there are certainly several
> ways to multiplex sockets.  First there is the WSAEventSelect() +
> WaitForMultipleObjects() approach that latch.c uses.

I don't think that strategy plays well with select() clients, though
-- it requires a handle array, and we've just got the one socket.

My goal is to maintain compatibility with existing PQconnectPoll()
applications, where the only way we get to communicate with the client
is through the PQsocket() for the connection. Ideally, you shouldn't
have to completely rewrite your application loop just to make use of
OAuth. (I assume a requirement like that would be a major roadblock to
committing this -- and if that's not a correct assumption, then I
guess my job gets a lot easier?)

> It's a shame to write modern code using select(), but you can find
> lots of shouting all over the internet about WSAPoll()'s defects, most
> famously the cURL guys[1] whose blog is widely cited, so people still
> do it.

Right -- that's basically the root of my concern. I can't guarantee
that existing Windows clients out there are all using
WaitForMultipleObjects(). From what I can tell, whatever we hand up
through PQsocket() has to be fully Winsock-/select-compatible.

> Another thing people
> complain about is the lack of socketpair() or similar in winsock which
> means you unfortunately can't easily make anonymous
> select/poll-compatible local sockets, but that doesn't seem to be
> needed here.

For the background-thread implementation, it probably would be. I've
been looking at libevent (BSD-licensed) and its socketpair hack for
Windows...

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Thu, Jul 6, 2023 at 9:00 AM Jacob Champion <jchampion@timescale.com> wrote:
> My goal is to maintain compatibility with existing PQconnectPoll()
> applications, where the only way we get to communicate with the client
> is through the PQsocket() for the connection. Ideally, you shouldn't
> have to completely rewrite your application loop just to make use of
> OAuth. (I assume a requirement like that would be a major roadblock to
> committing this -- and if that's not a correct assumption, then I
> guess my job gets a lot easier?)

Ah, right, I get it.

I guess there are a couple of ways to do it if we give up the goal of
no-code-change-for-the-client:

1.  Generalised PQsocket(), so that a client can call something like:

int PQpollset(const PGConn *conn, struct pollfd fds[], int fds_size,
int *nfds, int *timeout_ms);

That way, libpq could tell you about which events it would like to
wait for on which fds, and when it would like you to call it back due
to timeout, and you can either pass that information directly to
poll() or WSAPoll() or some equivalent interface (we don't care, we
just gave you the info you need), or combine it in obvious ways with
whatever else you want to multiplex with in your client program.
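To illustrate (with a stub standing in for libpq, since PQpollset() is only hypothetical at this point), a client loop over that interface might look like:

```c
#include <poll.h>

/*
 * Stub stand-ins for the proposed API.  PQpollset() does not exist in
 * libpq; this toy version reports a single fd and a 100ms timeout so
 * the client-side pattern below can be exercised.
 */
typedef struct PGConn
{
    int         fd;
} PGConn;

static int
PQpollset(const PGConn *conn, struct pollfd fds[], int fds_size,
          int *nfds, int *timeout_ms)
{
    if (fds_size < 1)
        return -1;
    fds[0].fd = conn->fd;
    fds[0].events = POLLIN;
    *nfds = 1;
    *timeout_ms = 100;          /* "call me back within 100ms" */
    return 0;
}

/*
 * The client waits on whatever libpq asks for -- it could mix its own
 * fds into the same array -- then calls back into libpq (PQconnectPoll()
 * in real life).  Returns 0 on timeout, >0 if descriptors are ready.
 */
static int
client_wait(const PGConn *conn)
{
    struct pollfd fds[8];
    int         nfds,
                timeout_ms;

    if (PQpollset(conn, fds, 8, &nfds, &timeout_ms) < 0)
        return -1;
    return poll(fds, nfds, timeout_ms);
}
```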

2.  Convert those events into new libpq events like 'I want you to
call me back in 100ms', and 'call me back when socket #42 has data',
and let clients handle that by managing their own poll set etc.  (This
is something I've speculated about to support more efficient
postgres_fdw shard query multiplexing; gotta figure out how to get
multiple connections' events into one WaitEventSet...)

I guess there is a practical middle ground where client code on
systems that have epoll/kqueue can use OAUTHBEARER without any code
change, and the feature is available on other systems too but you'll
have to change your client code to use one of those interfaces or else
you get an error 'coz we just can't do it.  Or, more likely in the
first version, you just can't do it at all...  Doesn't seem that bad
to me.

BTW I will happily do the epoll->kqueue port work if necessary.



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I guess there are a couple of ways to do it if we give up the goal of
> no-code-change-for-the-client:
>
> 1.  Generalised PQsocket(), so that a client can call something like:
>
> int PQpollset(const PGConn *conn, struct pollfd fds[], int fds_size,
> int *nfds, int *timeout_ms);
>
> That way, libpq could tell you about which events it would like to
> wait for on which fds, and when it would like you to call it back due
> to timeout, and you can either pass that information directly to
> poll() or WSAPoll() or some equivalent interface (we don't care, we
> just gave you the info you need), or combine it in obvious ways with
> whatever else you want to multiplex with in your client program.

I absolutely wanted something like this while I was writing the code
(it would have made things much easier), but I'd feel bad adding that
much complexity to the API if the vast majority of connections use
exactly one socket. Are there other use cases in libpq where you think
this expanded API could be useful? Maybe to lift some of the existing
restrictions for PQconnectPoll(), add async DNS resolution, or
something?

A couple of complications I can think of at the moment:
1. Clients using persistent pollsets will have to remove old
descriptors, presumably by tracking the delta since the last call,
which might make for a rough transition. Bookkeeping bugs probably
wouldn't show up unless they used OAuth in their test suites. With the
current model, that's more hidden and libpq takes responsibility for
getting it right.
2. In the future, we might need to think carefully around situations
where we want multiple PGConn handles to share descriptors (e.g.
multiplexed backend connections). I avoid tricky questions at the
moment by assigning only one connection per multi pool.

> 2.  Convert those events into new libpq events like 'I want you to
> call me back in 100ms', and 'call me back when socket #42 has data',
> and let clients handle that by managing their own poll set etc.  (This
> is something I've speculated about to support more efficient
> postgres_fdw shard query multiplexing; gotta figure out how to get
> multiple connections' events into one WaitEventSet...)

Something analogous to libcurl's socket and timeout callbacks [1],
then? Or is there an existing libpq API you were thinking about using?
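For reference, the shape of such callbacks might be something like this sketch (all names invented here, loosely modeled on libcurl's CURLMOPT_SOCKETFUNCTION and CURLMOPT_TIMERFUNCTION):

```c
#include <stddef.h>

/* What the library wants done with one of its descriptors. */
enum sock_action
{
    WATCH_READ,
    WATCH_WRITE,
    REMOVE_SOCK
};

/*
 * Invented callback shapes: instead of hiding everything behind one
 * pollable fd, the library reports each change in its socket interests
 * and its current timeout, and the client maintains its own poll set
 * (or WaitEventSet) from those deltas.
 */
typedef void (*socket_cb)(int fd, enum sock_action what, void *userdata);
typedef void (*timer_cb)(long timeout_ms, void *userdata);

struct conn_events
{
    socket_cb   on_socket;
    timer_cb    on_timer;
    void       *userdata;
};

/* Library side: report "watch this fd, and call me back within 100ms". */
static void
lib_wants_read(struct conn_events *ev, int fd, long next_timeout_ms)
{
    ev->on_socket(fd, WATCH_READ, ev->userdata);
    ev->on_timer(next_timeout_ms, ev->userdata);
}

/* Example client callbacks that just record what was asked of them. */
static int last_fd = -1;
static enum sock_action last_what;
static long last_timeout = -1;

static void
note_socket(int fd, enum sock_action what, void *ud)
{
    (void) ud;
    last_fd = fd;
    last_what = what;
}

static void
note_timer(long timeout_ms, void *ud)
{
    (void) ud;
    last_timeout = timeout_ms;
}
```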

> I guess there is a practical middle ground where client code on
> systems that have epoll/kqueue can use OAUTHBEARER without any code
> change, and the feature is available on other systems too but you'll
> have to change your client code to use one of those interfaces or else
> you get an error 'coz we just can't do it.

That's a possibility -- if your platform is able to do it nicely,
might as well use it. (In a similar vein, I'd personally vote against
having every platform use a background thread, even if we decided to
implement it for Windows.)

> Or, more likely in the
> first version, you just can't do it at all...  Doesn't seem that bad
> to me.

Any initial opinions on whether it's worse or better than a worker thread?

> BTW I will happily do the epoll->kqueue port work if necessary.

And I will happily take you up on that; thanks!

--Jacob

[1] https://curl.se/libcurl/c/CURLMOPT_SOCKETFUNCTION.html



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > 2.  Convert those events into new libpq events like 'I want you to
> > call me back in 100ms', and 'call me back when socket #42 has data',
> > and let clients handle that by managing their own poll set etc.  (This
> > is something I've speculated about to support more efficient
> > postgres_fdw shard query multiplexing; gotta figure out how to get
> > multiple connections' events into one WaitEventSet...)
>
> Something analogous to libcurl's socket and timeout callbacks [1],
> then? Or is there an existing libpq API you were thinking about using?

Yeah.  Libpq already has an event concept.  I did some work on getting
long-lived WaitEventSet objects to be used in various places, some of
which got committed[1], but not yet the parts related to postgres_fdw
(which uses libpq connections to talk to other PostgreSQL servers, and
runs into the limitations of PQsocket()).  Horiguchi-san had the good
idea of extending the event system to cover socket changes, but I
haven't actually tried it yet.  One day.

> > Or, more likely in the
> > first version, you just can't do it at all...  Doesn't seem that bad
> > to me.
>
> Any initial opinions on whether it's worse or better than a worker thread?

My vote is that it's perfectly fine to make a new feature that only
works on some OSes.  If/when someone wants to work on getting it going
on Windows/AIX/Solaris (that's the complete set of no-epoll, no-kqueue
OSes we target), they can write the patch.

[1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGJAC4Oqao%3DqforhNey20J8CiG2R%3DoBPqvfR0vOJrFysGw%40mail.gmail.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Jul 6, 2023 at 1:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> > Something analogous to libcurl's socket and timeout callbacks [1],
> > then? Or is there an existing libpq API you were thinking about using?
>
> Yeah.  Libpq already has an event concept.

Thanks -- I don't know how I never noticed libpq-events.h before.

Per-connection events (or callbacks) might bring up the same
chicken-and-egg situation discussed above, with the notice hook. We'll
be fine as long as PQconnectStart is guaranteed to return before the
PQconnectPoll engine gets to authentication, and it looks like that's
true with today's implementation, which returns pessimistically at
several points instead of just trying to continue the exchange. But I
don't know if that's intended as a guarantee for the future. At the
very least we would have to pin that implementation detail.

> > > Or, more likely in the
> > > first version, you just can't do it at all...  Doesn't seem that bad
> > > to me.
> >
> > Any initial opinions on whether it's worse or better than a worker thread?
>
> My vote is that it's perfectly fine to make a new feature that only
> works on some OSes.  If/when someone wants to work on getting it going
> on Windows/AIX/Solaris (that's the complete set of no-epoll, no-kqueue
> OSes we target), they can write the patch.

Okay. I'm curious to hear others' thoughts on that, too, if anyone's lurking.

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Thanks Jacob for making progress on this.

> 3) Is the current conn->async_auth() entry point sufficient for an
> application to implement the Microsoft flows discussed upthread?

Please confirm my understanding of the flow is correct:
1. Client calls PQconnectStart.
  - The client doesn't yet know what the issuer and the scope are.
  - Parameters are strings, so callback is not provided yet.
2. Client gets a PGconn from the PQconnectStart return value and
updates conn->async_auth to its own callback.
3. Client polls PQconnectPoll and checks conn->sasl_state until the
value is SASL_ASYNC
4. Client accesses conn->oauth_issuer and conn->oauth_scope and uses
that information to trigger the token flow.
5. Expectations on async_auth:
    a. It returns PGRES_POLLING_READING while token acquisition is going on
    b. It returns PGRES_POLLING_OK and sets conn->sasl_state->token
when token acquisition succeeds.
6. Is the client supposed to do anything with the altsock parameter?

Is the above accurate understanding?

If yes, it looks workable with a couple of improvements I think would be nice:
1. Currently, oauth_exchange function sets conn->async_auth =
pg_fe_run_oauth_flow and starts Device Code flow automatically when
receiving challenge and metadata from the server.
    There probably should be a way for the client to prevent default
Device Code flow from triggering.
2. The current signature and expectations of the async_auth function
seem to be tightly coupled with the internal implementation:
    - Pieces of information need to be picked and updated in different
places in the PgConn structure.
    - Function is expected to return PostgresPollingStatusType which
is used to communicate internal state to the client.
   Would it make sense to separate the internal callback used to
communicate with Device Code flow from client facing API?
   I.e. introduce a new client facing structure and enum to facilitate
callback and its return value.

-----------
On a separate note:
The backend code currently spawns an external command for token validation.
As we discussed before, an extension hook would be a more efficient
extensibility option.
We see clients make 10k+ connections using OAuth tokens per minute to
our service, and starting external processes would be too much overhead
here.

-----------

> 5) Does this maintenance tradeoff (full control over the client vs. a
> large amount of RFC-governed code) seem like it could be okay?

It's nice for psql to have Device Code flow. Can be made even more
convenient with refresh tokens support.
And for clients on resource constrained devices to be able to
authenticate with Client Credentials (app secret) without bringing
more dependencies.

In most other cases, upstream PostgreSQL drivers written in higher
level languages have libraries / abstractions to implement OAUTH flows
for the platforms they support.

On Fri, Jul 7, 2023 at 11:48 AM Jacob Champion <jchampion@timescale.com> wrote:
>
> On Thu, Jul 6, 2023 at 1:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> > > Something analogous to libcurl's socket and timeout callbacks [1],
> > > then? Or is there an existing libpq API you were thinking about using?
> >
> > Yeah.  Libpq already has an event concept.
>
> Thanks -- I don't know how I never noticed libpq-events.h before.
>
> Per-connection events (or callbacks) might bring up the same
> chicken-and-egg situation discussed above, with the notice hook. We'll
> be fine as long as PQconnectStart is guaranteed to return before the
> PQconnectPoll engine gets to authentication, and it looks like that's
> true with today's implementation, which returns pessimistically at
> several points instead of just trying to continue the exchange. But I
> don't know if that's intended as a guarantee for the future. At the
> very least we would have to pin that implementation detail.
>
> > > > Or, more likely in the
> > > > first version, you just can't do it at all...  Doesn't seem that bad
> > > > to me.
> > >
> > > Any initial opinions on whether it's worse or better than a worker thread?
> >
> > My vote is that it's perfectly fine to make a new feature that only
> > works on some OSes.  If/when someone wants to work on getting it going
> > on Windows/AIX/Solaris (that's the complete set of no-epoll, no-kqueue
> > OSes we target), they can write the patch.
>
> Okay. I'm curious to hear others' thoughts on that, too, if anyone's lurking.
>
> Thanks!
> --Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > BTW I will happily do the epoll->kqueue port work if necessary.
>
> And I will happily take you up on that; thanks!

Some initial hacking, about 2 coffees' worth:
https://github.com/macdice/postgres/commits/oauth-kqueue

This compiles on FreeBSD and macOS, but I didn't have time to figure
out all your Python testing magic so I don't know if it works yet and
it's still red on CI...  one thing I wondered about is the *altsock =
timerfd part which I couldn't do.

The situation on macOS is a little odd: the man page says EVFILT_TIMER
is not implemented.  But clearly it is, we can read the source code as
I had to do to find out which unit of time it defaults to[1] (huh,
Apple's github repo for Darwin appears to have been archived recently
-- no more source code updates?  that'd be a shame!), and it works
exactly as expected in simple programs.  So I would just assume it
works until we see evidence otherwise.  (We already use a couple of
other things on macOS more or less by accident because configure finds
them, where they are undocumented or undeclared.)

[1] https://github.com/apple/darwin-xnu/blob/main/bsd/kern/kern_event.c#L1345



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Jul 7, 2023 at 2:16 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> Please confirm my understanding of the flow is correct:
> 1. Client calls PQconnectStart.
>   - The client doesn't yet know what the issuer and the scope are.

Right. (Strictly speaking it doesn't even know that OAuth will be used
for the connection, yet, though at some point we'll be able to force
the issue with e.g. `require_auth=oauth`. That's not currently
implemented.)

>   - Parameters are strings, so callback is not provided yet.
> 2. Client gets PgConn from PQconnectStart return value and updates
> conn->async_auth to its own callback.

This is where some sort of official authn callback registration (see
above reply to Daniele) would probably come in handy.

> 3. Client polls PQconnectPoll and checks conn->sasl_state until the
> value is SASL_ASYNC

In my head, the client's custom callback would always be invoked
during the call to PQconnectPoll, rather than making the client do
work in between calls. That way, a client can use custom flows even
with a synchronous PQconnectdb().

> 4. Client accesses conn->oauth_issuer and conn->oauth_scope and uses
> that information to trigger the token flow.

Right.

> 5. Expectations on async_auth:
>     a. It returns PGRES_POLLING_READING while token acquisition is going on
>     b. It returns PGRES_POLLING_OK and sets conn->sasl_state->token
> when token acquisition succeeds.

Yes. Though the token should probably be returned through some
explicit part of the callback, now that you mention it...

> 6. Is the client supposed to do anything with the altsock parameter?

The callback needs to set the altsock up with a select()able
descriptor, which wakes up the client when more work is ready to be
done. Without that, you can't handle multiple connections on a single
thread.

> If yes, it looks workable with a couple of improvements I think would be nice:
> 1. Currently, oauth_exchange function sets conn->async_auth =
> pg_fe_run_oauth_flow and starts Device Code flow automatically when
> receiving challenge and metadata from the server.
>     There probably should be a way for the client to prevent default
> Device Code flow from triggering.

Agreed. I'd like the client to be able to override this directly.

> 2. The current signature and expectations of the async_auth function
> seem to be tightly coupled with the internal implementation:
>     - Pieces of information need to be picked and updated in different
> places in the PgConn structure.
>     - Function is expected to return PostgresPollingStatusType which
> is used to communicate internal state to the client.
>    Would it make sense to separate the internal callback used to
> communicate with Device Code flow from client facing API?
>    I.e. introduce a new client facing structure and enum to facilitate
> callback and its return value.

Yep, exactly right! I just wanted to check that the architecture
*looked* sufficient before pulling it up into an API.

> On a separate note:
> The backend code currently spawns an external command for token validation.
> As we discussed before, an extension hook would be a more efficient
> extensibility option.
> We see clients make 10k+ connections using OAuth tokens per minute to
> our service, and starting external processes would be too much overhead
> here.

+1. I'm curious, though -- what language do you expect to use to write
a production validator hook? Surely not low-level C...?

> > 5) Does this maintenance tradeoff (full control over the client vs. a
> > large amount of RFC-governed code) seem like it could be okay?
>
> It's nice for psql to have Device Code flow. Can be made even more
> convenient with refresh tokens support.
> And for clients on resource constrained devices to be able to
> authenticate with Client Credentials (app secret) without bringing
> more dependencies.
>
> In most other cases, upstream PostgreSQL drivers written in higher
> level languages have libraries / abstractions to implement OAUTH flows
> for the platforms they support.

Yeah, I'm really interested in seeing which existing high-level flows
can be mixed in through a driver. Trying not to get too far ahead of
myself :D

Thanks for the review!

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Jul 7, 2023 at 6:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Fri, Jul 7, 2023 at 4:57 AM Jacob Champion <jchampion@timescale.com> wrote:
> > On Wed, Jul 5, 2023 at 3:07 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > > BTW I will happily do the epoll->kqueue port work if necessary.
> >
> > And I will happily take you up on that; thanks!
>
> Some initial hacking, about 2 coffees' worth:
> https://github.com/macdice/postgres/commits/oauth-kqueue
>
> This compiles on FreeBSD and macOS, but I didn't have time to figure
> out all your Python testing magic so I don't know if it works yet and
> it's still red on CI...

This is awesome, thank you!

I need to look into the CI more, but it looks like the client tests
are passing, which is a good sign. (I don't understand why the
server-side tests are failing on FreeBSD, but they shouldn't be using
the libpq code at all, so I think your kqueue implementation is in the
clear. Cirrus doesn't have the logs from the server-side test failures
anywhere -- probably a bug in my Meson patch.)

> one thing I wondered about is the *altsock =
> timerfd part which I couldn't do.

I did that because I'm not entirely sure that libcurl is guaranteed to
have cleared out all its sockets from the mux, and I didn't want to
invite spurious wakeups. I should probably verify whether or not
that's possible. If so, we could just make that code resilient to
early wakeup, so that it matters less, or set up a second kqueue that
only holds the timer if that turns out to be unacceptable?

> The situation on macOS is a little odd: the man page says EVFILT_TIMER
> is not implemented.  But clearly it is, we can read the source code as
> I had to do to find out which unit of time it defaults to[1] (huh,
> Apple's github repo for Darwin appears to have been archived recently
> -- no more source code updates?  that'd be a shame!), and it works
> exactly as expected in simple programs.  So I would just assume it
> works until we see evidence otherwise.  (We already use a couple of
> other things on macOS more or less by accident because configure finds
> them, where they are undocumented or undeclared.)

Huh. Something to keep an eye on... might be a problem with older versions?

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Jul 10, 2023 at 4:50 PM Jacob Champion <jchampion@timescale.com> wrote:
> I don't understand why the
> server-side tests are failing on FreeBSD, but they shouldn't be using
> the libpq code at all, so I think your kqueue implementation is in the
> clear.

Oh, whoops, it's just the missed CLOEXEC flag in the final patch. (If
the write side of the pipe gets copied around, it hangs open and the
validator never sees the "end" of the token.) I'll switch the logic
around to set the flag on the write side instead of unsetting it on
the read side.
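The switched-around logic could be as simple as this sketch (make_token_pipe is an invented helper name): mark the write side close-on-exec at creation time, so it can't leak into an exec()ed child and hold the pipe open.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a pipe whose *write* side is close-on-exec.  If the write end
 * leaked into the exec()ed validator process, it would hang open past
 * the parent's close() and the validator would never see EOF at the
 * end of the token.  Setting FD_CLOEXEC on the write side up front
 * prevents that, without having to unset anything on the read side.
 */
static int
make_token_pipe(int *rfd, int *wfd)
{
    int         fds[2];

    if (pipe(fds) < 0)
        return -1;
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    *rfd = fds[0];
    *wfd = fds[1];
    return 0;
}
```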

I have a WIP patch that passes tests on FreeBSD, which I'll clean up
and post Sometime Soon. macOS builds now but still fails before it
runs the test; looks like it's having trouble finding OpenSSL during
`pip install` of the test modules...

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Wed, Jul 12, 2023 at 5:50 AM Jacob Champion <jchampion@timescale.com> wrote:
> Oh, whoops, it's just the missed CLOEXEC flag in the final patch. (If
> the write side of the pipe gets copied around, it hangs open and the
> validator never sees the "end" of the token.) I'll switch the logic
> around to set the flag on the write side instead of unsetting it on
> the read side.

Oops, sorry about that.  Glad to hear it's all working!

(FTR my parenthetical note about macOS/XNU sources on Github was a
false alarm: the "apple" account has stopped publishing a redundant
copy of that, but "apple-oss-distributions" is the account I should
have been looking at and it is live.  I guess it migrated at some
point, or something.  Phew.)



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
> >   - Parameters are strings, so callback is not provided yet.
> > 2. Client gets PgConn from PQconnectStart return value and updates
> > conn->async_auth to its own callback.
>
> This is where some sort of official authn callback registration (see
> above reply to Daniele) would probably come in handy.
+1

> > 3. Client polls PQconnectPoll and checks conn->sasl_state until the
> > value is SASL_ASYNC
>
> In my head, the client's custom callback would always be invoked
> during the call to PQconnectPoll, rather than making the client do
> work in between calls. That way, a client can use custom flows even
> with a synchronous PQconnectdb().
The way I see this API working is that the asynchronous client needs at
least two PQconnectPoll calls:
1. To be notified of what the authentication requirements are and get
the parameters.
2. When it acquires the token, the callback is used to inform libpq of
the token and return PGRES_POLLING_OK.

For a synchronous client, the callback implementation would need to be
aware that the synchronous path invokes the callback frequently, and be
written accordingly.

Bottom line, I don't see much of a problem with the current proposal.
It's just that the way the callback learns that an OAuth token is being
requested, and gets its parameters, relies on PQconnectPoll being
invoked after the corresponding fields of the conn object are populated.

> > > 5. Expectations on async_auth:
> > >     a. It returns PGRES_POLLING_READING while token acquisition is going on
> > >     b. It returns PGRES_POLLING_OK and sets conn->sasl_state->token
> > > when token acquisition succeeds.
> >
> > Yes. Though the token should probably be returned through some
> > explicit part of the callback, now that you mention it...
>
> > 6. Is the client supposed to do anything with the altsock parameter?
>
> The callback needs to set the altsock up with a select()able
> descriptor, which wakes up the client when more work is ready to be
> done. Without that, you can't handle multiple connections on a single
> thread.

Ok, thanks for clarification.

> > On a separate note:
> > The backend code currently spawns an external command for token validation.
> > As we discussed before, an extension hook would be a more efficient
> > extensibility option.
> > We see clients make 10k+ connections using OAuth tokens per minute to
> > our service, and starting external processes would be too much overhead
> > here.

> +1. I'm curious, though -- what language do you expect to use to write
> a production validator hook? Surely not low-level C...?

For the server side code, it would likely be identity providers
publishing extensions to validate their tokens. They can do that in C,
or extensions can now be implemented in Rust using pgrx, which is
developer-friendly enough in my opinion.

> Yeah, I'm really interested in seeing which existing high-level flows
> can be mixed in through a driver. Trying not to get too far ahead of
> myself :D

I can think of the following as the most common:
1. Authorization code with PKCE. This is by far the most common for user login flows. It requires spinning up a browser and listening on a redirect URL/port. Most high-level platforms have libraries to do both.
2. Client certificates. This requires an identity-provider-specific library to construct and sign the token. The providers publish SDKs to do that for most common app development platforms.
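For reference, the flow shown earlier in the thread (visit a URL, enter a code) is the device authorization grant, and its client side boils down to a polling loop. A minimal sketch, with the token endpoint mocked out rather than a real provider:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical token-endpoint responses for the device authorization
 * grant: the client polls until the user finishes logging in at the
 * verification URL, at which point a token is returned.
 */
static int	g_polls;			/* simulated provider state */

static const char *
mock_token_endpoint(void)
{
	if (++g_polls < 3)
		return "authorization_pending";
	return "mock-access-token";
}

/* Poll until the provider stops answering "authorization_pending". */
static const char *
poll_for_token(void)
{
	const char *resp;

	while (strcmp((resp = mock_token_endpoint()),
				  "authorization_pending") == 0)
		;					/* real code sleeps for the advertised interval */
	return resp;
}
```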

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Jul 11, 2023 at 10:50 AM Jacob Champion
<jchampion@timescale.com> wrote:
> I have a WIP patch that passes tests on FreeBSD, which I'll clean up
> and post Sometime Soon. macOS builds now but still fails before it
> runs the test; looks like it's having trouble finding OpenSSL during
> `pip install` of the test modules...

Hi Thomas,

v9 folds in your kqueue implementation (thanks again!) and I have a
quick question to check my understanding:

> +       case CURL_POLL_REMOVE:
> +           /*
> +            * We don't know which of these is currently registered, perhaps
> +            * both, so we try to remove both.  This means we need to tolerate
> +            * ENOENT below.
> +            */
> +           EV_SET(&ev[nev], socket, EVFILT_READ, EV_DELETE, 0, 0, 0);
> +           nev++;
> +           EV_SET(&ev[nev], socket, EVFILT_WRITE, EV_DELETE, 0, 0, 0);
> +           nev++;
> +           break;

We're not setting EV_RECEIPT for these -- is that because none of the
filters we're using are EV_CLEAR, and so it doesn't matter if we
accidentally pull pending events off the queue during the kevent() call?

v9 also improves the Cirrus debugging experience and fixes more issues
on macOS, so the tests should be green there now. The final patch in the
series works around what I think is a build bug in psycopg2 2.9 [1] for
the BSDs+meson.

Thanks,
--Jacob

[1] https://github.com/psycopg/psycopg2/issues/1599

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Thomas Munro
Date:
On Tue, Jul 18, 2023 at 11:55 AM Jacob Champion <jchampion@timescale.com> wrote:
> We're not setting EV_RECEIPT for these -- is that because none of the
> filters we're using are EV_CLEAR, and so it doesn't matter if we
> accidentally pull pending events off the queue during the kevent() call?

+1 for EV_RECEIPT ("just tell me about errors, don't drain any
events").  I had a vague memory that it caused portability problems.
Just checked... it was OpenBSD I was thinking of, but they finally
added that flag in 6.2 (2017).  Our older-than-that BF OpenBSD animal
recently retired so that should be fine.  (Yes, without EV_CLEAR it's
"level triggered" not "edge triggered" in epoll terminology, so the
way I had it was not broken, but the way you're suggesting would be
nicer.)  Note that you'll have to skip data == 0 (no error) too.

+ #ifdef HAVE_SYS_EVENT_H
+ /* macOS doesn't define the time unit macros, but uses milliseconds
by default. */
+ #ifndef NOTE_MSECONDS
+ #define NOTE_MSECONDS 0
+ #endif
+ #endif

While comparing the cousin OSs' man pages just now, I noticed that
it's not only macOS that lacks NOTE_MSECONDS, it's also OpenBSD and
NetBSD < 10.  Maybe just delete that cruft ^^^ and use literal 0 in
fflags directly.  FreeBSD, and recently also NetBSD, decided to get
fancy with high resolution timers, but 0 gets the traditional unit of
milliseconds on all platforms (I just wrote it like that because I
started from FreeBSD and didn't know the history/portability story).



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Jul 18, 2023 at 4:04 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Jul 18, 2023 at 11:55 AM Jacob Champion <jchampion@timescale.com> wrote:
> +1 for EV_RECEIPT ("just tell me about errors, don't drain any
> events").

Sounds good.

> While comparing the cousin OSs' man pages just now, I noticed that
> it's not only macOS that lacks NOTE_MSECONDS, it's also OpenBSD and
> NetBSD < 10.  Maybe just delete that cruft ^^^ and use literal 0 in
> fflags directly.

So I don't lose track of it, here's a v10 with those two changes.

Thanks!
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
v11 is a quick rebase over the recent Cirrus changes, and I've dropped
0006 now that psycopg2 can build against BSD/Meson setups (thanks Daniele!).

--Jacob
Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
v12 implements a first draft of a client hook, so applications can
replace either the device prompt or the entire OAuth flow. (Andrey and
Mahendrakar: hopefully this is close to what you need.) It also cleans
up some of the JSON tech debt.

Since (IMO) we don't want to introduce new hooks every time we make
improvements to the internal flows, the new hook is designed to
retrieve multiple pieces of data from the application. Clients either
declare their ability to get that data, or delegate the job to the
next link in the chain, which by default is a no-op. That lets us add
new data types to the end, and older clients will ignore them until
they're taught otherwise. (I'm trying hard not to over-engineer this,
but it seems like the concept of "give me some piece of data to
continue authenticating" could pretty easily subsume things like the
PQsslKeyPassHook if we wanted.)

The PQAUTHDATA_OAUTH_BEARER_TOKEN case is the one that replaces the
flow entirely, as discussed upthread. Your application gets the
discovery URI and the requested scope for the connection. It can then
either delegate back to libpq (e.g. if the issuer isn't one it can
help with), immediately return a token (e.g. if one is already cached
for the current user), or install a nonblocking callback to implement
a custom async flow. When the connection is closed (or fails), the
hook provides a cleanup function to free any resources it may have
allocated.
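To illustrate the chaining idea with a toy example (the type names and signature here are stand-ins for illustration, not the proposed API): each hook either handles a request for a piece of auth data or delegates to the next link, which is a no-op by default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical auth-data request types, mirroring the chaining design. */
typedef enum { AUTHDATA_PROMPT, AUTHDATA_OAUTH_TOKEN } AuthDataType;

typedef int (*AuthDataHook) (AuthDataType type, void *data);

/* Default link at the end of the chain: handles nothing. */
static int
default_hook(AuthDataType type, void *data)
{
	(void) type;
	(void) data;
	return 0;					/* 0 = "not handled" */
}

static AuthDataHook next_hook = default_hook;

/*
 * An application hook that only knows how to supply a bearer token;
 * every other data type is delegated to the next link in the chain, so
 * new types added later are safely ignored by older clients.
 */
static int
app_hook(AuthDataType type, void *data)
{
	if (type == AUTHDATA_OAUTH_TOKEN)
	{
		*(const char **) data = "cached-token";
		return 1;				/* handled */
	}
	return next_hook(type, data);
}
```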

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Shlok Kyal
Date:
Hi,

On Fri, 3 Nov 2023 at 17:14, Jacob Champion <jchampion@timescale.com> wrote:
>
> v12 implements a first draft of a client hook, so applications can
> replace either the device prompt or the entire OAuth flow. (Andrey and
> Mahendrakar: hopefully this is close to what you need.) It also cleans
> up some of the JSON tech debt.

I went through CFbot and found that build is failing, links:

https://cirrus-ci.com/task/6061898244816896
https://cirrus-ci.com/task/6624848198238208
https://cirrus-ci.com/task/5217473314684928
https://cirrus-ci.com/task/6343373221527552

Just want to make sure you are aware of these failures.

Thanks,
Shlok Kumar Kyal



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Nov 3, 2023 at 5:28 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> Just want to make sure you are aware of these failures.

Thanks for the nudge! Looks like I need to reconcile with the changes
to JsonLexContext in 1c99cde2. I should be able to get to that next
week; in the meantime I'll mark it Waiting on Author.

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Nov 3, 2023 at 4:48 PM Jacob Champion <champion.p@gmail.com> wrote:
> Thanks for the nudge! Looks like I need to reconcile with the changes
> to JsonLexContext in 1c99cde2. I should be able to get to that next
> week; in the meantime I'll mark it Waiting on Author.

v13 rebases over latest. The JsonLexContext changes have simplified
0001 quite a bit, and there's probably a bit more minimization that
could be done.

Unfortunately the configure/Makefile build of libpq now seems to be
pulling in an `exit()` dependency in a way that Meson does not. (Or
maybe Meson isn't checking?) I still need to investigate that
difference and fix it, so I recommend Meson if you're looking to
test-drive a build.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrey Chudnovsky
Date:
Hi Jacob,

Wanted to follow up on one of the topics discussed here in the past:
Do you plan to support adding an extension hook to validate the token?

It would allow a more efficient integration than spawning a separate process.

Thanks!
Andrey.

On Wed, Nov 8, 2023 at 11:00 AM Jacob Champion <champion.p@gmail.com> wrote:
On Fri, Nov 3, 2023 at 4:48 PM Jacob Champion <champion.p@gmail.com> wrote:
> Thanks for the nudge! Looks like I need to reconcile with the changes
> to JsonLexContext in 1c99cde2. I should be able to get to that next
> week; in the meantime I'll mark it Waiting on Author.

v13 rebases over latest. The JsonLexContext changes have simplified
0001 quite a bit, and there's probably a bit more minimization that
could be done.

Unfortunately the configure/Makefile build of libpq now seems to be
pulling in an `exit()` dependency in a way that Meson does not. (Or
maybe Meson isn't checking?) I still need to investigate that
difference and fix it, so I recommend Meson if you're looking to
test-drive a build.

Thanks,
--Jacob

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Nov 9, 2023 at 5:43 PM Andrey Chudnovsky <achudnovskij@gmail.com> wrote:
> Do you plan to support adding an extension hook to validate the token?
>
> It would allow a more efficient integration than spawning a separate process.

I think an API in the style of archive modules would probably be a
good way to go, yeah.

It's probably not very high on the list of priorities, though, since
the inputs and outputs are going to "look" the same whether you're
inside or outside of the server process. The client side is going to
need the bulk of the work/testing/validation. Speaking of which -- how
is the current PQauthDataHook design doing when paired with MS AAD
(er, Entra now I guess)? I haven't had an Azure test bed available for
a while.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 8 Nov 2023, at 20:00, Jacob Champion <champion.p@gmail.com> wrote:

> Unfortunately the configure/Makefile build of libpq now seems to be
> pulling in an `exit()` dependency in a way that Meson does not.

I believe this comes from the libcurl and specifically the ntlm_wb support
which is often enabled in system and package manager provided installations.
There isn't really a fix here apart from requiring a libcurl not compiled with
ntlm_wb support, or adding an exception to the exit() check in the Makefile.

Bringing this up with other curl developers to see if it could be fixed we
instead decided to deprecate the whole module as it's quirky and not used much.
This won't help with existing installations but at least it will be deprecated
and removed by the time v17 ships, so gating on a version shipped after its
removal will avoid it.

https://github.com/curl/curl/commit/04540f69cfd4bf16e80e7c190b645f1baf505a84

> (Or maybe Meson isn't checking?) I still need to investigate that
> difference and fix it, so I recommend Meson if you're looking to
> test-drive a build.

There is no corresponding check in the Meson build, which seems like a TODO.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Dec 5, 2023 at 1:44 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 8 Nov 2023, at 20:00, Jacob Champion <champion.p@gmail.com> wrote:
>
> > Unfortunately the configure/Makefile build of libpq now seems to be
> > pulling in an `exit()` dependency in a way that Meson does not.
>
> I believe this comes from the libcurl and specifically the ntlm_wb support
> which is often enabled in system and package manager provided installations.
> There isn't really a fix here apart from requiring a libcurl not compiled with
> ntlm_wb support, or adding an exception to the exit() check in the Makefile.
>
> Bringing this up with other curl developers to see if it could be fixed we
> instead decided to deprecate the whole module as it's quirky and not used much.
> This won't help with existing installations but at least it will be deprecated
> and removed by the time v17 ships, so gating on a version shipped after its
> removal will avoid it.
>
> https://github.com/curl/curl/commit/04540f69cfd4bf16e80e7c190b645f1baf505a84

Ooh, thank you for looking into that and fixing it!

> > (Or maybe Meson isn't checking?) I still need to investigate that
> > difference and fix it, so I recommend Meson if you're looking to
> > test-drive a build.
>
> There is no corresponding check in the Meson build, which seems like a TODO.

Okay, I'll look into that too when I get time.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
Hi all,

v14 rebases over latest and fixes a warning when assertions are
disabled. 0006 is temporary and hacks past a couple of issues I have
not yet root caused -- one of which makes me wonder if 0001 needs to
be considered alongside the recent pg_combinebackup and incremental
JSON work...?

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Feb 20, 2024 at 5:00 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> v14 rebases over latest and fixes a warning when assertions are
> disabled.

v15 is a housekeeping update that adds typedefs.list entries and runs
pgindent. It also includes a temporary patch from Daniel to get the
cfbot a bit farther (see above discussion on libcurl/exit).

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 22, 2024 at 6:08 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> v15 is a housekeeping update that adds typedefs.list entries and runs
> pgindent.

v16 is more transformational!

Daniel contributed 0004, which completely replaces the
validator_command architecture with a C module API. This solves a
bunch of problems as discussed upthread and vastly simplifies the test
framework for the server side. 0004 also adds a set of Perl tests,
which will begin to subsume some of the Python server-side tests as I
get around to porting them. (@Daniel: 0005 is my diff against your
original patch, for review.)

0008 has been modified to quickfix the pgcommon linkage on the
Makefile side; my previous attempt at this only fixed Meson. The
patchset is now carrying a lot of squash-cruft, and I plan to flatten
it in the next version.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Feb 27, 2024 at 11:20 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> This is done in v17, which is also now based on the two patches pulled
> out by Daniel in [1].

It looks like my patchset has been eaten by a malware scanner:

    550 Message content failed content scanning
(Sanesecurity.Foxhole.Mail_gz.UNOFFICIAL)

Was there a recent change to the lists? Is anyone able to see what the
actual error was so I don't do it again?

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
[Trying again, with all patches unzipped and the CC list temporarily
removed to avoid flooding people's inboxes. Original message follows.]

On Fri, Feb 23, 2024 at 5:01 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> The
> patchset is now carrying a lot of squash-cruft, and I plan to flatten
> it in the next version.

This is done in v17, which is also now based on the two patches pulled
out by Daniel in [1]. Besides the squashes, which make up most of the
range-diff, I've fixed a call to strncasecmp() which is not available
on Windows.

Daniel and I discussed trying a Python version of the test server,
since the standard library there should give us more goodies to work
with. A proof of concept is in 0009. I think the big question I have
for it is, how would we communicate what we want the server to do for
the test? (We could perhaps switch on magic values of the client ID?)
In the end I'd like to be testing close to 100% of the failure modes,
and that's likely to mean a lot of back-and-forth if the server
implementation isn't in the Perl process.

--Jacob

[1] https://postgr.es/m/flat/F51F8777-FAF5-49F2-BC5E-8F9EB423ECE0%40yesql.se

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 28 Feb 2024, at 15:05, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> [Trying again, with all patches unzipped and the CC list temporarily
> removed to avoid flooding people's inboxes. Original message follows.]
>
> On Fri, Feb 23, 2024 at 5:01 PM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> The
>> patchset is now carrying a lot of squash-cruft, and I plan to flatten
>> it in the next version.
>
> This is done in v17, which is also now based on the two patches pulled
> out by Daniel in [1]. Besides the squashes, which make up most of the
> range-diff, I've fixed a call to strncasecmp() which is not available
> on Windows.
>
> Daniel and I discussed trying a Python version of the test server,
> since the standard library there should give us more goodies to work
> with. A proof of concept is in 0009. I think the big question I have
> for it is, how would we communicate what we want the server to do for
> the test? (We could perhaps switch on magic values of the client ID?)
> In the end I'd like to be testing close to 100% of the failure modes,
> and that's likely to mean a lot of back-and-forth if the server
> implementation isn't in the Perl process.

Thanks for the new version, I'm digesting the test patches but for now I have a
few smaller comments:


+#define ALLOC(size) malloc(size)
I wonder if we should use pg_malloc_extended(size, MCXT_ALLOC_NO_OOM) instead
to self document the code.  We clearly don't want feature-parity with server-
side palloc here.  I know we use malloc in similar ALLOC macros so it's not
unique in that regard, but maybe?


+#ifdef FRONTEND
+               destroyPQExpBuffer(lex->errormsg);
+#else
+               pfree(lex->errormsg->data);
+               pfree(lex->errormsg);
+#endif
Wouldn't it be nicer if we abstracted this into a destroyStrVal function to a)
avoid the ifdefs and b) make it more like the rest of the new API?  While it's
only used in two places (close to each other) it's a shame to let the
underlying API bleed through the abstraction.


+   CURLM      *curlm;      /* top-level multi handle for cURL operations */
Nitpick, but curl is not capitalized cURL anymore (for some value of "anymore"
since it changed in 2016 [0]).  I do wonder if we should consistently write
"libcurl" as well since we don't use curl but libcurl.


+   PQExpBufferData     work_data;  /* scratch buffer for general use (remember
+                                      to clear out prior contents first!) */
This seems like asking for subtle bugs due to uncleared buffers bleeding into
another operation (especially since we are writing this data across the wire).
How about having an array the size of OAuthStep of unallocated buffers where
each step uses its own?  Storing the content of each step could also be useful
for debugging.  Looking at the statemachine here it's not an obvious change but
also not impossible.


+ * TODO: This disables DNS resolution timeouts unless libcurl has been
+ * compiled against alternative resolution support. We should check that.
curl_version_info() can be used to check for c-ares support.


+ * so you don't have to write out the error handling every time. They assume
+ * that they're embedded in a function returning bool, however.
It feels a bit iffy to encode the return type in the macro; we can use the same
trick that DISABLE_SIGPIPE employs, where a failaction is passed in.
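Something along these lines (illustrative only, with a simplified check macro rather than the patch's actual one): the caller supplies the failure action, so the macro makes no assumption about the enclosing function's return type.

```c
#include <assert.h>

/*
 * Hypothetical error-check macro in the DISABLE_SIGPIPE style: the
 * caller passes in the failure action, so the macro works in functions
 * returning int, bool, or nothing at all.
 */
#define CHECK_OK(expr, failaction) \
	do { \
		if (!(expr)) \
			failaction; \
	} while (0)

/* One caller bails out with a return value... */
static int
returns_int_on_failure(int ok)
{
	CHECK_OK(ok, return -1);
	return 0;
}

static int	cleanup_ran;

/* ...while another jumps to a cleanup label instead. */
static int
jumps_to_cleanup(int ok)
{
	CHECK_OK(ok, goto fail);
	return 0;

fail:
	cleanup_ran = 1;
	return -1;
}
```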


+  if (!strcmp(name, field->name))
Project style is to test for (strcmp(x,y) == 0) rather than (!strcmp()) to
improve readability.


+  libpq_append_conn_error(conn, "out of memory");
While not introduced in this patch, it's not an ideal pattern to report "out of
memory" errors via a function which may allocate memory.


+  appendPQExpBufferStr(&conn->errorMessage,
+           libpq_gettext("server's error message contained an embedded NULL"));
We should maybe add ", discarding" or something similar after this string to
indicate that there was an actual error which has been thrown away, the error
wasn't that the server passed an embedded NULL.


+#ifdef USE_OAUTH
+       else if (strcmp(mechanism_buf.data, OAUTHBEARER_NAME) == 0 &&
+               !selected_mechanism)
I wonder if we instead should move the guards inside the statement and error
out with "not built with OAuth support" or something similar like how we do
with TLS and other optional components?


+   errdetail("Comma expected, but found character %s.",
+             sanitize_char(*p))));
The %s formatter should be wrapped like '%s' to indicate that the message part
is the character in question (and we can then reuse the translation since the
error message already exist for SCRAM).


+       temp = curl_slist_append(temp, "authorization_code");
+       if (!temp)
+           oom = true;
+
+       temp = curl_slist_append(temp, "implicit");
While not a bug per se, it reads a bit odd to call another operation that can
allocate memory when the oom flag has been set.  I think we can move some
things around a little to make it clearer.
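For example, with a mocked-up slist (to show the shape of the rearrangement, not the exact patch code): once the oom flag is set, no further appends are attempted, and the temp-pointer dance preserves the original list on failure the way curl_slist_append requires.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for curl_slist, with an injectable failure point. */
struct mock_slist
{
	const char *data;
	struct mock_slist *next;
};

static int	fail_after;			/* fail the Nth append (0 = never fail) */
static int	appends;

static struct mock_slist *
mock_slist_append(struct mock_slist *list, const char *s)
{
	struct mock_slist *cell,
			   *tail;

	if (fail_after && ++appends >= fail_after)
		return NULL;			/* simulated allocation failure */

	cell = malloc(sizeof(*cell));
	if (!cell)
		return NULL;
	cell->data = s;
	cell->next = NULL;

	if (!list)
		return cell;
	for (tail = list; tail->next; tail = tail->next)
		;
	tail->next = cell;
	return list;
}

/*
 * Build the grant-type list, skipping further appends once a failure is
 * seen, instead of calling the allocator again with oom already set.
 */
static struct mock_slist *
build_grant_types(int *oom)
{
	struct mock_slist *list = NULL;
	struct mock_slist *temp;
	const char *grants[] = {"authorization_code", "implicit"};

	for (int i = 0; i < 2 && !*oom; i++)
	{
		temp = mock_slist_append(list, grants[i]);
		if (!temp)
			*oom = 1;
		else
			list = temp;
	}
	return list;
}
```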

The attached diff contains some (most?) of the above as a patch on top of your
v17, but as a .txt to keep the CFBot from munging on it.

--
Daniel Gustafsson


Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Andrew Dunstan
Date:
On 2024-02-28 We 09:05, Jacob Champion wrote:
>
> Daniel and I discussed trying a Python version of the test server,
> since the standard library there should give us more goodies to work
> with. A proof of concept is in 0009. I think the big question I have
> for it is, how would we communicate what we want the server to do for
> the test? (We could perhaps switch on magic values of the client ID?)
> In the end I'd like to be testing close to 100% of the failure modes,
> and that's likely to mean a lot of back-and-forth if the server
> implementation isn't in the Perl process.



Can you give some more details about what this python gadget would buy 
us? I note that there are a couple of CPAN modules that provide OAuth2 
servers, not sure if they would be of any use.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 28 Feb 2024, at 22:50, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2024-02-28 We 09:05, Jacob Champion wrote:
>>
>> Daniel and I discussed trying a Python version of the test server,
>> since the standard library there should give us more goodies to work
>> with. A proof of concept is in 0009. I think the big question I have
>> for it is, how would we communicate what we want the server to do for
>> the test? (We could perhaps switch on magic values of the client ID?)
>> In the end I'd like to be testing close to 100% of the failure modes,
>> and that's likely to mean a lot of back-and-forth if the server
>> implementation isn't in the Perl process.
>
> Can you give some more details about what this python gadget would buy us? I note that there are a couple of CPAN
> modules that provide OAuth2 servers, not sure if they would be of any use.

The main benefit would be to be able to provide a full test harness without
adding any additional dependencies over what we already have (Python being
required by meson).  That should ideally make it easy to get good coverage from
BF animals as no installation is needed.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
[re-adding the CC list I dropped earlier]

On Wed, Feb 28, 2024 at 1:52 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 28 Feb 2024, at 22:50, Andrew Dunstan <andrew@dunslane.net> wrote:
> > Can you give some more details about what this python gadget would buy us? I note that there are a couple of CPAN
> > modules that provide OAuth2 servers, not sure if they would be of any use.
>
> The main benefit would be to be able to provide a full test harness without
> adding any additional dependencies over what we already have (Python being
> required by meson).  That should ideally make it easy to get good coverage from
> BF animals as no installation is needed.

As an additional note, the test suite ideally needs to be able to
exercise failure modes where the provider itself is malfunctioning. So
we hand-roll responses rather than deferring to an external
OAuth/OpenID implementation, which adds HTTP and JSON dependencies at
minimum, and Python includes both. See also the discussion with
Stephen upthread [1].

(I do think it'd be nice to eventually include a prepackaged OAuth
server in the test suite, to stack coverage for the happy path and
further test interoperability.)

Thanks,
--Jacob

[1] https://postgr.es/m/CAAWbhmh%2B6q4t3P%2BwDmS%3DJuHBpcgF-VM2cXNft8XV02yk-cHCpQ%40mail.gmail.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 27 Feb 2024, at 20:20, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> On Fri, Feb 23, 2024 at 5:01 PM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> The
>> patchset is now carrying a lot of squash-cruft, and I plan to flatten
>> it in the next version.
>
> This is done in v17, which is also now based on the two patches pulled
> out by Daniel in [1]. Besides the squashes, which make up most of the
> range-diff, I've fixed a call to strncasecmp() which is not available
> on Windows.

Two quick questions:

+   /* TODO */
+   CHECK_SETOPT(actx, CURLOPT_WRITEDATA, stderr);
I might be missing something, but what is this intended for in
setup_curl_handles()?


--- /dev/null
+++ b/src/interfaces/libpq/fe-auth-oauth-iddawc.c
As discussed off-list I think we should leave iddawc support for later and
focus on getting one library properly supported to start with.  If you agree,
let's drop this from the patchset to make it easier to digest.  We should make
sure we keep pluggability such that another library can be supported though,
much like the libpq TLS support.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Feb 28, 2024 at 9:40 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> +#define ALLOC(size) malloc(size)
> I wonder if we should use pg_malloc_extended(size, MCXT_ALLOC_NO_OOM) instead
> to self document the code.  We clearly don't want feature-parity with server-
> side palloc here.  I know we use malloc in similar ALLOC macros so it's not
> unique in that regard, but maybe?

I have a vague recollection that linking fe_memutils into libpq
tripped the exit() checks, but I can try again and see.

> +#ifdef FRONTEND
> +               destroyPQExpBuffer(lex->errormsg);
> +#else
> +               pfree(lex->errormsg->data);
> +               pfree(lex->errormsg);
> +#endif
> Wouldn't it be nicer if we abstracted this into a destroyStrVal function to a)
> avoid the ifdefs and b) make it more like the rest of the new API?  While it's
> only used in two places (close to each other) it's a shame to let the
> underlying API bleed through the abstraction.

Good idea. I'll fold this from your patch into the next set (and do
the same for the ones I've marked +1 below).

> +   CURLM      *curlm;      /* top-level multi handle for cURL operations */
> Nitpick, but curl is not capitalized cURL anymore (for some value of "anymore"
> since it changed in 2016 [0]).  I do wonder if we should consistently write
> "libcurl" as well since we don't use curl but libcurl.

Huh, I missed that memo. Thanks -- that makes it much easier to type!

> +   PQExpBufferData     work_data;  /* scratch buffer for general use (remember
> +                                      to clear out prior contents first!) */
> This seems like asking for subtle bugs due to uncleared buffers bleeding into
> another operation (especially since we are writing this data across the wire).
> How about having an array the size of OAuthStep of unallocated buffers where
> each step uses its own?  Storing the content of each step could also be useful
> for debugging.  Looking at the statemachine here it's not an obvious change but
> also not impossible.

I like that idea; I'll give it a look.

> + * TODO: This disables DNS resolution timeouts unless libcurl has been
> + * compiled against alternative resolution support. We should check that.
> curl_version_info() can be used to check for c-ares support.

+1

> + * so you don't have to write out the error handling every time. They assume
> + * that they're embedded in a function returning bool, however.
> It feels a bit iffy to encode the return type in the macro; we can use the same
> trick that DISABLE_SIGPIPE employs where a failaction is passed in.

+1

> +  if (!strcmp(name, field->name))
> Project style is to test for (strcmp(x,y) == 0) rather than (!strcmp()) to
> improve readability.

+1

> +  libpq_append_conn_error(conn, "out of memory");
> While not introduced in this patch, it's not an ideal pattern to report "out of
> memory" errors via a function which may allocate memory.

Does trying (and failing) to allocate more memory cause any harm? Best
case, we still have enough room in the errorMessage to fit the whole
error; worst case, we mark the errorMessage broken and then
PQerrorMessage() can handle it correctly.

> +  appendPQExpBufferStr(&conn->errorMessage,
> +           libpq_gettext("server's error message contained an embedded NULL"));
> We should maybe add ", discarding" or something similar after this string to
> indicate that there was an actual error which has been thrown away, the error
> wasn't that the server passed an embedded NULL.

+1

> +#ifdef USE_OAUTH
> +       else if (strcmp(mechanism_buf.data, OAUTHBEARER_NAME) == 0 &&
> +               !selected_mechanism)
> I wonder if we instead should move the guards inside the statement and error
> out with "not built with OAuth support" or something similar like how we do
> with TLS and other optional components?

This one seems like a step backwards. IIUC, the client can currently
handle a situation where the server returns multiple mechanisms
(though the server doesn't support that yet), and I'd really like to
make use of that property without making users upgrade libpq.

That said, it'd be good to have a more specific error message in the
case where we don't have a match...

> +   errdetail("Comma expected, but found character %s.",
> +             sanitize_char(*p))));
> The %s formatter should be wrapped like '%s' to indicate that the message part
> is the character in question (and we can then reuse the translation since the
> error message already exist for SCRAM).

+1

> +       temp = curl_slist_append(temp, "authorization_code");
> +       if (!temp)
> +           oom = true;
> +
> +       temp = curl_slist_append(temp, "implicit");
> While not a bug per se, it reads a bit odd to call another operation that can
> allocate memory when the oom flag has been set.  I think we can move some
> things around a little to make it clearer.

I'm not a huge fan of nested happy paths/pyramids of doom, but in this
case it's small enough that I'm not opposed. :D

> The attached diff contains some (most?) of the above as a patch on top of your
> v17, but as a .txt to keep the CFBot from munging on it.

Thanks very much! I plan to apply all but the USE_OAUTH guard change
(but let me know if you feel strongly about it).

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 29, 2024 at 1:08 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> +   /* TODO */
> +   CHECK_SETOPT(actx, CURLOPT_WRITEDATA, stderr);
> I might be missing something, but what is this intended for in
> setup_curl_handles()?

Ah, that's cruft left over from early debugging, just so that I could
see what was going on. I'll remove it.

> --- /dev/null
> +++ b/src/interfaces/libpq/fe-auth-oauth-iddawc.c
> As discussed off-list I think we should leave iddawc support for later and
> focus on getting one library properly supported to start with.  If you agree,
> let's drop this from the patchset to make it easier to digest.  We should make
> sure we keep pluggability such that another library can be supported though,
> much like the libpq TLS support.

Agreed. The number of changes being folded into the next set is
already pretty big so I think this will wait until next+1.

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 29, 2024 at 4:04 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> On Wed, Feb 28, 2024 at 9:40 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> > +       temp = curl_slist_append(temp, "authorization_code");
> > +       if (!temp)
> > +           oom = true;
> > +
> > +       temp = curl_slist_append(temp, "implicit");
> > While not a bug per se, it reads a bit odd to call another operation that can
> > allocate memory when the oom flag has been set.  I think we can move some
> > things around a little to make it clearer.
>
> I'm not a huge fan of nested happy paths/pyramids of doom, but in this
> case it's small enough that I'm not opposed. :D

I ended up rewriting this patch hunk a bit to handle earlier OOM
failures; let me know what you think.

--

v18 is the result of plenty of yak shaving now that the Windows build
is working. In addition to Daniel's changes as discussed upthread,
- I have rebased over v2 of the SASL-refactoring patches
- the last CompilerWarnings failure has been fixed
- the py.test suite now runs on Windows (but does not yet completely pass)
- py.test has been completely disabled for the 32-bit Debian test in
Cirrus; I don't know if there's a way to install 32-bit Python
side-by-side with 64-bit

We are now very, very close to green.

The new oauth_validator tests can't work on Windows, since the client
doesn't support OAuth there. The python/server tests can handle this
case, since they emulate the client behavior; do we want to try
something similar in Perl?

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Feb 29, 2024 at 5:08 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> We are now very, very close to green.

v19 gets us a bit closer by adding a missed import for Windows. I've
also removed iddawc support, so the client patch is lighter.

> The new oauth_validator tests can't work on Windows, since the client
> doesn't support OAuth there. The python/server tests can handle this
> case, since they emulate the client behavior; do we want to try
> something similar in Perl?

In addition to this question, I'm starting to notice intermittent
failures of the form

    error: ... failed to fetch OpenID discovery document: failed to
queue HTTP request

This corresponds to a TODO in the libcurl implementation -- if the
initial call to curl_multi_socket_action() reports that no handles are
running, I treated that as an error. But it looks like it's possible
for libcurl to finish a request synchronously if the remote responds
quickly enough, so that needs to change.
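The control-flow fix can be modeled without libcurl. In the toy driver below (hypothetical types and names, not the patch's code), a "running count" of zero after starting a request must route to a completion check rather than being reported as an error, because a fast server may have finished the transfer synchronously.

```c
#include <stdbool.h>

/*
 * Toy model: start_request() reports how many transfers are still
 * running.  A synchronous completion leaves running == 0 with a
 * result already queued.
 */
struct model
{
	bool finishes_synchronously;
	bool result_queued;
};

static int
start_request(struct model *m)
{
	if (m->finishes_synchronously)
	{
		m->result_queued = true;
		return 0;				/* nothing left running */
	}
	return 1;					/* transfer still in flight */
}

/* Returns true if the caller should keep polling for socket activity. */
static bool
drive_request(struct model *m, bool *failed)
{
	int		running = start_request(m);

	*failed = false;
	if (running == 0)
	{
		if (m->result_queued)
			return false;		/* finished synchronously: not an error */
		*failed = true;			/* genuinely nothing was queued */
		return false;
	}
	return true;				/* wait on sockets as usual */
}
```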

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Mar 1, 2024 at 9:46 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> v19 gets us a bit closer by adding a missed import for Windows. I've
> also removed iddawc support, so the client patch is lighter.

v20 fixes a bunch more TODOs:
1) the client initial response is validated more closely
2) the server's invalid_token parameters are properly escaped into the
containing JSON (though, eventually, we probably want to just reject
invalid HBA settings instead of passing them through to the client)
3) Windows-specific responses have been recorded in the test suite

While poking at item 2, I was reminded that there's an alternative way
to get OAuth parameters from the server, and it's subtly incompatible
with the OpenID spec because OpenID didn't follow the rules for
.well-known URI construction [1]. :( Some sort of knob will be
required to switch the behaviors.

I renamed the API for the validator module from res->authenticated to
res->authorized. Authentication is optional, but a validator *must*
check that the client it's talking to was authorized by the user to
access the server, whether or not the user is authenticated. (It may
additionally verify that the user is authorized to access the
database, or it may simply authenticate the user and defer to the
usermap.) Documenting that particular subtlety is going to be
interesting...

The tests now exercise different issuers for different users, which
will also be a good way to signal the server to respond in different
ways during the validator tests. It does raise the question: if a
third party provides an issuer-specific module, how do we switch
between that and some other module for a different user?

Andrew asked over at [2] if we could perhaps get 0001 in as well. I
think the main thing to figure out there is, is requiring linkage
against libpq (see 0008) going to be okay for the frontend binaries
that need JSON support? Or do we need to do something like moving
PQExpBuffer into src/common to simplify the dependency tree?

--Jacob

[1] https://www.rfc-editor.org/rfc/rfc8414.html#section-5
[2] https://www.postgresql.org/message-id/682c8fff-355c-a04f-57ac-81055c4ccda8%40dunslane.net

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
v21 is a quick rebase over HEAD, which has adopted a few pieces of
v20. I've also fixed a race condition in the tests.

On Mon, Mar 11, 2024 at 3:51 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> Andrew asked over at [2] if we could perhaps get 0001 in as well. I
> think the main thing to figure out there is, is requiring linkage
> against libpq (see 0008) going to be okay for the frontend binaries
> that need JSON support? Or do we need to do something like moving
> PQExpBuffer into src/common to simplify the dependency tree?

0001 has been pared down to the part that teaches jsonapi.c to use
PQExpBuffer and track out-of-memory conditions; the linkage questions
remain.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 22 Mar 2024, at 19:21, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> v21 is a quick rebase over HEAD, which has adopted a few pieces of
> v20. I've also fixed a race condition in the tests.

Thanks for the rebase, I have a few comments from working with it a bit:

In jsonapi.c, makeJsonLexContextCstringLen initializes a JsonLexContext with
palloc0, which would need to be ported over to use ALLOC for frontend code.  On
that note, the error handling in parse_oauth_json() for content-type checks
attempts to free the JsonLexContext even before it has been created.  Here we
can just return false.


-  echo 'libpq must not be calling any function which invokes exit'; exit 1; \
+  echo 'libpq must not be calling any function which invokes exit'; \
The offending codepath in libcurl was in the NTLM_WB module, a very old and
obscure form of NTLM support which was replaced (yet remained in the tree) a
long time ago by a full NTLM implementation.  Based on the findings in this
thread it was deprecated with a removal date set to April 2024 [0].  A bug in
the 8.4.0 release, however, disconnected NTLM_WB from the build, and given the
lack of complaints it was decided to leave it as is, so we can base our libcurl
requirements on 8.4.0 while keeping the exit() check intact.


+  else if (strcasecmp(content_type, "application/json") != 0)
This needs to handle parameters as well since it will now fail if the charset
parameter is appended (which undoubtedly will be pretty common).  The easiest
way is probably to just verify the mediatype and skip the parameters since we
know it can only be charset?


+  /* TODO investigate using conn->Pfdebug and CURLOPT_DEBUGFUNCTION here */
+  CHECK_SETOPT(actx, CURLOPT_VERBOSE, 1L, return false);
+  CHECK_SETOPT(actx, CURLOPT_ERRORBUFFER, actx->curl_err, return false);
CURLOPT_ERRORBUFFER is the old and finicky way of extracting error messages, we
should absolutely move to using CURLOPT_DEBUGFUNCTION instead.


+  /* && response_code != 401 TODO */ )
Why is this marked with a TODO, do you remember?


+  print("# OAuth provider (PID $pid) is listening on port $port\n");
Code running under Test::More needs to use diag() for printing non-test output
like this.


Another issue I have is the sheer size and the fact that so much code is
replaced by subsequent commits, so I took the liberty to squash some of this
down into something less daunting.  The attached v22 retains the 0001 and then
condenses the rest into two commits for frontend and backend parts.  I did drop
the Python pytest patch since I feel that it's unlikely to go in from this
thread (adding pytest seems worthy of its own thread and discussion), and the
weight of it makes this seem scarier than it is.  For using it, it can be
easily applied from the v21 patchset independently.  I did tweak the commit
message to match reality a bit better, but there is a lot of work left there.

The final patch contains fixes for all of the above review comments as well as
some refactoring, smaller clean-ups, and TODO fixes.  If these fixes are
accepted I'll incorporate them into the two commits.

Next I intend to work on writing documentation for this.

--
Daniel Gustafsson

[0] https://curl.se/dev/deprecate.html
[1] https://github.com/curl/curl/pull/12479


Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Mar 28, 2024 at 3:34 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> In jsonapi.c, makeJsonLexContextCstringLen initializes a JsonLexContext with
> palloc0 which would need to be ported over to use ALLOC for frontend code.

Seems reasonable (but see below, too).

> On
> that note, the error handling in parse_oauth_json() for content-type checks
> attempts to free the JsonLexContext even before it has been created.  Here we
> can just return false.

Agreed. They're zero-initialized, so freeJsonLexContext() is safe
IIUC, but it's clearer not to call the free function at all.

But for these additions:

> -   makeJsonLexContextCstringLen(&lex, resp->data, resp->len, PG_UTF8, true);
> +   if (!makeJsonLexContextCstringLen(&lex, resp->data, resp->len, PG_UTF8, true))
> +   {
> +       actx_error(actx, "out of memory");
> +       return false;
> +   }

...since we're using the stack-based API as opposed to the heap-based
API, they shouldn't be possible to hit. Any failures in createStrVal()
are deferred to parse time on purpose.

> -  echo 'libpq must not be calling any function which invokes exit'; exit 1; \
> +  echo 'libpq must not be calling any function which invokes exit'; \
> The offending codepath in libcurl was in the NTLM_WB module, a very old and
> obscure form of NTLM support which was replaced (yet remained in the tree) a
> long time ago by a full NTLM implementation.  Based on the findings in this
> thread it was deprecated with a removal date set to April 2024 [0].  A bug in
> the 8.4.0 release, however, disconnected NTLM_WB from the build, and given the
> lack of complaints it was decided to leave it as is, so we can base our libcurl
> requirements on 8.4.0 while keeping the exit() check intact.

Of the Cirrus machines, it looks like only FreeBSD has a new enough
libcurl for that. Ubuntu won't until 24.04, Debian Bookworm doesn't
have it unless you're running backports, RHEL 9 is still on 7.x... I
think requiring libcurl 8 is effectively saying no one will be able to
use this for a long time. Is there an alternative?

> +  else if (strcasecmp(content_type, "application/json") != 0)
> This needs to handle parameters as well since it will now fail if the charset
> parameter is appended (which undoubtedly will be pretty common).  The easiest
> way is probably to just verify the mediatype and skip the parameters since we
> know it can only be charset?

Good catch. application/json no longer defines charsets officially
[1], so I think we should be able to just ignore them. The new
strncasecmp needs to handle a spurious prefix, too; I have that on my
TODO list.
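A sketch of such a comparison (hypothetical code, not the patch's actual implementation): it skips surrounding whitespace, matches the media type case-insensitively, and then accepts only end-of-string or a parameter separator, which rejects both spurious prefixes and suffixes.

```c
#include <stdbool.h>
#include <string.h>
#include <strings.h>			/* strncasecmp (POSIX) */

/*
 * Compare a Content-Type header value against an expected media type,
 * ignoring any parameters (e.g. "; charset=utf-8").
 */
static bool
media_type_matches(const char *content_type, const char *expected)
{
	size_t		len = strlen(expected);

	/* skip leading whitespace */
	while (*content_type == ' ' || *content_type == '\t')
		content_type++;

	if (strncasecmp(content_type, expected, len) != 0)
		return false;			/* mismatch or spurious prefix */

	content_type += len;

	/* skip whitespace between the media type and any parameters */
	while (*content_type == ' ' || *content_type == '\t')
		content_type++;

	/* only end-of-string or a parameter list may follow */
	return *content_type == '\0' || *content_type == ';';
}
```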

> +  /* TODO investigate using conn->Pfdebug and CURLOPT_DEBUGFUNCTION here */
> +  CHECK_SETOPT(actx, CURLOPT_VERBOSE, 1L, return false);
> +  CHECK_SETOPT(actx, CURLOPT_ERRORBUFFER, actx->curl_err, return false);
> CURLOPT_ERRORBUFFER is the old and finicky way of extracting error messages, we
> should absolutely move to using CURLOPT_DEBUGFUNCTION instead.

This new way doesn't do the same thing. Here's a sample error:

    connection to server at "127.0.0.1", port 56619 failed: failed to
fetch OpenID discovery document: Weird server reply (  Trying
127.0.0.1:36647...
    Connected to localhost (127.0.0.1) port 36647 (#0)
    Mark bundle as not supporting multiuse
    HTTP 1.0, assume close after body
    Invalid Content-Length: value
    Closing connection 0
    )

IMO that's too much noise. Prior to the change, the same error would have been

    connection to server at "127.0.0.1", port 56619 failed: failed to
fetch OpenID discovery document: Weird server reply (Invalid
Content-Length: value)

The error buffer is finicky for sure, but it's also a generic one-line
explanation of what went wrong... Is there an alternative API for that
I'm missing?

> +  /* && response_code != 401 TODO */ )
> Why is this marked with a TODO, do you remember?

Yeah -- I have a feeling that 401s coming back are going to need more
helpful hints to the user, since it implies that libpq itself hasn't
authenticated correctly as opposed to some user-related auth failure.
I was hoping to find some sample behaviors in the wild and record
those into the suite.

> +  print("# OAuth provider (PID $pid) is listening on port $port\n");
> Code running under Test::More needs to use diag() for printing non-test output
> like this.

Ah, thanks.

> +#if LIBCURL_VERSION_MAJOR <= 8 && LIBCURL_VERSION_MINOR < 4

I don't think this catches versions like 7.76, does it? Maybe
`LIBCURL_VERSION_MAJOR < 8 || (LIBCURL_VERSION_MAJOR == 8 &&
LIBCURL_VERSION_MINOR < 4)`, or else `LIBCURL_VERSION_NUM < 0x080400`?
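The miss is easy to demonstrate with hypothetical version constants standing in for curl's headers (7.76.0 here, which the original conjunction fails to flag):

```c
/* Hypothetical version constants, standing in for curl's macros */
#define MY_VERSION_MAJOR 7
#define MY_VERSION_MINOR 76
#define MY_VERSION_NUM   0x074c00	/* 7.76.0 packed as 0xXXYYZZ */

/* Broken guard: 7 <= 8 is true but 76 < 4 is false, so 7.76 slips through */
#define TOO_OLD_BROKEN  (MY_VERSION_MAJOR <= 8 && MY_VERSION_MINOR < 4)

/* Corrected guard: compare major first, then minor */
#define TOO_OLD_FIXED   (MY_VERSION_MAJOR < 8 || \
						 (MY_VERSION_MAJOR == 8 && MY_VERSION_MINOR < 4))

/* Equivalent single comparison against the packed version number */
#define TOO_OLD_PACKED  (MY_VERSION_NUM < 0x080400)
```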

>     my $pid = open(my $read_fh, "-|", $ENV{PYTHON}, "t/oauth_server.py")
> -       // die "failed to start OAuth server: $!";
> +       or die "failed to start OAuth server: $!";
>
> -   read($read_fh, $port, 7) // die "failed to read port number: $!";
> +   read($read_fh, $port, 7) or die "failed to read port number: $!";

The first hunk here looks good (thanks for the catch!) but I think the
second is not correct behavior. $! doesn't get set unless undef is
returned, if I'm reading the docs correctly. Yay Perl.

> +   /* Sanity check the previous operation */
> +   if (actx->running != 1)
> +   {
> +       actx_error(actx, "failed to queue HTTP request");
> +       return false;
> +   }

`running` can be set to zero on success, too. I'm having trouble
forcing that code path in a test so far, but we're going to have to do
something special in that case.

> Another issue I have is the sheer size and the fact that so much code is
> replaced by subsequent commits, so I took the liberty to squash some of this
> down into something less daunting.  The attached v22 retains the 0001 and then
> condenses the rest into two commits for frontend and backend parts.

Looks good.

> I did drop
> the Python pytest patch since I feel that it's unlikely to go in from this
> thread (adding pytest seems worthy of its own thread and discussion), and the
> weight of it makes this seem scarier than it is.

Until its coverage gets ported over, can we keep it as a `DO NOT
MERGE` patch? Otherwise there's not much to run in Cirrus.

> The final patch contains fixes for all of the above review comments as well as
> some refactoring, smaller clean-ups, and TODO fixes.  If these fixes are
> accepted I'll incorporate them into the two commits.
>
> Next I intend to work on writing documentation for this.

Awesome, thank you! I will start adding coverage to the new code paths.

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc7159#section-11



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Apr 1, 2024 at 3:07 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
>
> Awesome, thank you! I will start adding coverage to the new code paths.

This patchset rotted more than I thought it would with the new
incremental JSON, and I got stuck in rebase hell. Rather than chip
away at that while the cfbot is red, here's a rebase of v22 to get the
CI up again, and I will port what I've been working on over that. (So,
for prior reviewers: recent upthread and offline feedback is not yet
incorporated, sorry, come back later.)

The big change in v23 is that I've removed fe_memutils.c from
libpgcommon_shlib completely, to try to reduce my own hair-pulling
when it comes to keeping exit() out of libpq. (It snuck in several
ways with incremental JSON.)

As far as I can tell, removing fe_memutils causes only one problem,
which is that Informix ECPG is relying on pnstrdup(). And I think that
may be a bug in itself? There's code in deccvasc() right after the
pnstrdup() call that takes care of a failed allocation, but the
frontend pnstrdup() is going to call exit() on failure. So my 0001
patch reverts that change, which was made in 0b9466fce. If that can go
in, and I'm not missing something that makes that call okay, maybe
0002 can be peeled off as well.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
Hi Daniel,

On Mon, Apr 1, 2024 at 3:07 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> Of the Cirrus machines, it looks like only FreeBSD has a new enough
> libcurl for that. Ubuntu won't until 24.04, Debian Bookworm doesn't
> have it unless you're running backports, RHEL 9 is still on 7.x... I
> think requiring libcurl 8 is effectively saying no one will be able to
> use this for a long time. Is there an alternative?

Since the exit() checks appear to be happy now that fe_memutils is
out, I've lowered the requirement to the version of libcurl that seems
to be shipped in RHEL 8 (7.61.0). This also happens to be when TLS 1.3
ciphersuite control was added to Curl, which seems like something we
may want in the very near future, so I'm taking that as a good sign
for what is otherwise a very arbitrary cutoff point. Counterproposals
welcome :D

> Good catch. application/json no longer defines charsets officially
> [1], so I think we should be able to just ignore them. The new
> strncasecmp needs to handle a spurious prefix, too; I have that on my
> TODO list.

I've expanded this handling in v24, attached.

> This new way doesn't do the same thing. Here's a sample error:
>
>     connection to server at "127.0.0.1", port 56619 failed: failed to
> fetch OpenID discovery document: Weird server reply (  Trying
> 127.0.0.1:36647...
>     Connected to localhost (127.0.0.1) port 36647 (#0)
>     Mark bundle as not supporting multiuse
>     HTTP 1.0, assume close after body
>     Invalid Content-Length: value
>     Closing connection 0
>     )
>
> IMO that's too much noise. Prior to the change, the same error would have been
>
>     connection to server at "127.0.0.1", port 56619 failed: failed to
> fetch OpenID discovery document: Weird server reply (Invalid
> Content-Length: value)

I have reverted this change for now, but I'm still hoping there's an
alternative that can help us clean up?

> `running` can be set to zero on success, too. I'm having trouble
> forcing that code path in a test so far, but we're going to have to do
> something special in that case.

For whatever reason, the magic timing for this is popping up more and
more often on Cirrus, leading to really annoying test failures. So I
may have to abandon the search for a perfect test case and just fix
it.

> > I did drop
> > the Python pytest patch since I feel that it's unlikely to go in from this
> > thread (adding pytest seems worthy of its own thread and discussion), and the
> > weight of it makes this seem scarier than it is.
>
> Until its coverage gets ported over, can we keep it as a `DO NOT
> MERGE` patch? Otherwise there's not much to run in Cirrus.

I have added this back (marked loudly as don't-merge) so that we keep
the test coverage for now. The Perl suite (plus Python server) has
been tricked out a lot more in v24, so it shouldn't be too bad to get
things ported.

> > Next I intend to work on writing documentation for this.
>
> Awesome, thank you! I will start adding coverage to the new code paths.

Peter E asked for some documentation stubs to ease review, which I've
added. Hopefully that doesn't step on your toes any.

A large portion of your "Review comments" patch has been pulled
backwards into the previous commits; the remaining pieces are things
I'm still peering at and/or writing tests for. I also owe this thread
an updated roadmap and summary, to make it a little less daunting for
new reviewers. Soon (tm).

Thanks!
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
I have some comments about the first three patches, that deal with 
memory management.

v24-0001-Revert-ECPG-s-use-of-pnstrdup.patch

This looks right.

I suppose another approach would be to put a full replacement for 
strndup() into src/port/.  But since there is currently only one user, 
and most other users should be using pnstrdup(), the presented approach 
seems ok.

We should take the check for exit() calls from libpq and expand it to 
cover the other libraries as well.  Maybe there are other problems like 
this?


v24-0002-Remove-fe_memutils-from-libpgcommon_shlib.patch

I don't quite understand how this problem can arise.  The description says

"""
libpq appears to have no need for this, and the exit() references cause
our libpq-refs-stamp test to fail if the linker doesn't strip out the
unused code.
"""

But under what circumstances does "the linker doesn't strip out" happen?
If this happens accidentally, then we should have seen some buildfarm
failures or something?

Also, one could look further and notice that restricted_token.c and 
sprompt.c both a) are not needed by libpq and b) can trigger exit() 
calls.  Then it's not clear why those are not affected.


v24-0003-common-jsonapi-support-libpq-as-a-client.patch

I'm reminded of thread [0].  I think there is quite a bit of confusion 
about the pqexpbuffer vs. stringinfo APIs, and they are probably used 
incorrectly quite a bit.  There are now also programs that use both of 
them!  This patch now introduces another layer on top of them.  I fear, 
at the end, nobody is going to understand any of this anymore.  Also, 
changing all the programs to link in libpq for pqexpbuffer seems like 
the opposite direction from what was suggested in [0].

I think we need to do some deeper thinking here about how we want the 
memory management on the client side to work.  Maybe we could just use 
one API but have some flags or callbacks to control the out-of-memory 
behavior.

[0]: 
https://www.postgresql.org/message-id/flat/16d0beac-a141-e5d3-60e9-323da75f49bf%40eisentraut.org




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
Thanks for working on this patchset, I'm looking over 0004 and 0005 but came
across a thing I wanted to bring up one thing sooner than waiting for the
review. In parse_device_authz we have this:

  {"user_code", JSON_TOKEN_STRING, {&authz->user_code}, REQUIRED},
  {"verification_uri", JSON_TOKEN_STRING, {&authz->verification_uri}, REQUIRED},

  /*
   * The following fields are technically REQUIRED, but we don't use
   * them anywhere yet:
   *
   * - expires_in
   */

  {"interval", JSON_TOKEN_NUMBER, {&authz->interval_str}, OPTIONAL},

Together with a colleague we found that the Azure provider uses "verification_url"
rather than xxx_uri.  Another discrepancy is that it uses a string for the
interval (ie: "interval":"5").  One can of course argue that Azure is wrong and
should feel bad, but I fear that virtually all (major) providers will have
differences like this, so we will have to deal with it in an extensible fashion
(compile time, not runtime configurable).

I was toying with making the json_field name member an array, to allow
variations.  That won't help with the field type differences though, so another
train of thought was to have some form of REQUIRED_XOR where fields can be tied
together.  What do you think about something along these lines?

Another thing, shouldn't we really parse and interpret *all* REQUIRED fields
even if we don't use them, to ensure that the JSON is well-formed?  If the JSON
we get is malformed in any way it seems like the safe/conservative option to
error out.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Jul 29, 2024 at 5:02 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> We should take the check for exit() calls from libpq and expand it to
> cover the other libraries as well.  Maybe there are other problems like
> this?

Seems reasonable, yeah.

> But under what circumstances does "the linker doesn't strip out" happen?
>   If this happens accidentally, then we should have seen some buildfarm
> failures or something?

On my machine, for example, I see differences with optimization
levels. Say you inadvertently call pfree() in a _shlib build, as I did
multiple times upthread. By itself, that shouldn't actually be a
problem (it eventually redirects to free()), so it should be legal to
call pfree(), and with -O2 the build succeeds. But with -Og, the
exit() check trips, and when I disassemble I see that pg_malloc() et
al. have infected the shared object. After all, we did tell the linker
to put that object file in, and we don't ask it to garbage-collect
sections.

> Also, one could look further and notice that restricted_token.c and
> sprompt.c both a) are not needed by libpq and b) can trigger exit()
> calls.  Then it's not clear why those are not affected.

I think it's easier for the linker to omit whole object files rather
than partial ones. If libpq doesn't use any of those APIs there's not
really a reason to trip over it.

(Maybe the _shlib variants should just contain the minimum objects
required to compile.)

> I'm reminded of thread [0].  I think there is quite a bit of confusion
> about the pqexpbuffer vs. stringinfo APIs, and they are probably used
> incorrectly quite a bit.  There are now also programs that use both of
> them!  This patch now introduces another layer on top of them.  I fear,
> at the end, nobody is going to understand any of this anymore.

"anymore"? :)

In all seriousness -- I agree that this isn't sustainable. At the
moment the worst pain (the new layer) is isolated to jsonapi.c, which
seems like an okay place to try something new, since there aren't that
many clients. But to be honest I'm not excited about deciding the Best
Way Forward based on a sample size of JSON.

> Also,
> changing all the programs to link in libpq for pqexpbuffer seems like
> the opposite direction from what was suggested in [0].

(I don't really want to keep that new libpq dependency. We'd just have
to decide where PQExpBuffer is going to go if we're not okay with it.)

> I think we need to do some deeper thinking here about how we want the
> memory management on the client side to work.  Maybe we could just use
> one API but have some flags or callbacks to control the out-of-memory
> behavior.

Any src/common code that needs to handle both in-band and out-of-band
failure modes will still have to decide whether it's going to 1)
duplicate code paths or 2) just act as if in-band failures can always
happen. I think that's probably essential complexity; an ideal API
might make it nicer to deal with but it can't abstract it away.

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Jul 29, 2024 at 1:51 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> Together with a colleague we found that the Azure provider uses "verification_url"
> rather than xxx_uri.

Yeah, I think that's originally a Google-ism. (As far as I can tell
they helped author the spec for this and then didn't follow it. :/ ) I
didn't recall Azure having used it back when I was testing against it,
though, so that's good to know.

> Another discrepancy is that it uses a string for the
> interval (ie: "interval":"5").

Oh, that's a new one. I don't remember needing to hack around that
either; maybe iddawc handled it silently?

> One can of course argue that Azure is wrong and
> should feel bad, but I fear that virtually all (major) providers will have
> differences like this, so we will have to deal with it in an extensible fashion
> (compile time, not runtime configurable).

Such is life... verification_url we will just have to deal with by
default, I think, since Google does/did it too. Not sure about
interval -- but do we want to make our distribution maintainers deal
with a compile-time setting for libpq, just to support various OAuth
flavors? To me it seems like we should just hold our noses and support
known (large) departures in the core.

> I was toying with making the json_field name member an array, to allow
> variations.  That won't help with the field type differences though, so another
> train of thought was to have some form of REQUIRED_XOR where fields can be tied
> together.  What do you think about something along these lines?

If I designed it right, just adding alternative spellings directly to
the fields list should work. (The "required" check is by struct
member, not name, so both spellings can point to the same
destination.) The alternative typing on the other hand might require
something like a new sentinel "type" that will accept both... I hadn't
expected that.
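A minimal sketch of that design (hypothetical types and helpers, not the patch's actual parser): two spellings share one destination pointer, and because the required check walks distinct targets rather than names, either spelling satisfies it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct json_field
{
	const char	 *name;
	const char	**target;		/* where the parsed value lands */
	bool		  required;
};

/* Record a parsed string value into whichever field matches by name. */
static void
assign_field(struct json_field *fields, size_t nfields,
			 const char *name, const char *value)
{
	size_t		i;

	for (i = 0; i < nfields; i++)
	{
		if (strcmp(fields[i].name, name) == 0)
		{
			*fields[i].target = value;
			return;
		}
	}
}

/* The REQUIRED check inspects targets, so any one spelling suffices. */
static bool
check_required(const struct json_field *fields, size_t nfields)
{
	size_t		i;

	for (i = 0; i < nfields; i++)
	{
		if (fields[i].required && *fields[i].target == NULL)
			return false;
	}
	return true;
}
```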

> Another thing, shouldn't we really parse and interpret *all* REQUIRED fields
> even if we don't use them, to ensure that the JSON is well-formed?  If the JSON
> we get is malformed in any way it seems like the safe/conservative option to
> error out.

Good, I was hoping to have a conversation about that. I am fine with
either option in principle. In practice I expect to add code to use
`expires_in` (so that we can pass it to custom OAuth hook
implementations) and `scope` (to check if the server has changed it on
us).

That leaves the provider... Forcing the provider itself to implement
unused stuff in order to interoperate seems like it could backfire on
us, especially since IETF standardized an alternate .well-known URI
[1] that changes some of these REQUIRED things into OPTIONAL. (One way
for us to interpret this: those fields may be required for OpenID, but
your OAuth provider might not be an OpenID provider, and our code
doesn't require OpenID.) I think we should probably tread lightly in
that particular case. Thoughts on that?

Thanks!
--Jacob

[1] https://www.rfc-editor.org/rfc/rfc8414.html



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 30.07.24 00:30, Jacob Champion wrote:
>> But under what circumstances does "the linker doesn't strip out" happen?
>> If this happens accidentally, then we should have seen some buildfarm
>> failures or something?
> On my machine, for example, I see differences with optimization
> levels. Say you inadvertently call pfree() in a _shlib build, as I did
> multiple times upthread. By itself, that shouldn't actually be a
> problem (it eventually redirects to free()), so it should be legal to
> call pfree(), and with -O2 the build succeeds. But with -Og, the
> exit() check trips, and when I disassemble I see that pg_malloc() et
> al. have infected the shared object. After all, we did tell the linker
> to put that object file in, and we don't ask it to garbage-collect
> sections.

I'm tempted to say, this is working as intended.

libpgcommon is built as a static library.  So we can put all the object 
files in the library, and its users only use the object files they 
really need.  So this garbage collection you allude to actually does 
happen, on an object-file level.

You shouldn't use pfree() interchangeably with free(), even if that is 
not enforced because it's the same thing underneath.  First, it just 
makes sense to keep the alloc and free pairs matched up.  And second, on 
Windows there is some additional restriction (vague knowledge) that the 
allocate and free functions must be in the same library, so mixing them 
freely might not even work.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Aug 2, 2024 at 10:13 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> You shouldn't use pfree() interchangeably with free(), even if that is
> not enforced because it's the same thing underneath.  First, it just
> makes sense to keep the alloc and free pairs matched up.  And second, on
> Windows there is some additional restriction (vague knowledge) that the
> allocate and free functions must be in the same library, so mixing them
> freely might not even work.

Ah, I forgot about the CRT problems on Windows. So my statement of
"the linker might not garbage collect" is pretty much irrelevant.

But it sounds like we agree that we shouldn't be using fe_memutils at
all in shlib builds. (If you can't use palloc -- it calls exit -- then
you can't use pfree either.) Is 0002 still worth pursuing, once I've
correctly wordsmithed the commit? Or did I misunderstand your point?

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 02.08.24 19:51, Jacob Champion wrote:
> But it sounds like we agree that we shouldn't be using fe_memutils at
> all in shlib builds. (If you can't use palloc -- it calls exit -- then
> you can't use pfree either.) Is 0002 still worth pursuing, once I've
> correctly wordsmithed the commit? Or did I misunderstand your point?

Yes, I think with an adjusted comment and commit message, the actual 
change makes sense.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Aug 2, 2024 at 11:48 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> Yes, I think with an adjusted comment and commit message, the actual
> change makes sense.

Done in v25.

...along with a bunch of other stuff:

1. All the debug-mode things that we want for testing but not in
production have now been hidden behind a PGOAUTHDEBUG environment
variable, instead of being enabled by default. At the moment, that
means 1) sensitive HTTP traffic gets printed on stderr, 2) plaintext
HTTP is allowed, and 3) servers may DoS the client by sending a
zero-second retry interval (which speeds up testing a lot). I've
resurrected some of Daniel's CURLOPT_DEBUGFUNCTION implementation for
this.

I think this feature needs more thought, but I'm not sure how much. In
particular I don't think a connection string option would be
appropriate (imagine the "fun" a proxy solution would have with a
spray-my-password-to-stderr switch). But maybe it makes sense to
further divide the dangerous behavior up, so that for example you can
debug the HTTP stream without also allowing plaintext connections, or
something. And maybe stricter maintainers would like to compile the
feature out entirely?

2. The verification_url variant from Azure and Google is now directly supported.

@Daniel: I figured out why I wasn't seeing the string-based-interval
issue in my testing. I've been using Azure's v2.0 OpenID endpoint,
which seems to be much more compliant than the original. Since this is
a new feature, would it be okay to just push new users to that
endpoint rather than supporting the previous weirdness in our code?
(Either way, I think we should support verification_url.)

Along those lines, with Azure I'm now seeing that device_code is not
advertised in grant_types_supported... is that new behavior? Or did
iddawc just not care?

3. I've restructured the libcurl calls to allow
curl_multi_socket_action() to synchronously succeed on its first call,
which we've been seeing a lot in the CI as mentioned upthread. This
led to a bunch of refactoring of the top-level state machine, which
had gotten too complex. I'm much happier with the code organization
now, but it's a big diff.

4. I've changed things around to get rid of two modern libcurl
deprecation warnings. I need to ask curl-library about my use of
curl_multi_socket_all(), which seems like it's exactly what our use
case needs.

Thanks,
--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 05.08.24 19:53, Jacob Champion wrote:
> On Fri, Aug 2, 2024 at 11:48 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> Yes, I think with an adjusted comment and commit message, the actual
>> change makes sense.
> 
> Done in v25.
> 
> ...along with a bunch of other stuff:

I have committed 0001, and I plan to backpatch it once the release 
freeze lifts.

I'll work on 0002 next.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 07.08.24 09:34, Peter Eisentraut wrote:
> On 05.08.24 19:53, Jacob Champion wrote:
>> On Fri, Aug 2, 2024 at 11:48 AM Peter Eisentraut 
>> <peter@eisentraut.org> wrote:
>>> Yes, I think with an adjusted comment and commit message, the actual
>>> change makes sense.
>>
>> Done in v25.
>>
>> ...along with a bunch of other stuff:
> 
> I have committed 0001, and I plan to backpatch it once the release 
> freeze lifts.
> 
> I'll work on 0002 next.

I have committed 0002 now.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Sun, Aug 11, 2024 at 11:37 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> I have committed 0002 now.

Thanks Peter! Rebased over both in v26.

--Jacob

Attachment

Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 13.08.24 23:11, Jacob Champion wrote:
> On Sun, Aug 11, 2024 at 11:37 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>> I have committed 0002 now.
> 
> Thanks Peter! Rebased over both in v26.

I have looked again at the jsonapi memory management patch (v26-0001).
As previously mentioned, I think adding a third or fourth (depending
on how you count) memory management API is maybe something we should
avoid.  Also, the weird layering where src/common/ now (sometimes)
depends on libpq seems not great.

I'm thinking, maybe we leave the use of StringInfo at the source code
level, but #define the symbols to use PQExpBuffer.  Something like

#ifdef JSONAPI_USE_PQEXPBUFFER

#define StringInfo PQExpBuffer
#define appendStringInfo appendPQExpBuffer
#define appendBinaryStringInfo appendBinaryPQExpBuffer
#define palloc malloc
//etc.

#endif

(simplified, the argument lists might differ)

Or, if people find that too scary, something like

#ifdef JSONAPI_USE_PQEXPBUFFER

#define jsonapi_StringInfo PQExpBuffer
#define jsonapi_appendStringInfo appendPQExpBuffer
#define jsonapi_appendBinaryStringInfo appendBinaryPQExpBuffer
#define jsonapi_palloc malloc
//etc.

#else

#define jsonapi_StringInfo StringInfo
#define jsonapi_appendStringInfo appendStringInfo
#define jsonapi_appendBinaryStringInfo appendBinaryStringInfo
#define jsonapi_palloc palloc
//etc.

#endif

That way, it's at least more easy to follow the source code because
you see a mostly-familiar API.

Also, we should make this PQExpBuffer-using mode only used by libpq,
not by frontend programs.  So libpq takes its own copy of jsonapi.c
and compiles it using JSONAPI_USE_PQEXPBUFFER.  That will make the
libpq build descriptions a bit more complicated, but everyone who is
not libpq doesn't need to change.

Once you get past all the function renaming, the logic changes in
jsonapi.c all look pretty reasonable.  Refactoring like
allocate_incremental_state() makes sense.

You could add pg_nodiscard attributes to
makeJsonLexContextCstringLen() and makeJsonLexContextIncremental() so
that callers who are using the libpq mode are forced to check for
errors.  Or maybe there is a clever way to avoid even that: Create a
fixed JsonLexContext like

     static const JsonLexContext failed_oom;

and on OOM you return that one from makeJsonLexContext*().  And then
in pg_parse_json(), when you get handed that context, you return
JSON_OUT_OF_MEMORY immediately.

Other than that detail and the need to use freeJsonLexContext(), it
looks like this new mode doesn't impose any additional burden on
callers, since during parsing they need to check for errors anyway,
and this just adds one more error type for out of memory.  That's a good 
outcome.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Aug 26, 2024 at 1:18 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> Or, if people find that too scary, something like
>
> #ifdef JSONAPI_USE_PQEXPBUFFER
>
> #define jsonapi_StringInfo PQExpBuffer
> [...]
>
> That way, it's at least more easy to follow the source code because
> you see a mostly-familiar API.

I was having trouble reasoning about the palloc-that-isn't-palloc code
during the first few drafts, so I will try a round with the jsonapi_
prefix.

> Also, we should make this PQExpBuffer-using mode only used by libpq,
> not by frontend programs.  So libpq takes its own copy of jsonapi.c
> and compiles it using JSONAPI_USE_PQEXPBUFFER.  That will make the
> libpq build descriptions a bit more complicated, but everyone who is
> not libpq doesn't need to change.

Sounds reasonable. It complicates the test coverage situation a little
bit, but I think my current patch was maybe insufficient there anyway,
since the coverage for the backend flavor silently dropped...

> Or maybe there is a clever way to avoid even that: Create a
> fixed JsonLexContext like
>
>      static const JsonLexContext failed_oom;
>
> and on OOM you return that one from makeJsonLexContext*().  And then
> in pg_parse_json(), when you get handed that context, you return
> JSON_OUT_OF_MEMORY immediately.

I like this idea.

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 28.08.24 18:31, Jacob Champion wrote:
> On Mon, Aug 26, 2024 at 4:23 PM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
>> I was having trouble reasoning about the palloc-that-isn't-palloc code
>> during the first few drafts, so I will try a round with the jsonapi_
>> prefix.
> 
> v27 takes a stab at that. I have kept the ALLOC/FREE naming to match
> the strategy in other src/common source files.

This looks pretty good to me.  Maybe on the naming side, this seems like 
a gratuitous divergence:

+#define jsonapi_createStringInfo           makeStringInfo

> The name of the variable JSONAPI_USE_PQEXPBUFFER leads to sections of
> code that look like this:
> 
> +#ifdef JSONAPI_USE_PQEXPBUFFER
> +    if (!new_prediction || !new_fnames || !new_fnull)
> +        return false;
> +#endif
> 
> To me it wouldn't be immediately obvious why "using PQExpBuffer" has
> anything to do with this code; the key idea is that we expect any
> allocations to be able to fail. Maybe a name like JSONAPI_ALLOW_OOM or
> JSONAPI_SHLIB_ALLOCATIONS or...?

Seems ok to me as is.  I think the purpose of JSONAPI_USE_PQEXPBUFFER is 
adequately explained by this comment

+/*
+ * By default, we will use palloc/pfree along with StringInfo.  In libpq,
+ * use malloc and PQExpBuffer, and return JSON_OUT_OF_MEMORY on out-of-memory.
+ */
+#ifdef JSONAPI_USE_PQEXPBUFFER

For some of the other proposed names, I'd be afraid that someone might 
think you are free to mix and match APIs, OOM behavior, and compilation 
options.


Some comments on src/include/common/jsonapi.h:

-#include "lib/stringinfo.h"

I suspect this will fail headerscheck?  Probably needs an exception 
added there.

+#ifdef JSONAPI_USE_PQEXPBUFFER
+#define StrValType PQExpBufferData
+#else
+#define StrValType StringInfoData
+#endif

Maybe use jsonapi_StrValType here.

+typedef struct StrValType StrValType;

I don't think that is needed.  It would just duplicate typedefs that 
already exist elsewhere, depending on what StrValType is set to.

+       bool            parse_strval;
+       StrValType *strval;                     /* only used if 
parse_strval == true */

The parse_strval field could use a better explanation.

I actually don't understand the need for this field.  AFAICT, this is
just used to record whether strval is valid.  But in the cases where
it's not valid, why do we need to record that?  Couldn't you just return
failed_oom in those cases?




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 03.09.24 22:56, Jacob Champion wrote:
>> The parse_strval field could use a better explanation.
>>
>> I actually don't understand the need for this field.  AFAICT, this is
>> just used to record whether strval is valid.
> No, it's meant to track the value of the need_escapes argument to the
> constructor. I've renamed it and moved the assignment to hopefully
> make that a little more obvious. WDYT?

Yes, this is clearer.

This patch (v28-0001) looks good to me now.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 04.09.24 11:28, Peter Eisentraut wrote:
> On 03.09.24 22:56, Jacob Champion wrote:
>>> The parse_strval field could use a better explanation.
>>>
>>> I actually don't understand the need for this field.  AFAICT, this is
>>> just used to record whether strval is valid.
>> No, it's meant to track the value of the need_escapes argument to the
>> constructor. I've renamed it and moved the assignment to hopefully
>> make that a little more obvious. WDYT?
> 
> Yes, this is clearer.
> 
> This patch (v28-0001) looks good to me now.

This has been committed.

About the subsequent patches:

Is there any sense in dealing with the libpq and backend patches 
separately in sequence, or is this split just for ease of handling?

(I suppose the 0004 "review comments" patch should be folded into the 
respective other patches?)

What could be the next steps to keep this moving along, other than stare 
at the remaining patches until we're content with them? ;-)




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
(Thanks for the commit, Peter!)

On Wed, Sep 11, 2024 at 6:44 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>
> > On 11 Sep 2024, at 09:37, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> > Is there any sense in dealing with the libpq and backend patches separately in sequence, or is this split just for ease of handling?
>
> I think it's just to make reviewing a bit easier.  At this point I think they can
> be merged together, it's mostly out of historic reasons IIUC since the patchset
> earlier on supported more than one library.

I can definitely do that (and yeah, it was to make the review slightly
less daunting). The server side could potentially be committed
independently, if you want to parallelize a bit, but it'd have to be
torn back out if the libpq stuff didn't land in time.

> > (I suppose the 0004 "review comments" patch should be folded into the respective other patches?)

Yes. I'm using that patch as a holding area while I write tests for
the hunks, and then moving them backwards.

> I added a warning to autconf in case --with-oauth is used without --with-python
> since this combination will error out in running the tests.  Might be
> superfluous but I had an embarrassingly long headscratcher myself as to why the
> tests kept failing =)

Whoops, sorry. I guess we should just skip them if Python isn't there?

> CURL_IGNORE_DEPRECATION(x;) broke pgindent, it needs to keep the semicolon on
> the outside like CURL_IGNORE_DEPRECATION(x);.  This doesn't really work well
> with how the macro is defined, not sure how we should handle that best (the
> attached makes the style as per how pgindent wants it with the semicolon
> returned).

Ugh... maybe a case for a pre_indent rule in pgindent?

> The oauth_validator test module need to load Makefile.global before exporting
> the symbols from there.

Hm. Why was that passing the CI, though...?

> There is a first stab at documenting the validator module API, more to come (it
> doesn't compile right now).
>
> It contains a pgindent and pgperltidy run to keep things as close to in final
> sync as we can to catch things like the curl deprecation macro mentioned above
> early.

Thanks!

> > What could be the next steps to keep this moving along, other than stare at the remaining patches until we're content with them? ;-)
>
> I'm in the "stare at things" stage now to try and get this into the tree =)

Yeah, and I still owe you all an updated roadmap.

While I fix up the tests, I've also been picking away at the JSON
encoding problem that was mentioned in [1]; the recent SASLprep fix
was fallout from that, since I'm planning to pull in pieces of its
UTF-8 validation. I will eventually want to fuzz the heck out of this.

> To further pick away at this huge patch I propose to merge the SASL message
> length hunk which can be extracted separately.  The attached .txt (to keep the
> CFBot from poking at it) contains a diff which can be committed ahead of the
> rest of this patch to make it a tad smaller and to keep the history of that
> change a bit clearer.

LGTM!

--

Peter asked me if there were plans to provide a "standard" validator
module, say as part of contrib. The tricky thing is that Bearer
validation is issuer-specific, and many providers give you an opaque
token that you're not supposed to introspect at all.

We could use token introspection (RFC 7662) for online verification,
but last I looked at it, no one had actually implemented those
endpoints. For offline verification, I think the best we could do
would be to provide a generic JWT Profile (RFC 9068) validator, but
again I don't know if anyone is actually providing those token formats
in practice. I'm inclined to push that out into the future.
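For reference, the RFC 7662 exchange such a validator would do is
simple: a form-encoded POST of the token to the (authenticated,
issuer-specific) introspection endpoint, and a JSON response in which
only "active" is guaranteed to appear. A Python sketch of the decision
logic, with an illustrative scope name:

```python
from urllib.parse import urlencode

def build_introspection_request(token):
    """Sketch of an RFC 7662 request body. The endpoint URL is
    issuer-specific, and the RFC requires the caller to authenticate
    to it somehow (client credentials, a bearer token, etc.)."""
    body = urlencode({"token": token, "token_type_hint": "access_token"})
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return body, headers

def token_is_usable(introspection, required_scope):
    # "active" is the only REQUIRED response member; scope, exp, aud,
    # client_id and friends are all optional, which is part of why a
    # generic validator is hard to ship.
    if not introspection.get("active"):
        return False
    granted = introspection.get("scope", "").split()
    return required_scope in granted
```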

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/ZjxQnOD1OoCkEeMN%40paquier.xyz



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Wed, Sep 11, 2024 at 3:54 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> Yeah, and I still owe you all an updated roadmap.

Okay, here goes. New reviewers: start here!

== What is This? ==

OAuth 2.0 is a way for a trusted third party (a "provider") to tell a
server whether a client on the other end of the line is allowed to do
something. This patchset adds OAuth support to libpq with libcurl,
provides a server-side API so that extension modules can add support
for specific OAuth providers, and extends our SASL support to carry
the OAuth access tokens over the OAUTHBEARER mechanism.

Most OAuth clients use a web browser to perform the third-party
handshake. (These are your "Okta logins", "sign in with XXX", etc.)
But there are plenty of people who use psql without a local browser,
and invoking a browser safely across all supported platforms is
actually surprisingly fraught. So this patchset implements something
called device authorization, where the client will display a link and
a code, and then you can log in on whatever device is convenient for
you. Once you've told your provider that you trust libpq to connect to
Postgres on your behalf, it'll give libpq an access token, and libpq
will forward that on to the server.
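In RFC 8628 terms, that handshake looks roughly like the following
(Python for brevity; http_post is a stand-in for the real transport,
and the endpoint URLs come from the provider's discovery document):

```python
import time

def device_flow(http_post, device_endpoint, token_endpoint, client_id, scope):
    """Sketch of the RFC 8628 device authorization grant.
    http_post(url, params) -> dict stands in for real HTTP."""
    # Step 1: ask the provider for a device code and a short user code
    auth = http_post(device_endpoint, {"client_id": client_id, "scope": scope})
    print(f"Visit {auth['verification_uri']} and enter the code: {auth['user_code']}")

    # Step 2: poll the token endpoint until the user approves (or doesn't)
    interval = auth.get("interval", 5)      # 5 seconds is the RFC default
    while True:
        resp = http_post(token_endpoint, {
            "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
            "device_code": auth["device_code"],
            "client_id": client_id,
        })
        if "access_token" in resp:
            return resp["access_token"]
        if resp.get("error") == "authorization_pending":
            pass                            # user hasn't finished yet; retry
        elif resp.get("error") == "slow_down":
            interval += 5                   # server-mandated backoff
        else:
            raise RuntimeError(resp.get("error", "unknown error"))
        time.sleep(interval)
```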

== How This Fits, or: The Sales Pitch ==

The most popular third-party auth methods we have today are probably
the Kerberos family (AD/GSS/SSPI) and LDAP. If you're not already in
an MS ecosystem, it's unlikely that you're using the former. And users
of the latter are, in my experience, more-or-less resigned to its use,
in spite of LDAP's architectural security problems and the fact that
you have to run weird synchronization scripts to tell Postgres what
certain users are allowed to do.

OAuth provides a decently mature and widely-deployed third option. You
don't have to be running the infrastructure yourself, as long as you
have a provider you trust. If you are running your own infrastructure
(or if your provider is configurable), the tokens being passed around
can carry org-specific user privileges, so that Postgres can figure
out who's allowed to do what without out-of-band synchronization
scripts. And those access tokens are a straight upgrade over
passwords: even if they're somehow stolen, they are time-limited, they
are optionally revocable, and they can be scoped to specific actions.

== Extension Points ==

This patchset provides several points of customization:

Server-side validation is farmed out entirely to an extension, which
we do not provide. (Each OAuth provider is free to come up with its
own proprietary method of verifying its access tokens, and so far the
big players have absolutely not standardized.) Depending on the
provider, the extension may need to contact an external server to see
what the token has been authorized to do, or it may be able to do that
offline using signing keys and an agreed-upon token format.

The client driver using libpq may replace the device authorization
prompt (which by default is done on standard error), for example to
move it into an existing GUI, display a scannable QR code instead of a
link, and so on.

The driver may also replace the entire OAuth flow. For example, a
client that already interacts with browsers may be able to use one of
the more standard web-based methods to get an access token. And
clients attached to a service rather than an end user could use a more
straightforward server-to-server flow, with pre-established
credentials.

== Architecture ==

The client needs to speak HTTP, which is implemented entirely with
libcurl. Originally, I used another OAuth library for rapid
prototyping, but the quality just wasn't there and I ported the
implementation. An internal abstraction layer remains in the libpq
code, so if a better client library comes along, switching to it
shouldn't be too painful.

The client-side hooks all go through a single extension point, so that
we don't continually add entry points in the API for each new piece of
authentication data that a driver may be able to provide. If we wanted
to, we could potentially move the existing SSL passphrase hook into
that, or even handle password retries within libpq itself, but I don't
see any burning reason to do that now.

I wanted to make sure that OAuth could be dropped into existing
deployments without driver changes. (Drivers will probably *want* to
look at the extension hooks for better UX, but they shouldn't
necessarily *have* to.) That has driven several parts of the design.

Drivers using the async APIs should continue to work without blocking,
even during the long HTTP handshakes. So the new client code is
structured as a typical event-driven state machine (similar to
PQconnectPoll). The protocol machine hands off control to the OAuth
machine during authentication, without really needing to know how it
works, because the OAuth machine replaces the PQsocket with a
general-purpose multiplexer that handles all of the HTTP sockets and
events. Once that's completed, the OAuth machine hands control right
back and we return to the Postgres protocol on the wire.
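For anyone who hasn't stared at PQconnectPoll before, the caller's side
of that contract looks roughly like this (sketched in Python; the real
interface is of course C, and these names are illustrative):

```python
import enum

class Polling(enum.Enum):
    READING = enum.auto()   # wait until the socket/multiplexer is readable
    WRITING = enum.auto()   # ...or writable
    OK = enum.auto()
    FAILED = enum.auto()

def drive(poll, sock, wait_readable, wait_writable):
    """Caller-side loop for a PQconnectPoll-style machine: each poll()
    call does a bounded amount of non-blocking work and reports which
    socket condition it needs next; the caller owns all the waiting.
    During the OAuth handshake, sock is the general-purpose multiplexer
    rather than the usual Postgres connection socket."""
    while True:
        status = poll()
        if status in (Polling.OK, Polling.FAILED):
            return status
        if status is Polling.READING:
            wait_readable(sock)             # e.g. select()/poll() for read
        else:
            wait_writable(sock)             # e.g. select()/poll() for write
```

The point is that an existing async driver already implements this loop
for PQconnectPoll, so it keeps working unchanged as long as the handle
it waits on behaves like a socket.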

This decision led to a major compromise: Windows client support is
nonexistent. Multiplexer handles exist in Windows (for example with
WSAEventSelect, IIUC), but last I checked they were completely
incompatible with Winsock select(), which means existing async-aware
drivers would fail. We could compromise by providing synchronous-only
support, or by cobbling together a socketpair plus thread pool (or
IOCP?), or simply by saying that existing Windows clients need a new
API other than PQsocket() to be able to work properly. None of those
approaches have been attempted yet, though.

== Areas of Concern ==

Here are the iffy things that a committer is signing up for:

The client implementation is roughly 3k lines, requiring domain
knowledge of Curl, HTTP, JSON, and OAuth, the specifications of which
are spread across several separate standards bodies. (And some big
providers ignore those anyway.)

The OAUTHBEARER mechanism is extensible, but not in the same way as
HTTP. So sometimes, it looks like people design new OAuth features
that rely heavily on HTTP and forget to "port" them over to SASL. That
may be a point of future frustration.

C is not really anyone's preferred language for implementing an
extensible authn/z protocol running on top of HTTP, and constant
vigilance is going to be required to maintain safety. What's more, we
don't really "trust" the endpoints we're talking to in the same way
that we normally trust our servers. It's a fairly hostile environment
for maintainers.

Along the same lines, our JSON implementation assumes some level of
trust in the JSON data -- which is true for the backend, and can be
assumed for a DBA running our utilities, but is absolutely not the
case for a libpq client downloading data from Some Server on the
Internet. I've been working to fuzz the implementation and there are a
few known problems registered in the CF already.

Curl is not a lightweight dependency by any means. Typically, libcurl
is configured with a wide variety of nice options, a tiny subset of
which we're actually going to use, but all that code (and its
transitive dependencies!) is going to arrive in our process anyway.
That might not be a lot of fun if you're not using OAuth.

It's possible that the application embedding libpq is also a direct
client of libcurl. We need to make sure we're not stomping on their
toes at any point.

== TODOs/Known Issues ==

The client does not deal with verification failure well at the moment;
it just keeps retrying with a new OAuth handshake.

Some people are not going to be okay with just contacting any web
server that Postgres tells them to. There's a more paranoid mode
sketched out that lets the connection string specify the trusted
issuer, but it's not complete.

The new code still needs to play well with orthogonal connection
options, like connect_timeout and require_auth.

The server does not deal well with multi-issuer setups yet. And you
only get one oauth_validator_library...

Harden, harden, harden. There are still a handful of inline TODOs
around double-checking certain pieces of the response before
continuing with the handshake. Servers should not be able to run our
recursive descent parser out of stack. And my JSON code is using
assertions too liberally, which will turn bugs into DoS vectors. I've
been working to fit a fuzzer into more and more places, and I'm hoping
to eventually drive it directly from the socket.

Documentation still needs to be filled in. (Thanks Daniel for your work here!)

== Future Features ==

There is no support for token caching (refresh or otherwise). Each new
connection needs a new approval, and the only way to change that for
v1 is to replace the entire flow. I think that's eventually going to
annoy someone. The question is, where do you persist it? Does that
need to be another extensibility point?

We already have pretty good support for client certificates, and it'd
be great if we could bind our tokens to those. That way, even if you
somehow steal the tokens, you can't do anything with them without the
private key! But the state of proof-of-possession in OAuth is an
absolute mess, involving at least three competing standards (Token
Binding, mTLS, DPoP). I don't know what's going to win.

--

Hope this helps! Next I'll be working to fold the patches together, as
discussed upthread.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Antonin Houska
Date:
Jacob Champion <jacob.champion@enterprisedb.com> wrote:

> Peter asked me if there were plans to provide a "standard" validator
> module, say as part of contrib. The tricky thing is that Bearer
> validation is issuer-specific, and many providers give you an opaque
> token that you're not supposed to introspect at all.
>
> We could use token introspection (RFC 7662) for online verification,
> but last I looked at it, no one had actually implemented those
> endpoints. For offline verification, I think the best we could do
> would be to provide a generic JWT Profile (RFC 9068) validator, but
> again I don't know if anyone is actually providing those token formats
> in practice. I'm inclined to push that out into the future.

Have you considered sending the token for validation to the server, like this

curl -X GET "https://www.googleapis.com/oauth2/v3/userinfo" -H "Authorization: Bearer $TOKEN"

and getting the userid (e.g. email address) from the response, as described in
[1]? ISTM that this is what pgadmin4 does - in particular, see the
get_user_profile() function in web/pgadmin/authenticate/oauth2.py.

[1] https://www.oauth.com/oauth2-servers/signing-in-with-google/verifying-the-user-info/

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Sep 27, 2024 at 10:58 AM Antonin Houska <ah@cybertec.at> wrote:
> Have you considered sending the token for validation to the server, like this
>
> curl -X GET "https://www.googleapis.com/oauth2/v3/userinfo" -H "Authorization: Bearer $TOKEN"

In short, no, but I'm glad you asked. I think it's going to be a
common request, and I need to get better at explaining why it's not
safe, so we can document it clearly. Or else someone can point out
that I'm misunderstanding, which honestly would make all this much
easier and less complicated. I would love to be able to do it that
way.

We cannot, for the same reason libpq must send the server an access
token instead of an ID token. The /userinfo endpoint tells you who the
end user is, but it doesn't tell you whether the Bearer is actually
allowed to access the database. That difference is critical: it's
entirely possible for an end user to be authorized to access the
database, *and yet* the Bearer token may not actually carry that
authorization on their behalf. (In fact, the user may have actively
refused to give the Bearer that permission.) That's why people are so
pedantic about saying that OAuth is an authorization framework and not
an authentication framework.

To illustrate, think about all the third-party web services out there
that ask you to Sign In with Google. They ask Google for permission to
access your personal ID, and Google asks you if you're okay with that,
and you either allow or deny it. Now imagine that I ran one of those
services, and I decided to become evil. I could take my legitimately
acquired Bearer token -- which should only give me permission to query
your Google ID -- and send it to a Postgres database you're authorized
to access.

The server is supposed to introspect it, say, "hey, this token doesn't
give the bearer access to the database at all," and shut everything
down. For extra credit, the server could notice that the client ID
tied to the access token isn't even one that it recognizes! But if all
the server does is ask Google, "what's the email address associated
with this token's end user?", then it's about to make some very bad
decisions. The email address it gets back doesn't belong to Jacob the
Evil Bearer; it belongs to you.
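To make that concrete: a validator that stops at identity is the
dangerous one. The safe shape checks what the token *authorizes* before
it ever looks at who the user is. A Python sketch of the decision logic
only (claim names here are illustrative, since providers differ, and the
real validator API in the patchset is a C extension interface):

```python
def validate(claims, expected_client_ids, required_scope):
    """Sketch: decide (authorized, authn_id) from introspected claims.
    Identity is consulted only *after* authorization checks pass."""
    # Was this token even issued to a client we recognize?
    if claims.get("client_id") not in expected_client_ids:
        return False, None
    # Does the token actually grant database access?
    if required_scope not in claims.get("scope", "").split():
        return False, None
    # Only now is the identity worth returning, e.g. for pg_ident mapping
    return True, claims.get("username")

# The evil-bearer case: a perfectly valid token whose only scope is
# reading the user's profile must be rejected here, no matter whose
# email /userinfo would report for it.
```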

Now, the token introspection endpoint I mentioned upthread should give
us the required information (scopes, etc.). But Google doesn't
implement that one. In fact they don't seem to have implemented custom
scopes at all in the years since I started work on this feature, which
makes me think that people are probably not going to be able to safely
log into Postgres using Google tokens. Hopefully there's some feature
buried somewhere that I haven't seen.

Let me know if that makes sense. (And again: I'd love to be proven
wrong. It would improve the reach of the feature considerably if I
am.)

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Antonin Houska
Date:
Jacob Champion <jacob.champion@enterprisedb.com> wrote:

> On Fri, Sep 27, 2024 at 10:58 AM Antonin Houska <ah@cybertec.at> wrote:
> > Have you considered sending the token for validation to the server, like this
> >
> > curl -X GET "https://www.googleapis.com/oauth2/v3/userinfo" -H "Authorization: Bearer $TOKEN"
>
> In short, no, but I'm glad you asked. I think it's going to be a
> common request, and I need to get better at explaining why it's not
> safe, so we can document it clearly. Or else someone can point out
> that I'm misunderstanding, which honestly would make all this much
> easier and less complicated. I would love to be able to do it that
> way.
>
> We cannot, for the same reason libpq must send the server an access
> token instead of an ID token. The /userinfo endpoint tells you who the
> end user is, but it doesn't tell you whether the Bearer is actually
> allowed to access the database. That difference is critical: it's
> entirely possible for an end user to be authorized to access the
> database, *and yet* the Bearer token may not actually carry that
> authorization on their behalf. (In fact, the user may have actively
> refused to give the Bearer that permission.)

> That's why people are so pedantic about saying that OAuth is an
> authorization framework and not an authentication framework.

This statement alone sounds as if you missed *authentication*, but you seem to
admit above that the /userinfo endpoint provides it ("tells you who the end
user is"). I agree that it does. My understanding is that this endpoint, as
well as the concept of "claims" and "scopes", is introduced by OpenID, which
is an *authentication* framework, although it's built on top of OAuth.

Regarding *authorization*, I agree that the bearer token may not contain
enough information to determine whether the owner of the token is allowed to
access the database. However, I consider database a special kind of
"application", which can handle authorization on its own. In this case, the
authorization can be controlled by (not) assigning the user the LOGIN
attribute, as well as by (not) granting it privileges on particular database
objects. In short, I think that *authentication* is all we need.

> To illustrate, think about all the third-party web services out there
> that ask you to Sign In with Google. They ask Google for permission to
> access your personal ID, and Google asks you if you're okay with that,
> and you either allow or deny it. Now imagine that I ran one of those
> services, and I decided to become evil. I could take my legitimately
> acquired Bearer token -- which should only give me permission to query
> your Google ID -- and send it to a Postgres database you're authorized
> to access.
>
> The server is supposed to introspect it, say, "hey, this token doesn't
> give the bearer access to the database at all," and shut everything
> down. For extra credit, the server could notice that the client ID
> tied to the access token isn't even one that it recognizes! But if all
> the server does is ask Google, "what's the email address associated
> with this token's end user?", then it's about to make some very bad
> decisions. The email address it gets back doesn't belong to Jacob the
> Evil Bearer; it belongs to you.

Are you sure you can legitimately acquire the bearer token containing my email
address? I think the email address returned by the /userinfo endpoint is one
of the standard claims [1]. Thus by returning the particular value of "email"
from the endpoint the identity provider asserts that the token owner does have
this address. (And, if the "email_verified" claim is "true", that it spent
some effort to verify that the email address is controlled by that user.)

> Now, the token introspection endpoint I mentioned upthread

Can you please point me to the particular message?

> should give us the required information (scopes, etc.). But Google doesn't
> implement that one. In fact they don't seem to have implemented custom
> scopes at all in the years since I started work on this feature, which makes
> me think that people are probably not going to be able to safely log into
> Postgres using Google tokens. Hopefully there's some feature buried
> somewhere that I haven't seen.
>
> Let me know if that makes sense. (And again: I'd love to be proven
> wrong. It would improve the reach of the feature considerably if I
> am.)

Another question, assuming the token verification is resolved somehow:
wouldn't it be sufficient for the initial implementation if the client could
pass the bearer token to libpq in the connection string?

Obviously, one use case is that an application / web server which needs the
token to authenticate the user could eventually pass that token on to the
database server. Thus, if users could authenticate to the database using their
individual IDs, it would no longer be necessary to store a separate userid /
password for the application in a configuration file.

Also, if libpq accepted the bearer token via the connection string, it would
be possible to implement the authorization as a separate front-end application
(e.g. pg_oauth_login) rather than adding more complexity to libpq itself.

(I'm learning this stuff on-the-fly, so there might be something naive in my
comments.)

[1] https://openid.net/specs/openid-connect-core-1_0.html#StandardClaims

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Antonin Houska
Date:
Antonin Houska <ah@cybertec.at> wrote:

> Jacob Champion <jacob.champion@enterprisedb.com> wrote:
> > Now, the token introspection endpoint I mentioned upthread
> 
> Can you please point me to the particular message?

Please ignore this dumb question. You probably referred to the email I was
responding to.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Sep 30, 2024 at 6:38 AM Antonin Houska <ah@cybertec.at> wrote:
>
> Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> > On Fri, Sep 27, 2024 at 10:58 AM Antonin Houska <ah@cybertec.at> wrote:
> > That's why people are so pedantic about saying that OAuth is an
> > authorization framework and not an authentication framework.
>
> This statement alone sounds as if you missed *authentication*, but you seem to
> admit above that the /userinfo endpoint provides it ("tells you who the end
> user is"). I agree that it does. My understanding is that this endpoint, as
> well as the concept of "claims" and "scopes", is introduced by OpenID, which
> is an *authentication* framework, although it's built on top of OAuth.

OpenID is an authentication framework, but it's generally focused on a
type of client known as a Relying Party. In the architecture of this
patchset, the Relying Party would be libpq, which has the option of
retrieving authentication claims from the provider. Unfortunately for
us, libpq has no use for those claims. It's not trying to authenticate
the user for its own purposes.

The Postgres server, on the other hand, is not a Relying Party. (It's
an OAuth resource server, in this architecture.) It's not performing
any of the OIDC flows, it's not talking to the end user and the
provider at the same time, and it is very restricted in its ability to
influence the client exchange via the SASL mechanism.

> Regarding *authorization*, I agree that the bearer token may not contain
> enough information to determine whether the owner of the token is allowed to
> access the database. However, I consider a database a special kind of
> "application", which can handle authorization on its own. In this case, the
> authorization can be controlled by (not) assigning the user the LOGIN
> attribute, as well as by (not) granting it privileges on particular database
> objects. In short, I think that *authentication* is all we need.

Authorizing the *end user's* access to the database using scopes is
optional. Authorizing the *bearer's* ability to connect on behalf of
the end user, however, is mandatory. Hopefully the below clarifies.

(I agree that most people probably want to use authentication, so that
the database can then make decisions based on HBA settings. OIDC is a
fine way to do that.)

> Are you sure you can legitimately acquire the bearer token containing my email
> address?

Yes. In general that's how OpenID-based "Sign in with <Service>"
works. All those third-party services are running around with tokens
that identify you, but unless they've asked for more abilities and
you've granted them the associated scopes, identifying you is all they
can do.

> I think the email address returned by the /userinfo endpoint is one
> of the standard claims [1]. Thus by returning the particular value of "email"
> from the endpoint the identity provider asserts that the token owner does have
> this address.

We agree that /userinfo gives authentication claims for the end user.
It's just insufficient for our use case.

For example, there are enterprise applications out there that will ask
for read access to your Google Calendar. If you're willing to grant
that, then you probably won't mind if those applications also know
your email address, but you probably do mind if they're suddenly able
to access your production databases just because you gave them your
email.

Put another way: if you log into Postgres using OAuth, and your
provider doesn't show you a big message saying "this application is
about to access *your* prod database using *your* identity; do you
want to allow that?", then your DBA has deployed a really dangerous
configuration. That's a critical protection feature you get from your
OAuth provider. Otherwise, what's stopping somebody else from setting
up their own malicious service to farm access tokens? All they'd have
to do is ask for your email.

> Another question, assuming the token verification is resolved somehow:
> wouldn't it be sufficient for the initial implementation if the client could
> pass the bearer token to libpq in the connection string?

It was discussed wayyy upthread:

    https://postgr.es/m/CAAWbhmhmBe9v3aCffz5j8Sg4HMWWkB5FvTDCSZ_Vh8E1fX91Gw%40mail.gmail.com

Basically, at that point the entire implementation becomes an exercise
for the reader. I want to avoid that if possible. I'm not adamantly
opposed to it, but I think the client-side hook implementation is
going to be better for the use cases that have been discussed so far.

> Also, if libpq accepted the bearer token via the connection string, it would
> be possible to implement the authorization as a separate front-end application
> (e.g. pg_oauth_login) rather than adding more complexity to libpq itself.

The application would still need to parse the server error response.
There was (a small) consensus at the time [1] that parsing error
messages for that purpose would be really unpleasant; hence the hook
architecture.

> (I'm learning this stuff on-the-fly, so there might be something naive in my
> comments.)

No worries! Please keep the questions coming; this OAuth architecture
is unintuitive, and I need to be able to defend it.

Thanks,
--Jacob

[1] https://postgr.es/m/CACrwV54_euYe%2Bv7bcLrxnje-JuM%3DKRX5azOcmmrXJ5qrffVZfg%40mail.gmail.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Antonin Houska
Date:
Jacob Champion <jacob.champion@enterprisedb.com> wrote:

> On Mon, Sep 30, 2024 at 6:38 AM Antonin Houska <ah@cybertec.at> wrote:
> >
> > Are you sure you can legitimately acquire the bearer token containing my email
> > address?
>
> Yes. In general that's how OpenID-based "Sign in with <Service>"
> works. All those third-party services are running around with tokens
> that identify you, but unless they've asked for more abilities and
> you've granted them the associated scopes, identifying you is all they
> can do.
>
> > I think the email address returned by the /userinfo endpoint is one
> > of the standard claims [1]. Thus by returning the particular value of "email"
> > from the endpoint the identity provider asserts that the token owner does have
> > this address.
>
> We agree that /userinfo gives authentication claims for the end user.
> It's just insufficient for our use case.
>
> For example, there are enterprise applications out there that will ask
> for read access to your Google Calendar. If you're willing to grant
> that, then you probably won't mind if those applications also know
> your email address, but you probably do mind if they're suddenly able
> to access your production databases just because you gave them your
> email.
>
> Put another way: if you log into Postgres using OAuth, and your
> provider doesn't show you a big message saying "this application is
> about to access *your* prod database using *your* identity; do you
> want to allow that?", then your DBA has deployed a really dangerous
> configuration. That's a critical protection feature you get from your
> OAuth provider. Otherwise, what's stopping somebody else from setting
> up their own malicious service to farm access tokens? All they'd have
> to do is ask for your email.

Perhaps I understand now. I use getmail [2] to retrieve email messages from my
Google account. What confused me is that the getmail application, although it
is installed on my workstation (and thus the bearer token it eventually gets
contains my email address), is "someone else" (namely the "Relying Party")
from the perspective of the OpenID protocol. And the same
applies to "psql" in the context of your patch.

Thus, in addition to the email, we'd need special claims which authorize the
RPs to access the database and only the database. Does this sound correct?

> > (I'm learning this stuff on-the-fly, so there might be something naive in my
> > comments.)
>
> No worries! Please keep the questions coming; this OAuth architecture
> is unintuitive, and I need to be able to defend it.

I'd like to play with the code a bit and provide some review before or during
the next CF. That will probably generate some more questions.

>
> [1] https://postgr.es/m/CACrwV54_euYe%2Bv7bcLrxnje-JuM%3DKRX5azOcmmrXJ5qrffVZfg%40mail.gmail.com

[2] https://github.com/getmail6/getmail6/

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Oct 8, 2024 at 3:46 AM Antonin Houska <ah@cybertec.at> wrote:
> Perhaps I understand now. I use getmail [2] to retrieve email messages from my
> Google account. What made me confused is that the getmail application,
> although installed on my workstation (and thus the bearer token it eventually
> gets contains my email address), it's "someone else" (in particular the
> "Relying Party") from the perspective of the OpenID protocol. And the same
> applies to "psql" in the context of your patch.
>
> Thus, in addition to the email, we'd need special claims which authorize the
> RPs to access the database and only the database. Does this sound correct?

Yes. (One nitpick: the "special claims" in this case are not OpenID
claims at all, but OAuth scopes. The HBA will be configured with the
list of scopes that the server requires, and it requests those from
the client during the SASL handshake.)
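(Editor's sketch: the comparison a server-side validator would make is just a
subset test over space-separated scope strings; the scope names here are
hypothetical.)

```python
def scopes_satisfied(required, granted):
    """True if every space-separated scope required by the HBA entry
    appears in the token's granted scope string (order-insensitive)."""
    return set(required.split()) <= set(granted.split())
```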

> I'd like to play with the code a bit and provide some review before or during
> the next CF. That will probably generate some more questions.

Thanks very much for the review!

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Alexander Lakhin
Date:
Hello Peter,

11.09.2024 10:37, Peter Eisentraut wrote:
>
> This has been committed.
>

I've discovered that starting from 0785d1b8b,
make check -C src/bin/pg_combinebackup
fails under Valgrind, with the following diagnostics:
2024-10-15 14:29:52.883 UTC [3338981] 002_compare_backups.pl STATEMENT:  UPLOAD_MANIFEST
==00:00:00:20.028 3338981== Conditional jump or move depends on uninitialised value(s)
==00:00:00:20.028 3338981==    at 0xA3E68F: json_lex (jsonapi.c:1496)
==00:00:00:20.028 3338981==    by 0xA3ED13: json_lex (jsonapi.c:1666)
==00:00:00:20.028 3338981==    by 0xA3D5AF: pg_parse_json_incremental (jsonapi.c:822)
==00:00:00:20.028 3338981==    by 0xA40ECF: json_parse_manifest_incremental_chunk (parse_manifest.c:194)
==00:00:00:20.028 3338981==    by 0x31656B: FinalizeIncrementalManifest (basebackup_incremental.c:237)
==00:00:00:20.028 3338981==    by 0x73B4A4: UploadManifest (walsender.c:709)
==00:00:00:20.028 3338981==    by 0x73DF4A: exec_replication_command (walsender.c:2185)
==00:00:00:20.028 3338981==    by 0x7C58C3: PostgresMain (postgres.c:4762)
==00:00:00:20.028 3338981==    by 0x7BBDA7: BackendMain (backend_startup.c:107)
==00:00:00:20.028 3338981==    by 0x6CF60F: postmaster_child_launch (launch_backend.c:274)
==00:00:00:20.028 3338981==    by 0x6D546F: BackendStartup (postmaster.c:3415)
==00:00:00:20.028 3338981==    by 0x6D2B21: ServerLoop (postmaster.c:1648)
==00:00:00:20.028 3338981==

(Initializing
         dummy_lex.inc_state = NULL;
before
         partial_result = json_lex(&dummy_lex);
makes these TAP tests pass for me.)

Best regards,
Alexander



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 15.10.24 20:10, Jacob Champion wrote:
> On Tue, Oct 15, 2024 at 11:00 AM Alexander Lakhin <exclusion@gmail.com> wrote:
>> I've discovered that starting from 0785d1b8b,
>> make check -C src/bin/pg_combinebackup
>> fails under Valgrind, with the following diagnostics:
> 
> Yep, sorry for that (and thanks for the report!). It's currently
> tracked over at [1], but I should have mentioned it here. The patch I
> used is attached, renamed to not stress out the cfbot.

I have committed this fix.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Antonin Houska
Date:
Antonin Houska <ah@cybertec.at> wrote:

> I'd like to play with the code a bit and provide some review before or during
> the next CF. That will probably generate some more questions.

This is the 1st round, based on reading the code. I'll continue paying
attention to the project and possibly post some more comments in the future.


* Information on the new method should be added to pg_hba.conf.sample.


* Is it important that fe_oauth_state.token also contains the "Bearer"
  keyword? I'd expect only the actual token value here. The keyword can be
  added to the authentication message w/o storing it.

  The same applies to the 'token' structure in fe-auth-oauth-curl.c.


* Does PQdefaultAuthDataHook() have to be declared extern and exported via
  libpq/exports.txt ? Even if the user was interested in it, he can use
  PQgetAuthDataHook() to get the pointer (unless he already installed his
  custom hook).


* I wonder if the hooks (PQauthDataHook) can be implemented in a separate
  diff. Couldn't the first version of the feature be commitable without these
  hooks?


* Instead of allocating an instance of PQoauthBearerRequest, assigning it to
  fe_oauth_state.async_ctx, and eventually having to call its cleanup()
  function, wouldn't it be simpler to embed PQoauthBearerRequest as a member
  in fe_oauth_state ?


* oauth_validator_library is defined as PGC_SIGHUP - is that intentional?

  And regardless, the library appears to be loaded by every backend during
  authentication. Why isn't it loaded by postmaster like libraries listed in
  shared_preload_libraries? fork() would then ensure that the backends do have
  the library in their address space.


* pg_fe_run_oauth_flow()

  When first time here
            case OAUTH_STEP_TOKEN_REQUEST:
                if (!handle_token_response(actx, &state->token))
                    goto error_return;

  the user hasn't been prompted yet so ISTM that the first token request must
  always fail. It seems more logical if the prompt is shown to the user before
  sending the token request to the server. (Although the user probably won't
  be that fast to make the first request succeed, so consider this just a
  hint.)


* As far as I understand, the following comment would make sense:

diff --git a/src/interfaces/libpq/fe-auth-oauth.c b/src/interfaces/libpq/fe-auth-oauth.c
index f943a31cc08..97259fb5654 100644
--- a/src/interfaces/libpq/fe-auth-oauth.c
+++ b/src/interfaces/libpq/fe-auth-oauth.c
@@ -518,6 +518,7 @@ oauth_exchange(void *opaq, bool final,
        switch (state->state)
        {
                case FE_OAUTH_INIT:
+                       /* Initial Client Response */
                        Assert(inputlen == -1);

                        if (!derive_discovery_uri(conn))

  Or, doesn't the FE_OAUTH_INIT branch of the switch statement actually fit
  better into oauth_init()? A side-effect of that might be (I only judge from
  reading the code, haven't tried to implement this suggestion) that
  oauth_exchange() would no longer return the SASL_ASYNC status. Furthermore,
  I'm not sure if pg_SASL_continue() can receive the SASL_ASYNC at all. So I
  wonder if moving that part from oauth_exchange() to oauth_init() would make
  the SASL_ASYNC state unnecessary.


* Finally, the user documentation is almost missing. I say that just for the
  sake of completeness, you obviously know it. (On the other hand, I think
  that the lack of user information might discourage some people from running
  the code and testing it.)

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Oct 17, 2024 at 10:51 PM Antonin Houska <ah@cybertec.at> wrote:
> This is the 1st round, based on reading the code. I'll continue paying
> attention to the project and possibly post some more comments in the future.

Thanks again for the reviews!

> * Information on the new method should be added to pg_hba.conf.sample.

Whoops, this will be fixed in v34.

> * Is it important that fe_oauth_state.token also contains the "Bearer"
>   keyword? I'd expect only the actual token value here. The keyword can be
>   added to the authentication message w/o storing it.
>
>   The same applies to the 'token' structure in fe-auth-oauth-curl.c.

Excellent question; I've waffled a bit on that myself. I think you're
probably right, but here's some background on why I originally made
that decision.

RFC 7628 defines not only OAUTHBEARER but also a generic template for
future OAuth-based SASL methods, and as part of that, the definition
of the "auth" key is incredibly vague:

      auth (REQUIRED):  The payload that would be in the HTTP
         Authorization header if this OAuth exchange was being carried
         out over HTTP.

I was worried that forcing a specific format would prevent future
extensibility if, say, the Bearer scheme were updated to add additional
auth-params. I was also wondering if maybe a future specification
would allow OAUTHBEARER to carry a different scheme altogether, such
as DPoP [1].

However:
- auth-param support for Bearer was considered at the draft stage and
explicitly removed, with the old drafts stating "If additional
parameters are needed in the future, a different scheme would need to
be defined."
- I think the intent of RFC 7628 is that a new SASL mechanism will be
named for each new scheme (even if the new scheme shares all of the
bones of the old one). So DPoP tokens wouldn't piggyback on
OAUTHBEARER, and instead something like an OAUTHDPOP mech would need
to be defined.

So: the additional complexity in the current API is probably a YAGNI
violation, and I should just hardcode the Bearer format as you
suggest. Any future OAuth SASL mechanisms we support will have to go
through a different PQAUTHDATA type, e.g. PQAUTHDATA_OAUTH_DPOP_TOKEN.
And I'll need to make sure that I'm not improperly coupling the
concepts elsewhere in the API.
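(Editor's sketch, for readers following along: the message in question is the
RFC 7628 initial client response, in which the "auth" value, including the
"Bearer" keyword, sits between 0x01 separators after a GS2 header. The token
value below is a placeholder.)

```python
KVSEP = "\x01"  # the RFC 7628 key/value-pair separator

def build_initial_client_response(token, authzid=""):
    # GS2 header ("n" = no channel binding, optional authzid), then the
    # "auth" key/value pair, terminated by a double separator, following
    # the format in RFC 7628 section 3.1.
    gs2 = "n,a=%s," % authzid if authzid else "n,,"
    return gs2 + KVSEP + "auth=Bearer " + token + KVSEP + KVSEP
```

This framing is why the question of whether the stored token should include
the "Bearer" keyword matters: the scheme name is part of the "auth" value on
the wire either way.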

> * Does PQdefaultAuthDataHook() have to be declared extern and exported via
>   libpq/exports.txt ? Even if the user was interested in it, he can use
>   PQgetAuthDataHook() to get the pointer (unless he already installed his
>   custom hook).

I guess I don't have a strongly held opinion, but is there a good
reason not to? Exposing it means that a client application may answer
questions like "is the current hook set to the default?" and so on.
IME, hook-chain maintenance is not a lot of fun in general, and having
more visibility can be nice for third-party developers.

> * I wonder if the hooks (PQauthDataHook) can be implemented in a separate
>   diff. Couldn't the first version of the feature be commitable without these
>   hooks?

I am more than happy to split things up as needed! But in the end, I
think this is a question that can only be answered by the first brave
committer to take a bite. :)

(The original patchset didn't have these hooks; they were added as a
compromise, to prevent the builtin implementation from having to be
all things for all people.)

> * Instead of allocating an instance of PQoauthBearerRequest, assigning it to
>   fe_oauth_state.async_ctx, and eventually having to call its cleanup()
>   function, wouldn't it be simpler to embed PQoauthBearerRequest as a member
>   in fe_oauth_state ?

Hmm, that would maybe be simpler. But you'd still have to call
cleanup() and set the async_ctx, right? The primary gain would be in
reducing the number of malloc calls.

> * oauth_validator_library is defined as PGC_SIGHUP - is that intentional?

Yes, I think it's going to be important to let DBAs migrate their
authentication modules without a full restart. That probably deserves
more explicit testing, now that you mention it. Is there a specific
concern that you have with that?

>   And regardless, the library appears to be loaded by every backend during
>   authentication. Why isn't it loaded by postmaster like libraries listed in
>   shared_preload_libraries? fork() would then ensure that the backends do have
>   the library in their address space.

It _can_ be, if you want -- there's nothing that I know of preventing
the validator from also being preloaded with its own _PG_init(), is
there? But I don't think it's a good idea to force that, for the same
reason we want to allow SIGHUP.

> * pg_fe_run_oauth_flow()
>
>   When first time here
>                         case OAUTH_STEP_TOKEN_REQUEST:
>                                 if (!handle_token_response(actx, &state->token))
>                                         goto error_return;
>
>   the user hasn't been prompted yet so ISTM that the first token request must
>   always fail. It seems more logical if the prompt is shown to the user before
>   sending the token request to the server. (Although the user probably won't
>   be that fast to make the first request succeed, so consider this just a
>   hint.)

That's also intentional -- if the first token response fails for a
reason _other_ than "we're waiting for the user", then we want to
immediately fail hard instead of making them dig out their phone and
go on a two-minute trip, because they're going to come back and find
that it was all for nothing.

There's a comment immediately below the part you quoted that mentions
this briefly; maybe I should move it up a bit?

> * As far as I understand, the following comment would make sense:
>
> diff --git a/src/interfaces/libpq/fe-auth-oauth.c b/src/interfaces/libpq/fe-auth-oauth.c
> index f943a31cc08..97259fb5654 100644
> --- a/src/interfaces/libpq/fe-auth-oauth.c
> +++ b/src/interfaces/libpq/fe-auth-oauth.c
> @@ -518,6 +518,7 @@ oauth_exchange(void *opaq, bool final,
>         switch (state->state)
>         {
>                 case FE_OAUTH_INIT:
> +                       /* Initial Client Response */
>                         Assert(inputlen == -1);
>
>                         if (!derive_discovery_uri(conn))

There are multiple "initial client response" cases, though. What
questions are you hoping to clarify with the comment? Maybe we can
find a more direct answer.

>   Or, doesn't the FE_OAUTH_INIT branch of the switch statement actually fit
>   better into oauth_init()?

oauth_init() is the mechanism initialization for the SASL framework
itself, which is shared with SCRAM. In the current architecture, the
init callback doesn't take the initial client response into
consideration at all.

Generating the client response is up to the exchange callback -- and
even if we moved the SASL_ASYNC processing elsewhere, I don't think we
can get rid of its added complexity. Something has to signal upwards
that it's time to transfer control to an async engine. And we can't
make the asynchronicity a static attribute of the mechanism itself,
because we can skip the flow if something gives us a cached token.

> * Finally, the user documentation is almost missing. I say that just for the
>   sake of completeness, you obviously know it. (On the other hand, I think
>   that the lack of user information might discourage some people from running
>   the code and testing it.)

Yeah, the catch-22 of writing huge features... By the way, if anyone's
reading along and dissuaded by the lack of docs, please say so!
(Daniel has been helping me out so much with the docs; thanks again,
Daniel.)

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc9449



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Oct 18, 2024 at 4:38 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> In validate() it seems to me we should clear out ret->authn_id on failure to
> pair belts with suspenders. Fixed by calling explicit_bzero on it in the error
> path.

The new hunk says:

> cleanup:
>     /*
>      * Clear and free the validation result from the validator module once
>      * we're done with it to avoid accidental re-use.
>      */
>     if (ret->authn_id != NULL)
>     {
>         explicit_bzero(ret->authn_id, strlen(ret->authn_id));
>         pfree(ret->authn_id);
>     }
>     pfree(ret);

But I'm not clear on what's being protected against. Which code would
reuse this result?

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Mon, Oct 28, 2024 at 6:24 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> > On 25 Oct 2024, at 20:22, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> > I have combed almost all of Daniel's feedback backwards into the main
> > patch (just the new bzero code remains, with the open question
> > upthread),
>
> Re-reading I can't see a vector there, I guess I am just scarred from what
> seemed to be harmless leaks in auth codepaths and treat every bit as
> potentially important.  Feel free to drop from the patchset for now.

Okay. For authn_id specifically, which isn't secret and doesn't have
any power unless it's somehow copied into the ClientConnectionInfo,
I'm not sure that the bzero() gives us much. But I do see value in
clearing out, say, the Bearer token once we're finished with it.

Also in this validate() code path, I'm taking a look at the added
memory management with the pfree():
1. Should we add any more ceremony to the returned struct, to try to
ensure that the ABI matches? Or is it good enough to declare that
modules need to be compiled against a specific server version?
2. Should we split off a separate memory context to contain
allocations made by the validator?

> Looking more at the patchset I think we need to apply conditional compilation
> of the backend for oauth like how we do with other opt-in schemes in configure
> and meson.  The attached .txt has a diff for making --with-oauth a requirement
> for compiling support into backend libpq.

Do we get the flexibility we need with that approach? With other
opt-in schemes, the backend and the frontend both need some sort of
third-party dependency, but that's not true for OAuth. I could see
some people wanting to support an offline token validator on the
server side but not wanting to build the HTTP dependency into their
clients.

I was considering going in the opposite direction: With the client
hooks, a user could plug in their own implementation without ever
having to touch the built-in flow, and I'm wondering if --with-oauth
should really just be --with-builtin-oauth or similar. Then if the
server sends OAUTHBEARER, the client only complains if it doesn't have
a flow available to use, rather than checking USE_OAUTH. This kind of
ties into the other big open question of "what do we do about users
that don't want the additional overhead of something they're not
using?"

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 28 Oct 2024, at 17:09, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
> On Mon, Oct 28, 2024 at 6:24 AM Daniel Gustafsson <daniel@yesql.se> wrote:

>> Looking more at the patchset I think we need to apply conditional compilation
>> of the backend for oauth like how we do with other opt-in schemes in configure
>> and meson.  The attached .txt has a diff for making --with-oauth a requirement
>> for compiling support into backend libpq.
>
> Do we get the flexibility we need with that approach? With other
> opt-in schemes, the backend and the frontend both need some sort of
> third-party dependency, but that's not true for OAuth. I could see
> some people wanting to support an offline token validator on the
> server side but not wanting to build the HTTP dependency into their
> clients.

Currently we don't support any conditional compilation which only affects
backend or frontend, all --without-XXX flags turn it off for both.  Maybe this
is something which should change but I'm not sure that property should be
altered as part of a patch rather than discussed on its own merit.

> I was considering going in the opposite direction: With the client
> hooks, a user could plug in their own implementation without ever
> having to touch the built-in flow, and I'm wondering if --with-oauth
> should really just be --with-builtin-oauth or similar. Then if the
> server sends OAUTHBEARER, the client only complains if it doesn't have
> a flow available to use, rather than checking USE_OAUTH. This kind of
> ties into the other big open question of "what do we do about users
> that don't want the additional overhead of something they're not
> using?"

We already know that GSS cause measurable performance impact on connections
even when compiled but not in use [0], so I think we should be careful about
piling on more.

--
Daniel Gustafsson

[0] 20240610181212.auytluwmbfl7lb5n@awork3.anarazel.de


Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Oct 29, 2024 at 3:52 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> Currently we don't support any conditional compilation which only affects
> backend or frontend, all --without-XXX flags turn it off for both.

I don't think that's strictly true; see --with-pam which affects only
server-side code, since the hard part is in the server. Similarly,
--with-oauth currently affects only client-side code.

But in any case, that confusion is why I'm proposing a change to the
option name. I chose --with-oauth way before the architecture
solidified, and it doesn't reflect reality anymore. OAuth support on
the server side doesn't require Curl, and likely never will. So if you
want to support that on a Windows server, it's going to be strange if
we also force you to build the client with a libcurl dependency that
we won't even make use of on that platform.

> We already know that GSS cause measurable performance impact on connections
> even when compiled but not in use [0], so I think we should be careful about
> piling on more.

I agree, but if the server asks for OAUTHBEARER, that's the end of it.
Either the client supports OAuth and initiates a token flow, or it
doesn't and the connection fails. That's very different from the
client-initiated transport negotiation.

On the other hand, if we're concerned about the link-time overhead
(time and/or RAM) of the new dependency, I think that's going to need
something different from a build-time switch. My guess is that
maintainers are only going to want to ship one libpq.

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 29 Oct 2024, at 17:40, Jacob Champion <jacob.champion@enterprisedb.com> wrote:
>
> On Tue, Oct 29, 2024 at 3:52 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>> Currently we don't support any conditional compilation which only affects
>> backend or frontend, all --without-XXX flags turn it off for both.
>
> I don't think that's strictly true; see --with-pam which affects only
> server-side code, since the hard part is in the server. Similarly,
> --with-oauth currently affects only client-side code.

Fair, maybe it's an unwarranted concern.  Question is though, if we added PAM
today would we have done the same?

> But in any case, that confusion is why I'm proposing a change to the
> option name.

+1

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Oct 29, 2024 at 10:41 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> Question is though, if we added PAM
> today would we have done the same?

I assume so; the client can't tell PAM apart from LDAP or any other
plaintext method. (In the same vein, the server can't tell if the
client uses libcurl to grab a token, or something entirely different.)

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Antonin Houska
Date:
Jacob Champion <jacob.champion@enterprisedb.com> wrote:

> On Thu, Oct 17, 2024 at 10:51 PM Antonin Houska <ah@cybertec.at> wrote:
> > * oauth_validator_library is defined as PGC_SIGHUP - is that intentional?
>
> Yes, I think it's going to be important to let DBAs migrate their
> authentication modules without a full restart. That probably deserves
> more explicit testing, now that you mention it. Is there a specific
> concern that you have with that?

No concern. I was just trying to imagine when the module needs to be changed.

> >   And regardless, the library appears to be loaded by every backend during
> >   authentication. Why isn't it loaded by postmaster like libraries listed in
> >   shared_preload_libraries? fork() would then ensure that the backends do have
> >   the library in their address space.
>
> It _can_ be, if you want -- there's nothing that I know of preventing
> the validator from also being preloaded with its own _PG_init(), is
> there? But I don't think it's a good idea to force that, for the same
> reason we want to allow SIGHUP.

Loading the library by postmaster does not prevent the backends from reloading
it on SIGHUP later. I was simply concerned about performance. (I proposed
loading the library at another stage of backend initialization rather than
adding _PG_init() to it.)

> > * pg_fe_run_oauth_flow()
> >
> >   When first time here
> >                         case OAUTH_STEP_TOKEN_REQUEST:
> >                                 if (!handle_token_response(actx, &state->token))
> >                                         goto error_return;
> >
> >   the user hasn't been prompted yet so ISTM that the first token request must
> >   always fail. It seems more logical if the prompt is set to the user before
> >   sending the token request to the server. (Although the user probably won't
> >   be that fast to make the first request succeed, so consider this just a
> >   hint.)
>
> That's also intentional -- if the first token response fails for a
> reason _other_ than "we're waiting for the user", then we want to
> immediately fail hard instead of making them dig out their phone and
> go on a two-minute trip, because they're going to come back and find
> that it was all for nothing.
>
> There's a comment immediately below the part you quoted that mentions
> this briefly; maybe I should move it up a bit?

That's fine, I understand now.

> > * As long as I understand, the following comment would make sense:
> >
> > diff --git a/src/interfaces/libpq/fe-auth-oauth.c b/src/interfaces/libpq/fe-auth-oauth.c
> > index f943a31cc08..97259fb5654 100644
> > --- a/src/interfaces/libpq/fe-auth-oauth.c
> > +++ b/src/interfaces/libpq/fe-auth-oauth.c
> > @@ -518,6 +518,7 @@ oauth_exchange(void *opaq, bool final,
> >         switch (state->state)
> >         {
> >                 case FE_OAUTH_INIT:
> > +                       /* Initial Client Response */
> >                         Assert(inputlen == -1);
> >
> >                         if (!derive_discovery_uri(conn))
>
> There are multiple "initial client response" cases, though. What
> questions are you hoping to clarify with the comment? Maybe we can
> find a more direct answer.

Easiness of reading is the only "question" here :-) It might not always be
obvious why a variable should have some particular value. In general, the
Assert() statements are almost always preceded by a comment in the PG
source.

> >   Or, doesn't the FE_OAUTH_INIT branch of the switch statement actually fit
> >   better into oauth_init()?
>
> oauth_init() is the mechanism initialization for the SASL framework
> itself, which is shared with SCRAM. In the current architecture, the
> init callback doesn't take the initial client response into
> consideration at all.

Sure. The FE_OAUTH_INIT branch in oauth_exchange() (FE) also does not generate
the initial client response.

Based on reading the SCRAM implementation, I concluded that the init()
callback can do authentication method specific things, but unlike exchange()
it does not generate any output.

> Generating the client response is up to the exchange callback -- and
> even if we moved the SASL_ASYNC processing elsewhere, I don't think we
> can get rid of its added complexity. Something has to signal upwards
> that it's time to transfer control to an async engine. And we can't
> make the asynchronicity a static attribute of the mechanism itself,
> because we can skip the flow if something gives us a cached token.

I didn't want to skip the flow. I thought that the init() callback could be
made responsible for getting the token, but forgot that it still needs some
way to signal to the caller that the async flow is needed.

Anyway, are you sure that pg_SASL_continue() can also receive the SASL_ASYNC
value from oauth_exchange()? My understanding is that pg_SASL_init() receives
it if there is no token, but after that, oauth_exchange() is not called until
the token is available, and thus it should not return SASL_ASYNC anymore.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Thu, Oct 31, 2024 at 4:05 AM Antonin Houska <ah@cybertec.at> wrote:
> > >   And regardless, the library appears to be loaded by every backend during
> > >   authentication. Why isn't it loaded by postmaster like libraries listed in
> > >   shared_preload_libraries? fork() would then ensure that the backends do have
> > >   the library in their address space.
> >
> > It _can_ be, if you want -- there's nothing that I know of preventing
> > the validator from also being preloaded with its own _PG_init(), is
> > there? But I don't think it's a good idea to force that, for the same
> > reason we want to allow SIGHUP.
>
> Loading the library by postmaster does not prevent the backends from reloading
> it on SIGHUP later. I was simply concerned about performance. (I proposed
> loading the library at another stage of backend initialization rather than
> adding _PG_init() to it.)

Okay. I think this is going to be one of the slower authentication
methods by necessity: the builtin flow in libpq requires a human in
the loop, and an online validator is going to be making several HTTP
calls from the backend. So if it turns out later that we need to
optimize the backend logic, I'd prefer to have a case study in hand;
otherwise I think we're likely to optimize the wrong things.

> Easiness of reading is the only "question" here :-) It might not always be
> obvious why a variable should have some particular value. In general, the
> Assert() statements are almost always preceded with a comment in the PG
> source.

Oh, an assertion label! I can absolutely add one; I originally thought
you were proposing a label for the case itself.

> > >   Or, doesn't the FE_OAUTH_INIT branch of the switch statement actually fit
> > >   better into oauth_init()?
> >
> > oauth_init() is the mechanism initialization for the SASL framework
> > itself, which is shared with SCRAM. In the current architecture, the
> > init callback doesn't take the initial client response into
> > consideration at all.
>
> Sure. The FE_OAUTH_INIT branch in oauth_exchange() (FE) also does not generate
> the initial client response.

It might, if it ends up falling through to FE_OAUTH_REQUESTING_TOKEN.
There are two paths that can do that: the case where we have no
discovery URI, and the case where a custom user flow returns a token
synchronously (it was probably cached).

> Anyway, are you sure that pg_SASL_continue() can also receive the SASL_ASYNC
> value from oauth_exchange()? My understanding is that pg_SASL_init() receives
> it if there is no token, but after that, oauth_exchange() is not called until
> the token is available, and thus it should not return SASL_ASYNC anymore.

Correct -- the only way for the current implementation of the
OAUTHBEARER mechanism to return SASL_ASYNC is during the very first
call. That's not an assumption I want to put into the higher levels,
though; I think Michael will be unhappy with me if I introduce
additional SASL coupling after the decoupling work that's been done
over the last few releases. :D
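To make that first-call-only property concrete, here's an abstracted sketch; the type and status names below are simplified stand-ins, not the real libpq definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the SASL status codes and state. */
typedef enum { SASL_CONTINUE, SASL_ASYNC, SASL_FAILED } SaslStatus;
typedef enum { FE_OAUTH_INIT, FE_OAUTH_REQUESTING_TOKEN } OauthState;

typedef struct
{
    OauthState  state;
    const char *token;          /* cached token, if any */
} fe_oauth_state;

/*
 * Only the very first call may hand control to the async flow; after that, a
 * token must be available and the exchange proceeds synchronously.
 */
static SaslStatus
oauth_exchange(fe_oauth_state *st)
{
    switch (st->state)
    {
        case FE_OAUTH_INIT:
            st->state = FE_OAUTH_REQUESTING_TOKEN;
            if (st->token == NULL)
                return SASL_ASYNC;  /* run the flow to obtain a token */
            return SASL_CONTINUE;   /* cached token: skip the flow */

        case FE_OAUTH_REQUESTING_TOKEN:
            return (st->token != NULL) ? SASL_CONTINUE : SASL_FAILED;
    }
    return SASL_FAILED;
}
```

The higher levels stay agnostic: they react to SASL_ASYNC whenever it appears, without assuming it can only occur on the first exchange.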

Thanks again,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
jian he
Date:
Hi there.
zero knowledge of OAuth, just reading through the v35-0001.
forgive me if my comments are naive.

+static int
+parse_interval(struct async_ctx *actx, const char *interval_str)
+{
+    double      parsed;
+    int         cnt;
+
+    /*
+     * The JSON lexer has already validated the number, which is stricter than
+     * the %f format, so we should be good to use sscanf().
+     */
+    cnt = sscanf(interval_str, "%lf", &parsed);
+
+    if (cnt != 1)
+    {
+        /*
+         * Either the lexer screwed up or our assumption above isn't true, and
+         * either way a developer needs to take a look.
+         */
+        Assert(cnt == 1);
+        return 1;               /* don't fall through in release builds */
+    }
+
+    parsed = ceil(parsed);
+
+    if (parsed < 1)
+        return actx->debugging ? 0 : 1;
+
+    else if (INT_MAX <= parsed)
+        return INT_MAX;
+
+    return parsed;
+}
The above Assert looks very wrong to me.

we can also use PG_INT32_MAX instead of INT_MAX
(generally I think PG_INT32_MAX looks more intuitive)


+/*
+ * The Device Authorization response, described by RFC 8628:
+ *
+ *     https://www.rfc-editor.org/rfc/rfc8628#section-3.2
+ */
+struct device_authz
+{
+ char   *device_code;
+ char   *user_code;
+ char   *verification_uri;
+ char   *interval_str;
+
+ /* Fields below are parsed from the corresponding string above. */
+ int interval;
+};

click through the link https://www.rfc-editor.org/rfc/rfc8628#section-3.2
it says
"
   expires_in
      REQUIRED.  The lifetime in seconds of the "device_code" and
      "user_code".
   interval
      OPTIONAL.  The minimum amount of time in seconds that the client
      SHOULD wait between polling requests to the token endpoint.  If no
      value is provided, clients MUST use 5 as the default.
"
these two fields seem to differ from struct device_authz.



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Daniel Gustafsson
Date:
> On 4 Nov 2024, at 06:00, jian he <jian.universality@gmail.com> wrote:

> +    if (cnt != 1)
> +    {
> +        /*
> +         * Either the lexer screwed up or our assumption above isn't true, and
> +         * either way a developer needs to take a look.
> +         */
> +        Assert(cnt == 1);
> +        return 1;               /* don't fall through in release builds */
> +    }

> The above Assert looks very wrong to me.

I think the point is to fail hard in development builds to ensure whatever
caused the disconnect between the json lexer and sscanf parsing is looked at.
It should probably be changed to Assert(false); which is the common pattern for
erroring out like this.

--
Daniel Gustafsson




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Sun, Nov 3, 2024 at 9:00 PM jian he <jian.universality@gmail.com> wrote:
> The above Assert looks very wrong to me.

I can switch to Assert(false) if that's preferred, but it makes part
of the libc assert() report useless. (I wish we had more fluent ways
to say "this shouldn't happen, but if it does, we still need to get
out safely.")

> we can also use PG_INT32_MAX instead of INT_MAX
> (generally I think PG_INT32_MAX looks more intuitive)

That's a fixed-width max; we want the maximum for the `int` type here.
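For anyone following along, here's a standalone sketch of the clamping behavior under discussion, with the debug-mode branch dropped and ceil() replaced by integer arithmetic so it compiles without libm:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/*
 * Simplified from the patch: parse a JSON-validated numeric string and clamp
 * it to a usable polling interval in whole seconds.
 */
static int
parse_interval_sketch(const char *interval_str)
{
    double      parsed;
    int         secs;

    if (sscanf(interval_str, "%lf", &parsed) != 1)
        return 1;               /* shouldn't happen; fail safe */

    if (parsed >= (double) INT_MAX)
        return INT_MAX;         /* clamp to the maximum of the int type */
    if (parsed < 1)
        return 1;               /* wait at least one second */

    secs = (int) parsed;        /* truncates toward zero */
    if ((double) secs < parsed)
        secs++;                 /* round fractional seconds up */
    return secs;
}
```

The upper clamp is the `int` type maximum rather than a fixed-width constant, matching the declared return type.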

>    expires_in
>       REQUIRED.  The lifetime in seconds of the "device_code" and
>       "user_code".
>    interval
>       OPTIONAL.  The minimum amount of time in seconds that the client
>       SHOULD wait between polling requests to the token endpoint.  If no
>       value is provided, clients MUST use 5 as the default.
> "
> these two fields seem to differ from struct device_authz.

Yeah, Daniel and I had talked about being stricter about REQUIRED
fields that are not currently used. There's a comment making note of
this in parse_device_authz(). The v1 code will need to make expires_in
REQUIRED, so that future developers can develop features that depend
on it without worrying about breaking
currently-working-but-noncompliant deployments. (And if there are any
noncompliant deployments out there now, we need to know about them so
we can have that explicit discussion.)
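A minimal sketch of that stricter handling, treating a missing string as "field absent" (the struct and function names here are hypothetical):

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical post-parse step for the two RFC 8628 fields: expires_in is
 * REQUIRED, while interval is OPTIONAL with a mandated default of 5 seconds.
 */
struct device_authz_sketch
{
    int         expires_in;
    int         interval;
};

/* Returns 0 (failure) if a REQUIRED field is missing from the response. */
static int
finish_device_authz(struct device_authz_sketch *authz,
                    const char *expires_in_str,
                    const char *interval_str)
{
    if (expires_in_str == NULL)
        return 0;               /* REQUIRED: reject noncompliant servers */
    authz->expires_in = atoi(expires_in_str);

    /* "If no value is provided, clients MUST use 5 as the default." */
    authz->interval = interval_str ? atoi(interval_str) : 5;
    return 1;
}
```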

Thanks,
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Nov 5, 2024 at 3:33 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> Done in v36, attached.

Forgot to draw attention to this part:

>     +# XXX libcurl must link after libgssapi_krb5 on FreeBSD to avoid segfaults
>     +# during gss_acquire_cred(). This is possibly related to Curl's Heimdal
>     +# dependency on that platform?

Best I can tell, libpq for FreeBSD has a dependency diamond for GSS
symbols: libpq links against MIT krb5, libcurl links against Heimdal,
libpq links against libcurl. Link order becomes critical to avoid
nasty segfaults, but I have not dug deeply into the root cause.

--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 06.11.24 00:33, Jacob Champion wrote:
> Done in v36, attached.

Assorted review comments from me:

Everything in the commit message between

     = Debug Mode =

and

     Several TODOs:

should be moved to the documentation.  In some cases, it already is,
but it doesn't always have the same level of detail.

(You could point from the commit message to .sgml files if you want to
highlight usage instructions, but I don't think this is generally
necessary.)

* config/programs.m4

Can we do the libcurl detection using pkg-config only?  Seems simpler,
and maintains consistency with meson.

* doc/src/sgml/client-auth.sgml

In the list of terms (this could be a <variablelist>), state how these
terms map to a PostgreSQL installation.  You already explain what the
client and the resource server are, but not who the resource owner is
and what the authorization server is.  It would also be good to be
explicit and upfront that the authorization server is a third-party
component that needs to be obtained separately.

trust_validator_authz: Personally, I'm not a fan of the "authz" and
"authn" abbreviations.  I know this is security jargon.  But are
regular users going to understand this?  Can we just spell it out?

* doc/src/sgml/config.sgml

Also here maybe state that these OAuth libraries have to be obtained
separately.

* doc/src/sgml/installation.sgml

I find the way the installation options are structured a bit odd.  I
would have expected --with-libcurl and -Dlibcurl (or --with-curl and
-Dcurl).  These build options usually just say, use this library.  We
don't spell out what, for example, libldap is used for, we just use it
and enable all the features that require it.

* doc/src/sgml/libpq.sgml

Maybe oauth_issuer should be oauth_issuer_url?  Otherwise one might
expect to just write "google" here or something.  Or there might be
other ways to contact an issuer in the future?  Just a thought.

* doc/src/sgml/oauth-validators.sgml

This chapter says "libpq" several times, but I think this is a server
side plugin, so libpq does not participate.  Check please.

* src/backend/libpq/auth-oauth.c

I'm confused by the use of PG_MAX_AUTH_TOKEN_LENGTH in the
pg_be_oauth_mech definition.  What does that mean?

+#define KVSEP 0x01
+#define AUTH_KEY "auth"
+#define BEARER_SCHEME "Bearer "

Add comments to these.

Also, add comments to all functions defined here that don't have one
yet.

* src/backend/utils/misc/guc_tables.c

Why is oauth_validator_library GUC_NOT_IN_SAMPLE?

Also, shouldn't this be an hba option instead?  What if you want to
use different validators for different connections?

* src/interfaces/libpq/fe-auth-oauth-curl.c

The CURL_IGNORE_DEPRECATION thing needs clarification.  Is that in
progress?

+#define MAX_OAUTH_RESPONSE_SIZE (1024 * 1024)

Add a comment about why this value.

+   union
+   {
+       char      **scalar;     /* for all scalar types */
+       struct curl_slist **array;  /* for type == JSON_TOKEN_ARRAY_START */
+   };

This is an anonymous union, which requires C11.  Strangely, I cannot
get clang to warn about this with -Wc11-extensions.  Probably better
to fix anyway.  (The oldest supported MSVC versions don't support
C11 yet.)
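For what it's worth, one C99-compatible fix is simply to name the union member; a sketch with the fields from the quoted snippet (the enclosing struct name here is made up):

```c
#include <assert.h>
#include <stddef.h>

struct curl_slist;                      /* opaque, as in libcurl */

/*
 * Naming the union ("target" here) avoids the C11 anonymous-union feature;
 * call sites change from fld->scalar to fld->target.scalar.
 */
struct json_field_sketch
{
    union
    {
        char      **scalar;             /* for all scalar types */
        struct curl_slist **array;      /* for JSON arrays */
    }           target;
};
```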

* src/interfaces/libpq/fe-auth.h

+extern const pg_fe_sasl_mech pg_oauth_mech;

Should this rather be in fe-auth-oauth.h?

* src/interfaces/libpq/libpq-fe.h

The naming scheme of types and functions in this file is clearly
obscure and has grown randomly over time.  But at least my intuition
is that the preferred way is

types start with PG
functions start with PQ

and the next letter is usually lower case. (PQconnectdb, PQhost,
PGconn, PGresult)

Maybe check your additions against that.

* src/interfaces/libpq/pqexpbuffer.c
* src/interfaces/libpq/pqexpbuffer.h

Let's try to do this without opening up additional APIs here.

This is only used once, in append_urlencoded(), and there are other
ways to communicate errors, for example returning a bool.

* src/test/modules/oauth_validator/

Everything in this directory needs more comments, at least on a file
level.

Add a README in this directory.  Also update the README in the upper
directory.

* src/test/modules/oauth_validator/t/001_server.pl

On Cirrus CI Windows task, this test reports SKIP.  Can't tell why,
because the log is not kept.  I suppose you expect this to work on
Windows (but see my comment below), so it would be good to get this
test running.

* src/test/modules/oauth_validator/t/002_client.pl

+my $issuer = "https://127.0.0.1:54321";

Use PostgreSQL::Test::Cluster::get_free_port() instead of hardcoding
port numbers.

Or is this a real port?  I don't see it used anywhere else.

+   diag "running '" . join("' '", @cmd) . "'";

This should be "note" instead.  Otherwise it garbles the output.

* src/test/perl/PostgreSQL/Test/OAuthServer.pm

Add some comments to this file, what it's for.

Is this meant to work on Windows?  Just thinking, things like

kill(15, $self->{'pid'});

pgperlcritic complains:

src/test/perl/PostgreSQL/Test/OAuthServer.pm: Return value of flagged 
function ignored - read at line 39, column 2.

* src/tools/pgindent/typedefs.list

We don't need to typedef every locally used enum or similar into a
full typedef.  I suggest the following might be unnecessary:

AsyncAuthFunc
OAuthStep
fe_oauth_state_enum




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Fri, Nov 8, 2024 at 1:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> Assorted review comments from me:

Thank you! I will cherry-pick some responses here and plan to address
the rest in a future patchset.

> trust_validator_authz: Personally, I'm not a fan of the "authz" and
> "authn" abbreviations.  I know this is security jargon.  But are
> regular users going to understand this?  Can we just spell it out?

Yes. That name's a holdover from the very first draft, actually.

Is "trust_validator_authorization" a great name in the first place?
The key concept is that user mapping is being delegated to the OAuth
system itself, so you'd better make sure that the validator has been
built to do that. (Anyone have any suggestions?)

> I find the way the installation options are structured a bit odd.  I
> would have expected --with-libcurl and -Dlibcurl (or --with-curl and
> -Dcurl).  These build options usually just say, use this library.

It's patterned directly off of -Dssl/--with-ssl (which I liberally
borrowed from) because the builtin client implementation used to have
multiple options for the library in use. I can change it if needed,
but I thought it'd be helpful for future devs if I didn't undo the
generalization.

> Maybe oauth_issuer should be oauth_issuer_url?  Otherwise one might
> expect to just write "google" here or something.  Or there might be
> other ways to contact an issuer in the future?  Just a thought.

More specifically this is an "issuer identifier", as defined by the
OAuth/OpenID discovery specs. It's a subset of a URL, and I want to
make sure users know how to differentiate between an "issuer" they
trust and the "discovery URI" that's in use for that issuer. They may
want to set one or the other -- a discovery URI is associated with
exactly one issuer, but unfortunately an issuer may have multiple
discovery URIs, which I'm actively working on. (There is also some
relation to the multiple-issuers problem mentioned below.)

> I'm confused by the use of PG_MAX_AUTH_TOKEN_LENGTH in the
> pg_be_oauth_mech definition.  What does that mean?

Just that Bearer tokens can be pretty long, so we don't want to limit
them to 1k like SCRAM does. 64k is probably overkill, but I've seen
anecdotal reports of tens of KBs and it seemed reasonable to match
what we're doing for GSS tokens.

> Also, shouldn't [oauth_validator_library] be an hba option instead?  What if you want to
> use different validators for different connections?

Yes. This is again the multiple-issuers problem; I will split that off
into its own email since this one's getting long. It has security
implications.

> The CURL_IGNORE_DEPRECATION thing needs clarification.  Is that in
> progress?

Thanks for the nudge, I've started a thread:

    https://curl.se/mail/lib-2024-11/0028.html

> This is an anonymous union, which requires C11.  Strangely, I cannot
> get clang to warn about this with -Wc11-extensions.  Probably better
> to fix anyway.  (The trailing supported MSVC versions don't support
> C11 yet.)

Oh, that's not going to be fun.

> This is only used once, in append_urlencoded(), and there are other
> ways to communicate errors, for example returning a bool.

I'd rather not introduce two parallel error indicators for the caller
to have to check for that particular part. But I can change over to
using the (identical!) termPQExpBuffer. I felt like the other API
signaled the intent a little better, though.
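As an aside for readers, the bool-returning shape Peter suggests might look roughly like this as a standalone sketch (a fixed output buffer stands in for PQExpBuffer, which is a simplification):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: percent-encode src into dst, signaling failure via
 * the return value rather than a "broken buffer" flag.
 */
static int
append_urlencoded_sketch(char *dst, size_t dstlen, const char *src)
{
    size_t      out = 0;

    for (; *src; src++)
    {
        unsigned char c = (unsigned char) *src;
        int         unreserved =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';

        if (unreserved)
        {
            if (out + 1 >= dstlen)
                return 0;       /* out of space */
            dst[out++] = (char) c;
        }
        else
        {
            if (out + 3 >= dstlen)
                return 0;       /* out of space */
            snprintf(dst + out, 4, "%%%02X", c);
            out += 3;
        }
    }
    dst[out] = '\0';
    return 1;
}
```

The caller checks one boolean instead of consulting a separate buffer-state indicator after the fact.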

> On Cirrus CI Windows task, this test reports SKIP.  Can't tell why,
> because the log is not kept.  I suppose you expect this to work on
> Windows (but see my comment below)

No, builtin client support does not exist on Windows. If/when it's
added, the 001_server tests will need to be ported.

> +my $issuer = "https://127.0.0.1:54321";
>
> Use PostgreSQL::Test::Cluster::get_free_port() instead of hardcoding
> port numbers.
>
> Or is this a real port?  I don't see it used anywhere else.

It's not real; 002_client.pl doesn't start an authorization server at
all. I can make that more explicit.

> src/test/perl/PostgreSQL/Test/OAuthServer.pm: Return value of flagged
> function ignored - read at line 39, column 2.

So perlcritic recognizes "or" but not the "//" operator... Lovely.

Thanks!
--Jacob



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Nov 12, 2024 at 1:47 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:
> On Fri, Nov 8, 2024 at 1:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> > Also, shouldn't [oauth_validator_library] be an hba option instead?  What if you want to
> > use different validators for different connections?
>
> Yes. This is again the multiple-issuers problem; I will split that off
> into its own email since this one's getting long. It has security
> implications.

Okay, so, how to use multiple issuers/providers. Here's my current
plan, with justification below:

1. libpq connection strings must specify exactly one issuer
2. the discovery document coming from the server must belong to that
libpq issuer
3. the HBA should allow a choice of discovery document and validator

= Current Bug =

First, I should point out a critical mistake I've made on the client
side: I treat oauth_issuer and oauth_client_id as if they can be
arbitrarily mixed and matched. Some of the providers I've been testing
do allow you to use one registered client across multiple issuers, but
that's the exception rather than the norm. Even if you have multiple
issuers available, you still expect your registered client to be
talking to only the provider you registered it with.

And you don't want the Postgres server to switch providers for you.
Imagine that you've registered a client application for use with a big
provider, and that provider has given you a client secret. You expect
to share that secret only with them, but with the current setup, if a
DBA wants to steal that secret from you, all they have to do is stand
up a provider of their own, and libpq will send the secret straight to
it instead. Great.

There's actually a worse scenario that's pointed out in the spec for
the Device Authorization flow [1]:

    Note that if an authorization server used with this flow is
    malicious, then it could perform a man-in-the-middle attack on the
    backchannel flow to another authorization server. [...] For this to
    be possible, the device manufacturer must either be the attacker and
    shipping a device intended to perform the man-in-the-middle attack,
    or be using an authorization server that is controlled by an
    attacker, possibly because the attacker compromised the
    authorization server used by the device.

Back when I implemented this, that paragraph seemed pointlessly
obvious: of course you must trust your authorization server. What I
missed was, the Postgres server MUST NOT be able to control the entry
point into the device flow, because that means a malicious DBA can
trivially start a device prompt with a different provider, forward you
all the details through the endpoint they control, and hope you're too
fatigued to notice the difference before clicking through. (This is
easier if that provider is one of the big ones that you're already
used to trusting.) Then they have a token with which to attack you on
a completely different platform.

So in my opinion, my patchset must be changed to require a trusted
issuer in the libpq connection string. The server can tell you which
discovery document to get from that issuer, and it can tell you which
scopes are required (as long as the user hasn't hardcoded those too),
but it shouldn't be able to force the client to talk to an arbitrary
provider or swap out issuers.

= Multiple Issuers =

Okay, with that out of the way, let's talk about multiple issuer support.

First, server-side. If a server wants different groups of
users/databases/etc. to go through different issuers, then it stands
to reason that a validator should be selectable in the HBA settings,
since a validator for Provider A may not have any clue how to validate
Provider B. I don't like the idea of pg_hba being used to load
arbitrary libraries, though; I think the superuser should have to
designate a pool of "blessed" validator libraries to load through a
GUC. As a UX improvement for the common case, maybe we don't require
the HBA to have an explicit validator parameter if the conf contains
exactly one blessed library.
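As a sketch of that UX (every GUC, option, and library name below is
invented for illustration, not proposed syntax):

```
# postgresql.conf: the superuser designates the blessed validator pool
oauth_validator_libraries = 'validator_a, validator_b'

# pg_hba.conf: each line selects one validator from the pool; if the
# pool held exactly one library, the validator option could be omitted
host  db_a  all  samenet  oauth  validator=validator_a
host  db_b  all  samenet  oauth  validator=validator_b
```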

In case someone does want to develop a multi-issuer validator (say, to
deal with the providers that have multiple issuers underneath their
umbrella), we need to make sure the configured issuer is available to
the validator, so that it isn't susceptible to a mix-up attack of its
own.

As for the client side, I think v1 should allow only one expected
issuer per connection. There are OAuth features [2] that help clients
handle multiple issuers more safely, but as far as I can tell they are
not widely
deployed yet, and I don't know if any of them apply to the device
flow. (With the device flow, if the client allows multiple providers,
those providers can attack each other as described above.)

If a more complicated client application associates a single end user
with multiple Postgres connections, and each connection needs its own
issuer, then that application needs to be encouraged to use a flow
which has been hardened for that use case. (Setting aside the security
problems with mix-ups, the device flow won't be particularly pleasant
for that anyway. "Here's a bunch of URLs and codes, go to all of them
before they time out, good luck!")

= Discovery Documents =

There are two flavors of discovery document, OAuth and OpenID. And
OIDC Discovery and RFC 8414 disagree on the rules, so for the issuer
"https://example.com/abcd", you have two discovery document locations
using postfix or infix styles for the path:

- OpenID: https://example.com/abcd/.well-known/openid-configuration
- OAuth:  https://example.com/.well-known/oauth-authorization-server/abcd

Some providers publish different information at each [3], so the
difference may be important for some deployments. RFC 8414 claims the
OpenID flavor should transition to the infix style at some point (a
transition that is not happening as far as I can see), so now there
are three standards. And Okta uses the construction
"https://example.com/abcd/.well-known/oauth-authorization-server",
which you may notice matches _neither_ of the two options above, so
now there are four standards.

To deal with all of this, I plan to draw a clearer distinction between
the issuer and the discovery URL in the code, as well as allow DBAs
and clients to specify the discovery URL explicitly to override the
default OpenID flavor. For now I plan to support only
"openid-configuration" and "oauth-authorization-server", in both
postfix and infix notation (four options total, as seen in the wild).
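To make the combinatorics concrete, here's a quick sketch (just
illustrative Python, not code from the patchset) of the four candidate
locations for a given issuer:

```python
from urllib.parse import urlsplit, urlunsplit

def discovery_urls(issuer: str) -> list[str]:
    """Return the four discovery-document locations seen in the wild
    for an issuer: OpenID and OAuth flavors, postfix and infix each."""
    scheme, netloc, path, _, _ = urlsplit(issuer)
    path = path.rstrip("/")
    urls = []
    for suffix in ("openid-configuration", "oauth-authorization-server"):
        # postfix: well-known segment appended after the issuer's path
        urls.append(urlunsplit((scheme, netloc,
                                f"{path}/.well-known/{suffix}", "", "")))
        # infix: well-known segment inserted between host and path
        urls.append(urlunsplit((scheme, netloc,
                                f"/.well-known/{suffix}{path}", "", "")))
    return urls

for url in discovery_urls("https://example.com/abcd"):
    print(url)
```

For "https://example.com/abcd" that yields all four spellings above,
including Okta's postfix oauth-authorization-server construction.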

How's all that sound?

--Jacob

[1] https://datatracker.ietf.org/doc/html/rfc8628#section-5.3
[2] https://datatracker.ietf.org/doc/html/rfc9207
[3] https://devforum.okta.com/t/is-userinfo-endpoint-available-in-oauth-authorization-server/24284



Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Peter Eisentraut
Date:
On 12.11.24 22:47, Jacob Champion wrote:
> On Fri, Nov 8, 2024 at 1:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> I find the way the installation options are structured a bit odd.  I
>> would have expected --with-libcurl and -Dlibcurl (or --with-curl and
>> -Dcurl).  These build options usually just say, use this library.
> 
> It's patterned directly off of -Dssl/--with-ssl (which I liberally
> borrowed from) because the builtin client implementation used to have
> multiple options for the library in use. I can change it if needed,
> but I thought it'd be helpful for future devs if I didn't undo the
> generalization.

Personally, I'm not even a fan of the -Dssl/--with-ssl system.  I'm more 
attached to --with-openssl.  But if you want to stick with that, a more 
suitable naming would be something like, say, --with-httplib=curl, which 
means, use curl for all your http needs.  Because if we later add other 
functionality that can use some http, I don't think we want to enable or 
disable them all individually, or even mix different http libraries for 
different features.  In practice, curl is a widely available and 
respected library, so I'd expect packagers to just turn it all on 
without much further consideration.


>> I'm confused by the use of PG_MAX_AUTH_TOKEN_LENGTH in the
>> pg_be_oauth_mech definition.  What does that mean?
> 
> Just that Bearer tokens can be pretty long, so we don't want to limit
> them to 1k like SCRAM does. 64k is probably overkill, but I've seen
> anecdotal reports of tens of KBs and it seemed reasonable to match
> what we're doing for GSS tokens.

Ah, ok, I totally misread that code.  Could you maybe write this definition

+/* Mechanism declaration */
+const pg_be_sasl_mech pg_be_oauth_mech = {
+   oauth_get_mechanisms,
+   oauth_init,
+   oauth_exchange,
+
+   PG_MAX_AUTH_TOKEN_LENGTH,
+};

with designated initializers:

const pg_be_sasl_mech pg_be_oauth_mech = {
     .get_mechanisms = oauth_get_mechanisms,
     .init = oauth_init,
     .exchange = oauth_exchange,
     .max_message_length = PG_MAX_AUTH_TOKEN_LENGTH,
};


>> The CURL_IGNORE_DEPRECATION thing needs clarification.  Is that in
>> progress?
> 
> Thanks for the nudge, I've started a thread:
> 
>      https://curl.se/mail/lib-2024-11/0028.html

It looks like this has been clarified, so let's put that URL into a code 
comment.


>> This is only used once, in append_urlencoded(), and there are other
>> ways to communicate errors, for example returning a bool.
> 
> I'd rather not introduce two parallel error indicators for the caller
> to have to check for that particular part. But I can change over to
> using the (identical!) termPQExpBuffer. I felt like the other API
> signaled the intent a little better, though.

I think it's better to not drill a new hole into an established API for 
such a limited use.  So termPQExpBuffer() seems better for now.  If it 
later turns out that many callers are using termPQExpBuffer() for 
makeshift error handling, then that can be considered independently.


>> On Cirrus CI Windows task, this test reports SKIP.  Can't tell why,
>> because the log is not kept.  I suppose you expect this to work on
>> Windows (but see my comment below)
> 
> No, builtin client support does not exist on Windows. If/when it's
> added, the 001_server tests will need to be ported.

Could you put some kind of explicit conditional or a comment in there? 
Right now, it's not possible to tell that Windows is not supported.




Re: [PoC] Federated Authn/z with OAUTHBEARER

From
Jacob Champion
Date:
On Tue, Nov 19, 2024 at 3:05 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> Personally, I'm not even a fan of the -Dssl/--with-ssl system.  I'm more
> attached to --with-openssl.  But if you want to stick with that, a more
> suitable naming would be something like, say, --with-httplib=curl, which
> means, use curl for all your http needs.  Because if we later add other
> functionality that can use some http, I don't think we want to enable or
> disable them all individually, or even mix different http libraries for
> different features.  In practice, curl is a widely available and
> respected library, so I'd expect packagers to just turn it all on
> without much further consideration.

Okay, I can see that. I'll work on replacing --with-builtin-oauth. Any
votes from the gallery on --with-httplib vs. --with-libcurl?

The other suggestions look good and I've added them to my personal
TODO list. Thanks again for all the feedback!

--Jacob