Thread: [PoC] Delegating pg_ident to a third party

[PoC] Delegating pg_ident to a third party

From: Jacob Champion
Date: 16 December 2021, 23:48:57

Hi all,

In keeping with my theme of expanding the authentication/authorization
options for the server, attached is an experimental patchset that lets
Postgres determine an authenticated user's allowed roles by querying an
LDAP server, and enables SASL binding for those queries.

This lets you delegate pieces of pg_ident.conf to a central server, so
that you don't have to run any synchronization scripts (or deal with
associated staleness problems, repeated load on the LDAP deployment,
etc.). And it lets you make those queries with a client certificate
instead of a bind password, or at the very least protect your bind
password with some SCRAM crypto. You don't have to use the LDAP auth
method for this to work; you can combine it with Kerberos or certs or
any auth method that already supports pg_ident.

The target users, in my mind, are admins who are already using an auth
method with user maps, but have many deployments and want easier
control over granting and revoking database access from one location.
This won't help you so much if you need to have exactly one role per
user -- there's no logic to automatically create roles, so it can't
fully replace the existing synchronization scripts that are out there.
But if all you need is "X, Y, and Z are allowed to log in as guest, and
A and B may connect as admins", then this is meant to simplify your
life.

This is a smaller step than my previous proof-of-concept, which handled
fully federated authentication and authorization via an OAuth provider
[1], and it should be a nice companion to my patch that adds user
mappings to the LDAP auth method [2], though I haven't tried them
together yet. (I've also been thinking about pulling group membership
information out of Kerberos authorization data, for those of you using
Active Directory. Things for later.)

= How-To =

If you want to try it out -- on a non-production system please -- take
a look at the test suite in src/test/ldap, which has been filled out
with some example usage. The core features are the "ldapmap" HBA option
(which you would use instead of "map" in your existing HBA) and the
"ldapsaslmechs" HBA option, which you can set to a list of SASL
mechanisms that you will accept. (The list of supported mechanisms is
determined by both systems' LDAP and SASL libraries, not by Postgres.)

The tricky part is writing the pg_ident line correctly, because it's
currently not a very good user experience. The query is in the form of
an LDAP URL. It needs to return exactly one entry for the user being
authorized; the attribute values contained in that entry will be
interpreted as the list of roles that the user is allowed to connect
as. Regex matching and substitution are supported as they are for
regular maps. Here's a sample:

pg_ident.conf:

myldapmap /^(.*)$ ldap://example.com/dc=example,dc=com?postgresRole?sub?(uid=\1)

pg_hba.conf:

hostssl all all all cert ldapmap=myldapmap ldaptls=1 ldapsaslmechs=scram-sha-1 ldapbinddn=admin ldapbindpasswd=secret

This particular setup can be described as follows:

- Clients must use client certificates to authenticate to Postgres.
- Once the certificate is verified, Postgres will connect to the LDAP
server at example.com, issue StartTLS, and begin a SCRAM-SHA-1 exchange
using the bind username and password (admin/secret).
- Once that completes, Postgres will issue a query for the LDAP user
that has a uid matching the CN of the client certificate. (If more than
one user matches, authorization fails.)
- The client's PGUSER will be compared with the list of postgresRole
attributes belonging to that LDAP user, and if one matches,
authorization succeeds.
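
The steps above can be sketched in a few lines of Python. This is only an illustration of the matching logic, not code from the patchset: the DIRECTORY dict stands in for the LDAP server, the toy search() understands only (uid=NAME) filters, and every function name here is hypothetical. The URL fields follow the usual LDAP URL layout (base DN, attributes, scope, filter).

```python
import re
from urllib.parse import unquote

# Hypothetical stand-in for the LDAP directory: uid -> postgresRole values.
DIRECTORY = {
    "alice": ["guest", "admin"],
    "bob": ["guest"],
}

# The two right-hand fields of the sample pg_ident.conf line.
IDENT_PATTERN = r"^(.*)$"
URL_TEMPLATE = r"ldap://example.com/dc=example,dc=com?postgresRole?sub?(uid=\1)"


def expand_query(system_user):
    """Substitute the regex capture for \\1 in the URL template."""
    m = re.match(IDENT_PATTERN, system_user)
    if m is None:
        return None
    return URL_TEMPLATE.replace(r"\1", m.group(1))


def split_ldap_url(url):
    """Split an LDAP URL into host, base DN, attributes, scope and filter."""
    scheme, rest = url.split("://", 1)
    hostport, _, tail = rest.partition("/")
    parts = tail.split("?")
    parts += [""] * (4 - len(parts))
    dn, attrs, scope, filt = parts[:4]
    return {
        "scheme": scheme,
        "host": hostport,
        "base": unquote(dn),
        "attrs": attrs.split(",") if attrs else [],
        "scope": scope or "base",
        "filter": unquote(filt) or "(objectClass=*)",
    }


def search(query):
    """Toy search: only understands filters of the form (uid=NAME)."""
    m = re.fullmatch(r"\(uid=([^()]*)\)", query["filter"])
    if m is None or m.group(1) not in DIRECTORY:
        return []
    return [{"postgresRole": DIRECTORY[m.group(1)]}]


def authorize(system_user, requested_role):
    """Return True if requested_role is among the user's role attributes."""
    url = expand_query(system_user)
    if url is None:
        return False
    query = split_ldap_url(url)
    entries = search(query)
    if len(entries) != 1:  # exactly one entry must match
        return False
    allowed = [v for attr in query["attrs"] for v in entries[0].get(attr, [])]
    return requested_role in allowed
```

With this toy directory, authorize("alice", "admin") succeeds, while authorize("bob", "admin") and a lookup for an unknown user both fail.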

= Areas for Improvement =

I think it would be nice to support LDAP group membership in addition
to object attributes.

Settings for the LDAP connection are currently spread between pg_hba,
pg_ident, and environment variables like LDAPTLS_CERT. I made the
situation worse by allowing the pg_ident query to contain a scheme,
host, and port. That makes it seem like you could send different users
to different LDAP servers, but since they would all have to share
exactly the same TLS settings anyway, I think this was a mistake on my
part.

That mistake aside, I think the current URL query syntax is powerful
but unintuitive. I would rather see that as an option for power users,
and let other people just specify the user filter and role attribute
separately. And there needs to be more logging around the feature, to
help debug problems.

Regex substitution of user-controlled data into an LDAP query is
perilous, and I don't like it. For now I have restricted the allowed
characters as a first mitigation.
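
To make the risk concrete, here is a minimal Python sketch of two common mitigations: an allowlist check on the incoming name, and escaping of the characters that are special in LDAP search filters (backslash, asterisk, parentheses, NUL). The allowlist shown is a hypothetical choice, not the set of characters the patch actually permits, and the function names are made up for illustration. A value placed inside an LDAP URL would additionally need percent-encoding, which is not shown.

```python
import re

# Characters that must be escaped inside an LDAP filter assertion value.
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}

# Hypothetical allowlist; the real patch may allow a different set.
_SAFE_NAME = re.compile(r"[A-Za-z0-9._@-]+")


def escape_filter_value(value):
    """Escape filter metacharacters so the value is treated literally."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def is_allowed_name(value):
    """Reject any name containing characters outside the allowlist."""
    return _SAFE_NAME.fullmatch(value) is not None


def build_uid_filter(user):
    """Build a (uid=...) filter with the user value escaped."""
    return "(uid=" + escape_filter_value(user) + ")"
```

Without escaping, a name such as `*)(uid=*` would change the structure of the filter; with it, the same input becomes a literal (and almost certainly non-matching) value.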

Is it safe to use listen_addresses in the test suite, as I have done,
as long as the HBA requires authentication? Or is that reopening a
security hole? I seem to recall discussion on this but my search-fu has
failed me.

There's a lot of code duplication in the current patchset that would
need to be undone.

...and more; see TODOs in the patches if you're interested.

= Patch Roadmap =

- 0001 fixes error messages that are printed when ldap_url_parse()
fails. Since the pg_ident queries use LDAP URLs, and it's easy to get
them wrong, that fix is particularly important for this patchset. But I
think it could potentially be applied separately.

- 0002 implements the "ldapmap" HBA option and enables the ldaptls,
ldapbinddn, and ldapbindpasswd options for it. It also adds
corresponding tests to the LDAP suite.

- 0003 tests the use of client certificates via LDAP environment
variables. (This is already supported today but I didn't see any
coverage, which will be important for the last patch.)

- 0004 implements the "ldapsaslmechs" HBA option and adds enough SASL
support for at least the EXTERNAL and SCRAM-* mechanisms. Others may
work but I haven't tested them. This feature is available only if you
have the <sasl/sasl.h> header on your system at build time.

WDYT? (My responses here will be slower than usual. Hope you all have a
great end to the year!)

--Jacob

[1] https://www.postgresql.org/message-id/flat/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
[2] https://www.postgresql.org/message-id/flat/1a61806047c536e7528b943d0cfe12608118ca31.camel@vmware.com

Re: [PoC] Delegating pg_ident to a third party

From: Peter Eisentraut
Date: 17 December 2021, 09:06:05

On 17.12.21 00:48, Jacob Champion wrote:
> WDYT? (My responses here will be slower than usual. Hope you all have a
> great end to the year!)

Looks interesting.  I wonder whether putting this into pg_ident.conf is 
sensible.  I suspect people will want to eventually add more features 
around this, like automatically creating roles or role memberships, at 
which point pg_ident.conf doesn't seem appropriate anymore.  Should we 
have a new file for this?  Do you have any further ideas?

Re: [PoC] Delegating pg_ident to a third party

From: Jacob Champion
Date: 03 January 2022, 16:46:16

On Fri, 2021-12-17 at 10:06 +0100, Peter Eisentraut wrote:
> On 17.12.21 00:48, Jacob Champion wrote:
> > WDYT? (My responses here will be slower than usual. Hope you all have a
> > great end to the year!)
> 
> Looks interesting.  I wonder whether putting this into pg_ident.conf is 
> sensible.  I suspect people will want to eventually add more features 
> around this, like automatically creating roles or role memberships, at 
> which point pg_ident.conf doesn't seem appropriate anymore.

Yeah, pg_ident is getting too cramped for this.

> Should we have a new file for this?  Do you have any further ideas?

My experience with these configs is mostly limited to HTTP servers.
That said, it's pretty hard to beat the flexibility of arbitrary key-
value pairs inside nested contexts. It's nice to be able to say things
like

    Everyone has to use LDAP auth
    With this server
    And these TLS settings

    Except admins
        who additionally need client certificates
        with this CA root

    And Jacob
        who isn't allowed in anymore

Are there any existing discussions along these lines that I should take
a look at?

--Jacob

Re: [PoC] Delegating pg_ident to a third party

From: Stephen Frost
Date: 03 January 2022, 17:36:05

Greetings,

* Jacob Champion (pchampion@vmware.com) wrote:
> On Fri, 2021-12-17 at 10:06 +0100, Peter Eisentraut wrote:
> > On 17.12.21 00:48, Jacob Champion wrote:
> > > WDYT? (My responses here will be slower than usual. Hope you all have a
> > > great end to the year!)
> >
> > Looks interesting.  I wonder whether putting this into pg_ident.conf is
> > sensible.  I suspect people will want to eventually add more features
> > around this, like automatically creating roles or role memberships, at
> > which point pg_ident.conf doesn't seem appropriate anymore.

This is the part that I really wonder about also ... I've always viewed
pg_ident as being intended mainly for one-to-one kind of mappings and
not the "map a bunch of different users into the same role" that this
advocated for.  Being able to have roles and memberships automatically
created is much more the direction that I'd say we should be going in,
so that in-database auditing has an actual user to go on and not some
generic role that could be any number of people.

I'd go a step further and suggest that the way to do this is with a
background worker that's started up and connects to an LDAP
infrastructure and listens for changes, allowing the system to pick up
on new roles/memberships as soon as they're created in the LDAP
environment.  That would then be controlled by appropriate settings in
postgresql.conf/.auto.conf.

> Yeah, pg_ident is getting too cramped for this.

All that said, I do see how having the ability to call out to another
system for mappings may be useful, so I'm not sure that we shouldn't
consider this specific change and have it be specifically just for
mappings, in which case pg_ident seems appropriate.

> > Should we have a new file for this?  Do you have any further ideas?
>
> My experience with these configs is mostly limited to HTTP servers.
> That said, it's pretty hard to beat the flexibility of arbitrary key-
> value pairs inside nested contexts. It's nice to be able to say things
> like
>
>     Everyone has to use LDAP auth
>     With this server
>     And these TLS settings
>
>     Except admins
>         who additionally need client certificates
>         with this CA root
>
>     And Jacob
>         who isn't allowed in anymore

I certainly don't think we should have this be limited to LDAP auth-
such an external mapping ability is suitable for any authentication
method that supports a mapping (thinking specifically of GSSAPI, of
course..).  Not sure if that's what was meant above but did want to
make sure that was clear.  The rest looks a lot more like pg_hba or
perhaps in-database privileges like roles/memberships existing or not
and CONNECT rights.  I'm not really sold on the idea of adding yet even
more different ways to control authorization.

Thanks,

Stephen

Re: [PoC] Delegating pg_ident to a third party

From: Jacob Champion
Date: 03 January 2022, 18:29:26

On Mon, 2022-01-03 at 12:36 -0500, Stephen Frost wrote:
> * Jacob Champion (pchampion@vmware.com) wrote:
> > On Fri, 2021-12-17 at 10:06 +0100, Peter Eisentraut wrote:
> > > On 17.12.21 00:48, Jacob Champion wrote:
> > > > WDYT? (My responses here will be slower than usual. Hope you all have a
> > > > great end to the year!)
> > > 
> > > Looks interesting.  I wonder whether putting this into pg_ident.conf is 
> > > sensible.  I suspect people will want to eventually add more features 
> > > around this, like automatically creating roles or role memberships, at 
> > > which point pg_ident.conf doesn't seem appropriate anymore.
> 
> This is the part that I really wonder about also ... I've always viewed
> pg_ident as being intended mainly for one-to-one kind of mappings and
> not the "map a bunch of different users into the same role" that this
> advocated for.  Being able to have roles and memberships automatically
> created is much more the direction that I'd say we should be going in,
> so that in-database auditing has an actual user to go on and not some
> generic role that could be any number of people.

That last point was my motivation for the authn_id patch [1] -- so that
auditing could see the actual user _and_ the generic role. The
information is already there to be used, it's just not exposed to the
stats framework yet.

Forcing one role per individual end user is wasteful and isn't really
making good use of the role-based system that you already have.
Generally speaking, when administering hundreds or thousands of users,
people start dividing them up into groups as opposed to dealing with
them individually. So I don't think new features should be taking away
flexibility in this area -- if one role per user already works well for
you, great, but don't make everyone do the same.

> I'd go a step further and suggest that the way to do this is with a
> background worker that's started up and connects to an LDAP
> infrastructure and listens for changes, allowing the system to pick up
> on new roles/memberships as soon as they're created in the LDAP
> environment.  That would then be controlled by appropriate settings in
> postgresql.conf/.auto.conf.

This is roughly what you can already do with existing (third-party)
tools, and that approach isn't scaling out in practice for some of our
existing customers. The load on the central server, for thousands of
idle databases dialing in just to see if there are any new users, is
huge.

> All that said, I do see how having the ability to call out to another
> system for mappings may be useful, so I'm not sure that we shouldn't
> consider this specific change and have it be specifically just for
> mappings, in which case pg_ident seems appropriate.

Yeah, this PoC was mostly an increment on the functionality that
already existed. The division between what goes in pg_hba and what goes
in pg_ident is starting to blur with this patchset, though, and I think
Peter's point is sound.

> I certainly don't think we should have this be limited to LDAP auth-
> such an external mapping ability is suitable for any authentication
> method that supports a mapping (thinking specifically of GSSAPI, of
> course..).  Not sure if that's what was meant above but did want to
> make sure that was clear.

You can't use usermaps with LDAP auth yet, so no, that's not what I
meant. (I have another patch for that feature in commitfest, which
would allow these two things to be used together.)

Thanks,
--Jacob

[1] https://www.postgresql.org/message-id/flat/E1lTwp4-0002l4-L9%40gemulon.postgresql.org

Re: [PoC] Delegating pg_ident to a third party

From: Stephen Frost
Date: 04 January 2022, 00:42:32

Greetings,

* Jacob Champion (pchampion@vmware.com) wrote:
> On Mon, 2022-01-03 at 12:36 -0500, Stephen Frost wrote:
> > * Jacob Champion (pchampion@vmware.com) wrote:
> > > On Fri, 2021-12-17 at 10:06 +0100, Peter Eisentraut wrote:
> > > > On 17.12.21 00:48, Jacob Champion wrote:
> > > > > WDYT? (My responses here will be slower than usual. Hope you all have a
> > > > > great end to the year!)
> > > >
> > > > Looks interesting.  I wonder whether putting this into pg_ident.conf is
> > > > sensible.  I suspect people will want to eventually add more features
> > > > around this, like automatically creating roles or role memberships, at
> > > > which point pg_ident.conf doesn't seem appropriate anymore.
> >
> > This is the part that I really wonder about also ... I've always viewed
> > pg_ident as being intended mainly for one-to-one kind of mappings and
> > not the "map a bunch of different users into the same role" that this
> > advocated for.  Being able to have roles and memberships automatically
> > created is much more the direction that I'd say we should be going in,
> > so that in-database auditing has an actual user to go on and not some
> > generic role that could be any number of people.
>
> That last point was my motivation for the authn_id patch [1] -- so that
> auditing could see the actual user _and_ the generic role. The
> information is already there to be used, it's just not exposed to the
> stats framework yet.

While that helps, and I generally support adding that information to the
logs, it's certainly not nearly as good or useful as having the actual
user known to the database.

> Forcing one role per individual end user is wasteful and isn't really
> making good use of the role-based system that you already have.
> Generally speaking, when administering hundreds or thousands of users,
> people start dividing them up into groups as opposed to dealing with
> them individually. So I don't think new features should be taking away
> flexibility in this area -- if one role per user already works well for
> you, great, but don't make everyone do the same.

Using the role system we have to assign privileges certainly is useful
and sensible, of course, though I don't see where you've actually made
an argument for why one role per individual is somehow wasteful or
somehow takes away from the role system that we have for granting
rights.  I'm also not suggesting that we make everyone do the same
thing, indeed, later on I was supportive of having an external system
provide the mapping.  Here, I'm just making the point that we should
also be looking at automatic role/membership creation.

> > I'd go a step further and suggest that the way to do this is with a
> > background worker that's started up and connects to an LDAP
> > infrastructure and listens for changes, allowing the system to pick up
> > on new roles/memberships as soon as they're created in the LDAP
> > environment.  That would then be controlled by appropriate settings in
> > postgresql.conf/.auto.conf.
>
> This is roughly what you can already do with existing (third-party)
> tools, and that approach isn't scaling out in practice for some of our
> existing customers. The load on the central server, for thousands of
> idle databases dialing in just to see if there are any new users, is
> huge.

If you're referring specifically to cron-based tools which are
constantly hammering on the LDAP servers running the same queries over
and over, sure, I agree that that's creating load on the LDAP
infrastructure (though, well, it was kind of designed to be very
scalable for exactly that kind of load, no?  So I'm not really sure why
that's such an issue..).  That's also why I specifically wasn't
suggesting that and was instead suggesting that we have something that's
connected to one of the (hopefully, many, many) LDAP servers and is
doing change monitoring, allowing changes to be pushed down to PG,
rather than cronjobs constantly running the same queries and re-checking
things over and over.  I appreciate that that's also not free, but I
don't believe it's nearly as bad as the cron-based approach and it's
certainly something that an LDAP infrastructure should be really rather
good at.

> > All that said, I do see how having the ability to call out to another
> > system for mappings may be useful, so I'm not sure that we shouldn't
> > consider this specific change and have it be specifically just for
> > mappings, in which case pg_ident seems appropriate.
>
> Yeah, this PoC was mostly an increment on the functionality that
> already existed. The division between what goes in pg_hba and what goes
> in pg_ident is starting to blur with this patchset, though, and I think
> Peter's point is sound.

This part I tend to disagree with- pg_ident for mappings and for ways to
call out to other systems to provide those mappings strikes me as
entirely appropriate and doesn't blur the lines and that's really what
this patch seems to be primarily about.  Peter noted that there might be
other things we want to do and argued that those might not be
appropriate in pg_ident, which I tend to agree with, but I don't think
we need to invent something entirely new for mappings when we have
pg_ident already.

When it comes to the question of "how to connect to an LDAP server for
$whatever", it seems like it'd be nice to be able to configure that once
and reuse that configuration.  Not sure I have a great suggestion for
how to do that.  The approach this patch takes of adding options to
pg_hba for that, just like other options in pg_hba do, strikes me as
pretty reasonable.  I would advocate for other methods to work when it
comes to authenticating to LDAP from PG though (such as GSSAPI, in
particular, of course...).

> > I certainly don't think we should have this be limited to LDAP auth-
> > such an external mapping ability is suitable for any authentication
> > method that supports a mapping (thinking specifically of GSSAPI, of
> > course..).  Not sure if that's what was meant above but did want to
> > make sure that was clear.
>
> You can't use usermaps with LDAP auth yet, so no, that's not what I
> meant. (I have another patch for that feature in commitfest, which
> would allow these two things to be used together.)

Yes, I'm aware of the other patch, just wanted to make sure the intent
is for this to work for all map-supporting auth methods.  Figured that
was the case but the examples in the prior email had me concerned and
just wanted to make sure.

Thanks,

Stephen

Re: [PoC] Delegating pg_ident to a third party

From: Jacob Champion
Date: 04 January 2022, 23:56:02

On Mon, 2022-01-03 at 19:42 -0500, Stephen Frost wrote:
> * Jacob Champion (pchampion@vmware.com) wrote:
> > 
> > That last point was my motivation for the authn_id patch [1] -- so that
> > auditing could see the actual user _and_ the generic role. The
> > information is already there to be used, it's just not exposed to the
> > stats framework yet.
> 
> While that helps, and I generally support adding that information to the
> logs, it's certainly not nearly as good or useful as having the actual
> user known to the database.

Could you talk more about the use cases for which having the "actual
user" is better? From an auditing perspective I don't see why
"authenticated as jacob@example.net, logged in as admin" is any worse
than "logged in as jacob".

> > Forcing one role per individual end user is wasteful and isn't really
> > making good use of the role-based system that you already have.
> > Generally speaking, when administering hundreds or thousands of users,
> > people start dividing them up into groups as opposed to dealing with
> > them individually. So I don't think new features should be taking away
> > flexibility in this area -- if one role per user already works well for
> > you, great, but don't make everyone do the same.
> 
> Using the role system we have to assign privileges certainly is useful
> and sensible, of course, though I don't see where you've actually made
> an argument for why one role per individual is somehow wasteful or
> somehow takes away from the role system that we have for granting
> rights. 

I was responding more to your statement that "Being able to have roles
and memberships automatically created is much more the direction that
I'd say we should be going in". It's not that one-role-per-user is
inherently wasteful, but forcing role proliferation where it's not
needed is. If all users have the same set of permissions, there doesn't
need to be more than one role. But see below.

> I'm also not suggesting that we make everyone do the same
> thing, indeed, later on I was supportive of having an external system
> provide the mapping.  Here, I'm just making the point that we should
> also be looking at automatic role/membership creation.

Gotcha. Agreed; that would open up the ability to administer role
privileges externally too, which would be cool. That could be used in
tandem with something like this patchset.

> > > I'd go a step further and suggest that the way to do this is with a
> > > background worker that's started up and connects to an LDAP
> > > infrastructure and listens for changes, allowing the system to pick up
> > > on new roles/memberships as soon as they're created in the LDAP
> > > environment.  That would then be controlled by appropriate settings in
> > > postgresql.conf/.auto.conf.
> > 
> > This is roughly what you can already do with existing (third-party)
> > tools, and that approach isn't scaling out in practice for some of our
> > existing customers. The load on the central server, for thousands of
> > idle databases dialing in just to see if there are any new users, is
> > huge.
> 
> If you're referring specifically to cron-based tools which are
> constantly hammering on the LDAP servers running the same queries over
> and over, sure, I agree that that's creating load on the LDAP
> infrastructure (though, well, it was kind of designed to be very
> scalable for exactly that kind of load, no?  So I'm not really sure why
> that's such an issue..).

I don't have hands-on experience here -- just going on what I've been
told via field/product teams -- but it seems to me that there's a big
difference between asking an LDAP server to give you information on a
user at the time that user logs in, and asking it to give a list of
_all_ users to every single Postgres instance you have on a regular
timer. The latter is what seems to be problematic.
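
The difference between the two access patterns can be put as back-of-the-envelope arithmetic. The Python sketch below is an illustration only: the function names are made up, and the deployment figures in the note that follows are hypothetical placeholders, not measurements from any real deployment.

```python
def sync_entries_per_day(instances, directory_users, syncs_per_day):
    """Entries returned per day when every instance lists all users on a timer."""
    return instances * directory_users * syncs_per_day


def login_entries_per_day(logins_per_day):
    """Entries returned per day when each login triggers one single-user lookup."""
    return logins_per_day
```

For example, with an assumed 1000 instances, 5000 directory users, and an hourly sync, the timer-based approach returns 120,000,000 entries per day regardless of activity, while login-time lookups scale only with the number of logins that actually happen.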

> That's also why I specifically wasn't
> suggesting that and was instead suggesting that we have something that's
> connected to one of the (hopefully, many, many) LDAP servers and is
> doing change monitoring, allowing changes to be pushed down to PG,
> rather than cronjobs constantly running the same queries and re-checking
> things over and over.  I appreciate that that's also not free, but I
> don't believe it's nearly as bad as the cron-based approach and it's
> certainly something that an LDAP infrastructure should be really rather
> good at.

I guess I'd have to see an implementation -- I was under the impression
that persistent search wasn't widely implemented?

> > > All that said, I do see how having the ability to call out to another
> > > system for mappings may be useful, so I'm not sure that we shouldn't
> > > consider this specific change and have it be specifically just for
> > > mappings, in which case pg_ident seems appropriate.
> > 
> > Yeah, this PoC was mostly an increment on the functionality that
> > already existed. The division between what goes in pg_hba and what goes
> > in pg_ident is starting to blur with this patchset, though, and I think
> > Peter's point is sound.
> 
> This part I tend to disagree with- pg_ident for mappings and for ways to
> call out to other systems to provide those mappings strikes me as
> entirely appropriate and doesn't blur the lines and that's really what
> this patch seems to be primarily about.  Peter noted that there might be
> other things we want to do and argued that those might not be
> appropriate in pg_ident, which I tend to agree with, but I don't think
> we need to invent something entirely new for mappings when we have
> pg_ident already.

The current patchset here has pieces of what is usually contained in
HBA (the LDAP host/port/base/filter/etc.) effectively moved into
pg_ident, while other pieces (TLS settings) remain in the HBA and the
environment. That's what I'm referring to. If that is workable for you
in the end, that's fine, but for me it'd be much easier to maintain if
the mapping query and the LDAP connection settings for that mapping
query were next to each other.

> When it comes to the question of "how to connect to an LDAP server for
> $whatever", it seems like it'd be nice to be able to configure that once
> and reuse that configuration.  Not sure I have a great suggestion for
> how to do that. The approach this patch takes of adding options to
> pg_hba for that, just like other options in pg_hba do, strikes me as
> pretty reasonable.

Right. That part seems less reasonable to me, given the current format
of the HBA. YMMV.

> I would advocate for other methods to work when it comes to
> authenticating to LDAP from PG though (such as GSSAPI, in particular,
> of course...).

I can take a look at the Cyrus requirements for the GSSAPI mechanism.
Might be tricky to add tests for it, though. Any others you're
interested in?

> > > I certainly don't think we should have this be limited to LDAP auth-
> > > such an external mapping ability is suitable for any authentication
> > > method that supports a mapping (thinking specifically of GSSAPI, of
> > > course..).  Not sure if that's what was meant above but did want to
> > > make sure that was clear.
> > 
> > You can't use usermaps with LDAP auth yet, so no, that's not what I
> > meant. (I have another patch for that feature in commitfest, which
> > would allow these two things to be used together.)
> 
> Yes, I'm aware of the other patch, just wanted to make sure the intent
> is for this to work for all map-supporting auth methods.  Figured that
> was the case but the examples in the prior email had me concerned and
> just wanted to make sure.

Correct. The new tests use cert auth, for instance.

Thanks,
--Jacob

Re: [PoC] Delegating pg_ident to a third party

From: Stephen Frost
Date: 05 January 2022, 03:24:58

Greetings,

On Tue, Jan 4, 2022 at 18:56 Jacob Champion <pchampion@vmware.com> wrote:

> On Mon, 2022-01-03 at 19:42 -0500, Stephen Frost wrote:
> > * Jacob Champion (pchampion@vmware.com) wrote:
> > >
> > > That last point was my motivation for the authn_id patch [1] -- so that
> > > auditing could see the actual user _and_ the generic role. The
> > > information is already there to be used, it's just not exposed to the
> > > stats framework yet.
> >
> > While that helps, and I generally support adding that information to the
> > logs, it's certainly not nearly as good or useful as having the actual
> > user known to the database.
>
> Could you talk more about the use cases for which having the "actual
> user" is better? From an auditing perspective I don't see why
> "authenticated as jacob@example.net, logged in as admin" is any worse
> than "logged in as jacob".

The above case isn’t what we are talking about, as far as I understand
anyway. You’re suggesting “authenticated as jacob@example.net, logged
in as sales” where the user in the database is “sales”. Consider
triggers which only have access to “sales”, or a tool like pgaudit
which only has access to “sales”. Who was it in sales that updated that
record though? We don’t know- we would have to go try to figure it out
from the logs, but even if we had time stamps on the row update, there
could be 50 sales people logged in at overlapping times.

> > > Forcing one role per individual end user is wasteful and isn't really
> > > making good use of the role-based system that you already have.
> > > Generally speaking, when administering hundreds or thousands of users,
> > > people start dividing them up into groups as opposed to dealing with
> > > them individually. So I don't think new features should be taking away
> > > flexibility in this area -- if one role per user already works well for
> > > you, great, but don't make everyone do the same.
> >
> > Using the role system we have to assign privileges certainly is useful
> > and sensible, of course, though I don't see where you've actually made
> > an argument for why one role per individual is somehow wasteful or
> > somehow takes away from the role system that we have for granting
> > rights.
>
> I was responding more to your statement that "Being able to have roles
> and memberships automatically created is much more the direction that
> I'd say we should be going in". It's not that one-role-per-user is
> inherently wasteful, but forcing role proliferation where it's not
> needed is. If all users have the same set of permissions, there doesn't
> need to be more than one role. But see below.

Just saying it’s wasteful isn’t actually saying what is wasteful about it.

> I'm also not suggesting that we make everyone do the same
> thing, indeed, later on I was supportive of having an external system
> provide the mapping. Here, I'm just making the point that we should
> also be looking at automatic role/membership creation.

Gotcha. Agreed; that would open up the ability to administer role
privileges externally too, which would be cool. That could be used in
tandem with something like this patchset.

Not sure exactly what you’re referring to here by “administer role privileges externally too”..? Curious to hear what you are imagining specifically.

> > > I'd go a step further and suggest that the way to do this is with a
> > > background worker that's started up and connects to an LDAP
> > > infrastructure and listens for changes, allowing the system to pick up
> > > on new roles/memberships as soon as they're created in the LDAP
> > > environment. That would then be controlled by appropriate settings in
> > > postgresql.conf/.auto.conf.
> >
> > This is roughly what you can already do with existing (third-party)
> > tools, and that approach isn't scaling out in practice for some of our
> > existing customers. The load on the central server, for thousands of
> > idle databases dialing in just to see if there are any new users, is
> > huge.
>
> If you're referring specifically to cron-based tools which are
> constantly hammering on the LDAP servers running the same queries over
> and over, sure, I agree that that's creating load on the LDAP
> infrastructure (though, well, it was kind of designed to be very
> scalable for exactly that kind of load, no? So I'm not really sure why
> that's such an issue..).

I don't have hands-on experience here -- just going on what I've been
told via field/product teams -- but it seems to me that there's a big
difference between asking an LDAP server to give you information on a
user at the time that user logs in, and asking it to give a list of
_all_ users to every single Postgres instance you have on a regular
timer. The latter is what seems to be problematic.
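
A rough back-of-the-envelope sketch of that difference, in Python. Every number below is an illustrative assumption, not a measurement from any deployment; the point is only that one workload scales with instances times directory size, and the other scales with actual logins:

```python
def sync_entries_per_day(instances, directory_users, syncs_per_day):
    """Directory entries returned per day when every instance re-reads
    the full user list on a timer."""
    return instances * directory_users * syncs_per_day


def login_entries_per_day(logins_per_day, lookups_per_login=1):
    """Directory entries returned per day when a lookup happens only
    when a user actually logs in."""
    return logins_per_day * lookups_per_login


# Assumed figures: 1000 mostly idle instances, 10000 directory users,
# an hourly sync, and 50000 logins per day across the whole fleet.
periodic = sync_entries_per_day(1000, 10_000, 24)
on_login = login_entries_per_day(50_000)
print(periodic, on_login)
```

Change-notification approaches, discussed later in the thread, aim to remove the repeated full reads without moving the lookup into the login path.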

And to be clear, I agree that’s not good (though, again, really, your ldap infrastructure shouldn’t be having all that much trouble with it- you can scale those out verryyyy far, and far more easily than a relational database..).

I’d also point out though that having to do an ldap lookup on every login to PG is *already* an issue in some environments, having to do multiple amplifies that. Not to mention that when the ldap servers can’t be reached for some reason, no one can log into the database and that’s rather unfortunate too. These are, of course, arguments for moving away from methods that require checking with some other system synchronously during login- which is another reason why it’s better to have the authentication credentials easily map to the PG role, without the need for external checks at login time. That’s done with today’s pg_ident, but this patch would change that.

Consider the approach I continue to advocate- GSSAPI based authentication, where a user only needs to contact the Kerberos server perhaps every 8 hours or so for an updated ticket but otherwise can authorize directly to PG using their existing ticket and credentials, where their role was previously created and their memberships already exist thanks to a background worker whose job it is to handle that and which deals with transient network failures or other issues. In this world, most logins to PG don’t require any other system to be involved besides the client, the PG server, and the networking between them; perhaps DNS if things aren’t cached on the client.

On the other hand, to use ldap authentication (which also happens to be demonstrably insecure without any reasonable way to fix that), with an ldap mapping setup, requires two logins to an ldap server every single time a user logs into PG and if the ldap environment is offline or overloaded for whatever reason, the login fails or takes an excessively long amount of time.

> That's also why I specifically wasn't
> suggesting that and was instead suggesting that we have something that's
> connected to one of the (hopefully, many, many) LDAP servers and is
> doing change monitoring, allowing changes to be pushed down to PG,
> rather than cronjobs constantly running the same queries and re-checking
> things over and over. I appreciate that that's also not free, but I
> don't believe it's nearly as bad as the cron-based approach and it's
> certainly something that an LDAP infrastructure should be really rather
> good at.

I guess I'd have to see an implementation -- I was under the impression
that persistent search wasn't widely implemented?

I mean … let’s talk about the one that really matters here:

https://docs.microsoft.com/en-us/windows/win32/ad/change-notifications-in-active-directory-domain-services

OpenLDAP has an audit log system which can be used though it’s certainly not as nice and would require code specific to it.

This talks a bit about other directories:

https://docs.informatica.com/data-integration/powerexchange-adapters-for-powercenter/10-1/powerexchange-for-ldap-user-guide-for-powercenter/ldap-sessions/configuring-change-data-capture/methods-for-tracking-changes-in-different-directories.html

I do wish they all supported it cleanly in the same way.

> > > All that said, I do see how having the ability to call out to another
> > > system for mappings may be useful, so I'm not sure that we shouldn't
> > > consider this specific change and have it be specifically just for
> > > mappings, in which case pg_ident seems appropriate.
> >
> > Yeah, this PoC was mostly an increment on the functionality that
> > already existed. The division between what goes in pg_hba and what goes
> > in pg_ident is starting to blur with this patchset, though, and I think
> > Peter's point is sound.
>
> This part I tend to disagree with- pg_ident for mappings and for ways to
> call out to other systems to provide those mappings strikes me as
> entirely appropriate and doesn't blur the lines and that's really what
> this patch seems to be primarily about. Peter noted that there might be
> other things we want to do and argued that those might not be
> appropriate in pg_ident, which I tend to agree with, but I don't think
> we need to invent something entirely new for mappings when we have
> pg_ident already.

The current patchset here has pieces of what is usually contained in
HBA (the LDAP host/port/base/filter/etc.) effectively moved into
pg_ident, while other pieces (TLS settings) remain in the HBA and the
environment. That's what I'm referring to. If that is workable for you
in the end, that's fine, but for me it'd be much easier to maintain if
the mapping query and the LDAP connection settings for that mapping
query were next to each other.

I can agree with the point that it would be nicer to have the ldap host/port/base/filter be in the hba instead, if there is a way to accomplish that reasonably. Did you have a suggestion in mind for how to do that..? If there’s an alternative approach to consider, it’d be useful to see them next to each other and then we could all contemplate which is better.

> When it comes to the question of "how to connect to an LDAP server for
> $whatever", it seems like it'd be nice to be able to configure that once
> and reuse that configuration. Not sure I have a great suggestion for
> how to do that. The approach this patch takes of adding options to
> pg_hba for that, just like other options in pg_hba do, strikes me as
> pretty reasonable.

Right. That part seems less reasonable to me, given the current format
of the HBA. YMMV.

If the ldap connection info and filters and such could all exist in the hba, then perhaps a way to define those credentials in one place in the hba file and then use them on other lines would be possible..? Seems like that would be easier than having them also in the ident or having the ident refer to something defined elsewhere.

Consider in the hba having:

LDAPSERVER[ldap1]="ldaps://whatever other options go here"

Then later:

hostssl all all ::0/0 ldap ldapserver=ldap1 ldapmapserver=ldap1 map=myldapmap

Clearly more thought is needed due to different requirements for ldap authentication vs. the map, but still, the general idea being to have all of it in the hba and then a way to define ldap server configuration in the hba once and then reuse it.
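
As a rough illustration of the define-once, reference-by-name idea, here is a small Python sketch. The LDAPSERVER[...] syntax is only the proposal from this message and does not exist in PostgreSQL; the resolver, its name, and the sample URL are assumptions made for the example. Note that in this hypothetical grammar ldapserver= holds a reference name rather than a host:

```python
import re

# Hypothetical definition line: LDAPSERVER[name]="connection string"
DEF_RE = re.compile(r'^LDAPSERVER\[(\w+)\]="([^"]*)"$')

# Options whose value is treated as a reference to a named definition.
REF_KEYS = ("ldapserver", "ldapmapserver")


def resolve(lines):
    """Expand named LDAP server references in HBA-like lines.

    Returns one dict of key=value options per rule line. A definition
    must appear before the first rule that uses it.
    """
    servers = {}
    rules = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = DEF_RE.match(line)
        if match:
            servers[match.group(1)] = match.group(2)
            continue
        options = dict(f.split("=", 1) for f in line.split() if "=" in f)
        for key in REF_KEYS:
            name = options.get(key)
            if name is None:
                continue
            if name not in servers:
                raise ValueError(f"undefined LDAP server {name!r}")
            options[key] = servers[name]
        rules.append(options)
    return rules


hba = [
    'LDAPSERVER[ldap1]="ldaps://ldap.example.net"',
    "hostssl all all ::0/0 ldap ldapserver=ldap1 ldapmapserver=ldap1 map=myldapmap",
]
rules = resolve(hba)
print(rules[0])
```

The sketch ignores the positional HBA fields and any quoting rules; it only shows that both the authentication step and the map lookup could share one definition.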

> I would advocate for other methods to work when it comes to
> authenticating to LDAP from PG though (such as GSSAPI, in particular,
> of course...).

I can take a look at the Cyrus requirements for the GSSAPI mechanism.
Might be tricky to add tests for it, though. Any others you're
interested in?

GSSAPI is the main one … I suppose client side certificates would be nice too if that’s possible. I suspect some would like a way to have username/pw ldap credentials in some other file besides the hba, but that isn’t as interesting to me, at least.

> > > I certainly don't think we should have this be limited to LDAP auth-
> > > such an external mapping ability is suitable for any authentication
> > > method that supports a mapping (thinking specifically of GSSAPI, of
> > > course..). Not sure if that's what was meant above but did want to
> > > make sure that was clear.
> >
> > You can't use usermaps with LDAP auth yet, so no, that's not what I
> > meant. (I have another patch for that feature in commitfest, which
> > would allow these two things to be used together.)
>
> Yes, I'm aware of the other patch, just wanted to make sure the intent
> is for this to work for all map-supporting auth methods. Figured that
> was the case but the examples in the prior email had me concerned and
> just wanted to make sure.

Correct. The new tests use cert auth, for instance.

Great.

Thanks!

Stephen

Re: [PoC] Delegating pg_ident to a third party

From

Jacob Champion

Date:

08 January 2022, 00:32:58

On Tue, 2022-01-04 at 22:24 -0500, Stephen Frost wrote:
> On Tue, Jan 4, 2022 at 18:56 Jacob Champion <pchampion@vmware.com> wrote:
> > 
> > Could you talk more about the use cases for which having the "actual
> > user" is better? From an auditing perspective I don't see why
> > "authenticated as jacob@example.net, logged in as admin" is any worse
> > than "logged in as jacob".
> 
> The above case isn’t what we are talking about, as far as I
> understand anyway. You’re suggesting “authenticated as 
> jacob@example.net, logged in as sales” where the user in the database
> is “sales”.  Consider triggers which only have access to “sales”, or
> a tool like pgaudit which only has access to “sales”.

Okay. So an additional getter function in miscadmin.h, and surfacing
that function to trigger languages, are needed to make authn_id more
generally useful. Any other cases you can think of?

> > I was responding more to your statement that "Being able to have roles
> > and memberships automatically created is much more the direction that
> > I'd say we should be going in". It's not that one-role-per-user is
> > inherently wasteful, but forcing role proliferation where it's not
> > needed is. If all users have the same set of permissions, there doesn't
> > need to be more than one role. But see below.
> 
> Just saying it’s wasteful isn’t actually saying what is wasteful about it.

Well, I felt like it was irrelevant; you've already said you have no
intention to force one-user-per-role.

But to elaborate: *forcing* one-user-per-role is wasteful, because if I
have a thousand employees, and I want to give all my employees access
to a guest role in the database, then I have to administer a thousand
roles: maintaining them through dump/restores and pg_upgrades, auditing
them to figure out why Bob in Accounting somehow got a different
privilege GRANT than the rest of the users, adding new accounts,
purging old ones, maintaining the inevitable scripts that will result.

If none of the users need to be "special" in any way, that's all wasted
overhead. (If they do actually need to be special, then at least some
of that overhead becomes necessary. Otherwise it's waste.) You may be
able to mitigate the cost of the waste, or absorb the mitigations into
Postgres so that the user can't see the waste, or decide that the waste
is not costly enough to care about. It's still waste.

> > > I'm also not suggesting that we make everyone do the same
> > > thing, indeed, later on I was supportive of having an external system
> > > provide the mapping.  Here, I'm just making the point that we should
> > > also be looking at automatic role/membership creation.
> > 
> > Gotcha. Agreed; that would open up the ability to administer role
> > privileges externally too, which would be cool. That could be used in
> > tandem with something like this patchset.
> 
> Not sure exactly what you’re referring to here by “administer role
> privileges externally too”..?  Curious to hear what you are imagining
> specifically.

Just that it would be nice to centrally provision role GRANTs as well
as role membership, that's all. No specifics in mind, and I'm not even
sure if LDAP would be a helpful place to put that sort of config.

> I’d also point out though that having to do an ldap lookup on every
> login to PG is *already* an issue in some environments, having to do
> multiple amplifies that.

You can't use the LDAP auth method with this patch yet, so this concern
is based on code that doesn't exist. It's entirely possible that you
could do the role query as part of the first bound connection. If that
proves unworkable, then yes, I agree that it's a concern.

> Not to mention that when the ldap servers can’t be reached for some
> reason, no one can log into the database and that’s rather
> unfortunate too.

Assuming you have no caches, then yes. That might be a pretty good
argument for allowing ldapmap and map to be used together, actually, so
that you can have some critical users who can always log in as
"themselves" or "admin" or etc. Or maybe it's an argument for allowing
HBA to handle fallback methods of authentication.

Luckily I think it's pretty easy to communicate to LDAP users that if
*all* your login infrastructure goes down, you will no longer be able
to log in. They're probably used to that idea, if they haven't set up
any availability infra.

>  These are, of course, arguments for moving away from methods that
> require checking with some other system synchronously during login-
> which is another reason why it’s better to have the authentication
> credentials easily map to the PG role, without the need for external
> checks at login time.  That’s done with today’s pg_ident, but this
> patch would change that.

There are arguments for moving towards synchronous checks as well.
Central revocation of credentials (in timeframes shorter than ticket
expiration) is what comes to mind. Revocation is hard and usually
conflicts with the desire for availability.

What's "better" for me or you is not necessarily "better" overall; it's
all tradeoffs, all the time.

> Consider the approach I continue to advocate- GSSAPI based
> authentication, where a user only needs to contact the Kerberos
> server perhaps every 8 hours or so for an updated ticket but
> otherwise can authorize directly to PG using their existing ticket
> and credentials, where their role was previously created and their
> memberships already exist thanks to a background worker whose job it
> is to handle that and which deals with transient network failures or
> other issues. In this world, most logins to PG don’t require any
> other system to be involved besides the client, the PG server, and
> the networking between them; perhaps DNS if things aren’t cached on 
> the client.
> 
> On the other hand, to use ldap authentication (which also happens to
> be demonstrably insecure without any reasonable way to fix that),
> with an ldap mapping setup, requires two logins to an ldap server
> every single time a user logs into PG and if the ldap environment is
> offline or overloaded for whatever reason, the login fails or takes
> an excessively long amount of time.

The two systems have different architectures, and different security
properties, and you have me at a disadvantage in that you can see the
experimental code I have written and I cannot see the hypothetical code
in your head.

It sounds like I'm more concerned with the ability to have an online
central source of truth for access control, accepting that denial of
service may cause the system to fail shut; and you're more concerned
with availability in the face of network failure, accepting that denial
of service may cause the system to fail open. I think that's a design
decision that belongs to an end user.

The distributed availability problems you're describing are, in my
experience, typically solved by caching. With your not-yet-written
solution, the caching is built into Postgres, and it's on all of the
time, but may (see below) only actually perform well with Active
Directory. With my solution, any caching is optional, because it has to
be implemented/maintained external to Postgres, but because it's just
generic "LDAP caching" then it should be broadly compatible and we
don't have to maintain it. I can see arguments for and against both
approaches.
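
To illustrate the tradeoff, here is a minimal sketch of an optional cache placed in front of a role lookup, written in Python. It is not code from the patchset; the class, the exception, and the lookup callable are all hypothetical. Within the TTL, logins succeed even if the directory is unreachable, but a revocation is not noticed until the entry expires. After expiry, an unreachable directory makes the login fail shut:

```python
import time


class LookupUnavailable(Exception):
    """Raised by the lookup callable when the directory cannot be reached."""


class CachedRoleMap:
    def __init__(self, lookup, ttl_seconds, clock=time.monotonic):
        self._lookup = lookup      # callable: authn_id -> iterable of role names
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}           # authn_id -> (expiry time, frozenset of roles)

    def allowed_roles(self, authn_id):
        now = self._clock()
        hit = self._cache.get(authn_id)
        if hit is not None and hit[0] > now:
            # Fresh entry: no directory round trip, but possibly stale.
            return hit[1]
        # Expired or missing: ask the directory. If this raises
        # LookupUnavailable, the caller should reject the login.
        roles = frozenset(self._lookup(authn_id))
        self._cache[authn_id] = (now + self._ttl, roles)
        return roles
```

A TTL of zero degenerates to a synchronous check on every login; a very long TTL approaches the behavior of roles synced into the database ahead of time.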

> > I guess I'd have to see an implementation -- I was under the impression
> > that persistent search wasn't widely implemented?
> 
> I mean … let’s talk about the one that really matters here: 
> 
> https://docs.microsoft.com/en-us/windows/win32/ad/change-notifications-in-active-directory-domain-services

That would certainly be a useful thing to implement for deployments
that can use it. But my personal interest in writing "LDAP" code that
only works with AD is nil, at least in the short term.

(The continued attitude that Microsoft Active Directory is "the one
that really matters" is really frustrating. I have users on LDAP
without Active Directory. Postgres tests are written against OpenLDAP.)

> OpenLDAP has an audit log system which can be used though it’s
> certainly not as nice and would require code specific to it.
> 
> This talks a bit about other directories: 
>
> https://docs.informatica.com/data-integration/powerexchange-adapters-for-powercenter/10-1/powerexchange-for-ldap-user-guide-for-powercenter/ldap-sessions/configuring-change-data-capture/methods-for-tracking-changes-in-different-directories.html
> 
> I do wish they all supported it cleanly in the same way.

Okay. But the answer to "is persistent search widely implemented?"
appears to be "No."

> > The current patchset here has pieces of what is usually contained in
> > HBA (the LDAP host/port/base/filter/etc.) effectively moved into
> > pg_ident, while other pieces (TLS settings) remain in the HBA and the
> > environment. That's what I'm referring to. If that is workable for you
> > in the end, that's fine, but for me it'd be much easier to maintain if
> > the mapping query and the LDAP connection settings for that mapping
> > query were next to each other.
> 
> I can agree with the point that it would be nicer to have the ldap
> host/port/base/filter be in the hba instead, if there is a way to
> accomplish that reasonably. Did you have a suggestion in mind for how
> to do that..?  If there’s an alternative approach to consider, it’d
> be useful to see them next to each other and then we could all
> contemplate which is better.

I didn't say I necessarily wanted it all in the HBA, just that I wanted
it all in the same spot.

I don't see a good way to push the filter back into the HBA, because it
may very well depend on the users being mapped (i.e. there may need to
be multiple lines in the map). Same for the query attributes. In fact
if I'm already using AD Kerberos or SSPI and I want to be able to
handle users coming from multiple domains, couldn't I be querying
entirely different servers depending on the username presented?

> > > When it comes to the question of "how to connect to an LDAP server for
> > > $whatever", it seems like it'd be nice to be able to configure that once
> > > and reuse that configuration.  Not sure I have a great suggestion for
> > > how to do that. The approach this patch takes of adding options to
> > > pg_hba for that, just like other options in pg_hba do, strikes me as
> > > pretty reasonable.
> > 
> > Right. That part seems less reasonable to me, given the current format
> > of the HBA. YMMV.
> 
> If the ldap connection info and filters and such could all exist in
> the hba, then perhaps a way to define those credentials in one place
> in the hba file and then use them on other lines would be
> possible..?  Seems like that would be easier than having them also in
> the ident or having the ident refer to something defined elsewhere. 
> 
> Consider in the hba having:
> 
> LDAPSERVER[ldap1]="ldaps://whatever other options go here"
> 
> Then later:
> 
> hostssl all all ::0/0 ldap ldapserver=ldap1 ldapmapserver=ldap1 map=myldapmap
> 
> Clearly more thought is needed due to different requirements for
> ldap authentication vs. the map, but still, the general idea being to
> have all of it in the hba and then a way to define ldap server
> configuration in the hba once and then reuse it.

You're open to the idea of bolting a new key/value grammar onto the HBA
parser, but not to the idea of brainstorming a different configuration
DSL?

> > I can take a look at the Cyrus requirements for the GSSAPI mechanism.
> > Might be tricky to add tests for it, though. Any others you're
> > interested in?
> 
> GSSAPI is the main one … I suppose client side certificates would be
> nice too if that’s possible.  I suspect some would like a way to have
> username/pw ldap credentials in some other file besides the hba, but
> that isn’t as interesting to me, at least.

Certificate auth is already there in the patch. See the end of
t/001_ldap.t.

Thanks,
--Jacob

Re: [PoC] Delegating pg_ident to a third party

From

Stephen Frost

Date:

10 January 2022, 20:09:32

Greetings,

* Jacob Champion (pchampion@vmware.com) wrote:
> On Tue, 2022-01-04 at 22:24 -0500, Stephen Frost wrote:
> > On Tue, Jan 4, 2022 at 18:56 Jacob Champion <pchampion@vmware.com> wrote:
> > >
> > > Could you talk more about the use cases for which having the "actual
> > > user" is better? From an auditing perspective I don't see why
> > > "authenticated as jacob@example.net, logged in as admin" is any worse
> > > than "logged in as jacob".
> >
> > The above case isn’t what we are talking about, as far as I
> > understand anyway. You’re suggesting “authenticated as
> > jacob@example.net, logged in as sales” where the user in the database
> > is “sales”.  Consider triggers which only have access to “sales”, or
> > a tool like pgaudit which only has access to “sales”.
>
> Okay. So an additional getter function in miscadmin.h, and surfacing
> that function to trigger languages, are needed to make authn_id more
> generally useful. Any other cases you can think of?

That would help but now you've got two different things that have to be
tracked, potentially, because for some people you might not want to use
their system auth'd-as ID.  I don't see that as a great solution and
instead as a workaround.  Yes, we should also do this but it's really an
argument for how to deal with such a setup, not a justification for
going down this route.

> > > I was responding more to your statement that "Being able to have roles
> > > and memberships automatically created is much more the direction that
> > > I'd say we should be going in". It's not that one-role-per-user is
> > > inherently wasteful, but forcing role proliferation where it's not
> > > needed is. If all users have the same set of permissions, there doesn't
> > > need to be more than one role. But see below.
> >
> > Just saying it’s wasteful isn’t actually saying what is wasteful about it.
>
> Well, I felt like it was irrelevant; you've already said you have no
> intention to force one-user-per-role.

Forcing one-user-per-role would be breaking things we already support
so, no, I certainly don't have any intention of requiring such a change.
That said, I do feel it's useful to have these discussions.

> But to elaborate: *forcing* one-user-per-role is wasteful, because if I
> have a thousand employees, and I want to give all my employees access
> to a guest role in the database, then I have to administer a thousand
> roles: maintaining them through dump/restores and pg_upgrades, auditing
> them to figure out why Bob in Accounting somehow got a different
> privilege GRANT than the rest of the users, adding new accounts,
> purging old ones, maintaining the inevitable scripts that will result.

pg_upgrade just handles it, no?  pg_dumpall -g does too.  Having to deal
with roles in general is a pain but the number of them isn't necessarily
an issue.  A guest role which doesn't have any auditing requirements
might be a decent use-case for what you're talking about here but I
don't know that we'd implement this for just that case.  Part of this
discussion was specifically about addressing the other challenges- like
having automation around the account addition/removal and sync'ing role
membership too.  As for auditing privileges, that should be done
regardless and the case you outline isn't somehow different from others
(the same could be as easily said for how the 'guest' account got access
to whatever it did).

> If none of the users need to be "special" in any way, that's all wasted
> overhead. (If they do actually need to be special, then at least some
> of that overhead becomes necessary. Otherwise it's waste.) You may be
> able to mitigate the cost of the waste, or absorb the mitigations into
> Postgres so that the user can't see the waste, or decide that the waste
> is not costly enough to care about. It's still waste.

Except the amount of 'wasted' overhead being claimed here seems to be
hardly any.  The biggest complaint levied at this seems to really be
just the issues around the load on the ldap systems from having to deal
with the frequent sync queries, and that's largely a solvable issue in
the majority of environments out there today.

> > > > I'm also not suggesting that we make everyone do the same
> > > > thing, indeed, later on I was supportive of having an external system
> > > > provide the mapping.  Here, I'm just making the point that we should
> > > > also be looking at automatic role/membership creation.
> > >
> > > Gotcha. Agreed; that would open up the ability to administer role
> > > privileges externally too, which would be cool. That could be used in
> > > tandem with something like this patchset.
> >
> > Not sure exactly what you’re referring to here by “administer role
> > privileges externally too”..?  Curious to hear what you are imagining
> > specifically.
>
> Just that it would be nice to centrally provision role GRANTs as well
> as role membership, that's all. No specifics in mind, and I'm not even
> sure if LDAP would be a helpful place to put that sort of config.

GRANTs on objects, you mean?  I agree, that would be interesting to
consider though it would involve custom entries in the LDAP directory,
no?  Role membership would be able to be sync'd as part of group
membership and that was something I was thinking would be handled as
part of this in a similar manner to what the 3rd party solutions provide
today using the cron-based approach.

> > I’d also point out though that having to do an ldap lookup on every
> > login to PG is *already* an issue in some environments, having to do
> > multiple amplifies that.
>
> You can't use the LDAP auth method with this patch yet, so this concern
> is based on code that doesn't exist. It's entirely possible that you
> could do the role query as part of the first bound connection. If that
> proves unworkable, then yes, I agree that it's a concern.

Perhaps it could be done as part of the same connection but that then
has an impact on what the configuration of the ident LDAP lookup would
be, no?  That seems like an important thing to flesh out before we move
too much farther with this patch, to make sure that, if we want that to
work, that there's a clear way to configure it to avoid the double LDAP
connection.  I'm guessing you already have an idea how that'll work
though..?

> > Not to mention that when the ldap servers can’t be reached for some
> > reason, no one can log into the database and that’s rather
> > unfortunate too.
>
> Assuming you have no caches, then yes. That might be a pretty good
> argument for allowing ldapmap and map to be used together, actually, so
> that you can have some critical users who can always log in as
> "themselves" or "admin" or etc. Or maybe it's an argument for allowing
> HBA to handle fallback methods of authentication.

Ok, so now we're talking about a cache that needs to be implemented
which will ... store the user's password for LDAP authentication?  Or
what the mapping is for various LDAP IDs to PG roles?  And how will that
cache be managed?  Would it be handled by dump/restore?  What about
pg_upgrade?  How will entries in the cache be removed?

And mainly- how is this different from just having all the roles in PG
to begin with..?

> Luckily I think it's pretty easy to communicate to LDAP users that if
> *all* your login infrastructure goes down, you will no longer be able
> to log in. They're probably used to that idea, if they haven't set up
> any availability infra.

Except that most of the rest of the infrastructure may continue to work
just fine except for logging in- which is something most folks only do
once a day.  That is, why is the SQL Server system still happily
accepting connections while the AD is being rebooted?  Or why can I
still log into the company website even though AD is down, but I can't
get into PG?  Not everything in an environment is tied to LDAP being up
and running all the time, so it's not nearly so cut and dried in many,
many cases.

> >  These are, of course, arguments for moving away from methods that
> > require checking with some other system synchronously during login-
> > which is another reason why it’s better to have the authentication
> > credentials easily map to the PG role, without the need for external
> > checks at login time.  That’s done with today’s pg_ident, but this
> > patch would change that.
>
> There are arguments for moving towards synchronous checks as well.
> Central revocation of credentials (in timeframes shorter than ticket
> expiration) is what comes to mind. Revocation is hard and usually
> conflicts with the desire for availability.

Revocation in less time than ticket lifetime and everything falling over
due to the AD being restarted are very different.  The approaches being
discussed are all much shorter than ticket lifetime and so that's hardly
an appropriate comparison to be making.  I didn't suggest that waiting
for ticket expiration would be appropriate when it comes to syncing
accounts between AD and PG or that it would be appropriate for
revocation.  Regarding the caching proposed above- in such a case,
clearly, revocation wouldn't be synchronous either.  Certainly in the
cases today where cronjobs are being used to perform the sync,
revocation also isn't synchronous (unless also using LDAP for
authentication, of course, though that wouldn't do anything for existing
sessions, while removing role memberships does...).

> What's "better" for me or you is not necessarily "better" overall; it's
> all tradeoffs, all the time.

Sure.

> > Consider the approach I continue to advocate- GSSAPI based
> > authentication, where a user only needs to contact the Kerberos
> > server perhaps every 8 hours or so for an updated ticket but
> > otherwise can authorize directly to PG using their existing ticket
> > and credentials, where their role was previously created and their
> > memberships already exist thanks to a background worker whose job it
> > is to handle that and which deals with transient network failures or
> > other issues. In this world, most logins to PG don’t require any
> > other system to be involved besides the client, the PG server, and
> > the networking between them; perhaps DNS if things aren’t cached on
> > the client.
> >
> > On the other hand, to use ldap authentication (which also happens to
> > be demonstrably insecure without any reasonable way to fix that),
> > with an ldap mapping setup, requires two logins to an ldap server
> > every single time a user logs into PG and if the ldap environment is
> > offline or overloaded for whatever reason, the login fails or takes
> > an excessively long amount of time.
>
> The two systems have different architectures, and different security
> properties, and you have me at a disadvantage in that you can see the
> experimental code I have written and I cannot see the hypothetical code
> in your head.

I've barely glanced at the code you've written and it largely hasn't
been driving my comments on this thread- merely the understanding of how
it works.  Further, you've stated that you're already familiar with
systems that sync between LDAP and PG and the vast majority of this
discussion has been about that distinction- if we push the mappings into
PG as roles, or if we execute a query out to LDAP on connection to check
the mapping.  The above references to tickets and GSSAPI/Kerberos are
all from existing code as well.  The only reference to hypothetical code
is the idea of a background or other worker that subscribes to changes
in LDAP and implements those changes in PG instead of having something
cron-based do it, but that doesn't really change anything about the
architectural question of whether we cache (either with an explicit cache,
as you've suggested adding above, though there is no code for that today,
or just by using PG's existing role/membership system) or call out to
LDAP for every login.

> It sounds like I'm more concerned with the ability to have an online
> central source of truth for access control, accepting that denial of
> service may cause the system to fail shut; and you're more concerned
> with availability in the face of network failure, accepting that denial
> of service may cause the system to fail open. I think that's a design
> decision that belongs to an end user.

There is more to it than just failing shut/closed.  Part of the argument
being used to drive this change was that it would help to reduce the
load on the LDAP servers because there wouldn't be a need to run large
queries on them frequently out of cron to keep PG's understanding of
the roles and their mappings in sync with what's in LDAP.

> The distributed availability problems you're describing are, in my
> experience, typically solved by caching. With your not-yet-written
> solution, the caching is built into Postgres, and it's on all of the
> time, but may (see below) only actually perform well with Active
> Directory. With my solution, any caching is optional, because it has to
> be implemented/maintained external to Postgres, but because it's just
> generic "LDAP caching" then it should be broadly compatible and we
> don't have to maintain it. I can see arguments for and against both
> approaches.

I'm a bit confused by this- either you're referring to the cache
being PG's existing system, which certainly has already been written,
and has existed since it was committed and released as part of 8.1, and
is, indeed, on all the time ... or you're talking about something else
which hasn't been written and could therefore be anything, though I'm
generally against the idea of having an independent cache for this, as
described above.

As for optional caching with some generic LDAP caching system, that
strikes me as clearly even worse than building something into PG for
this as it requires maintaining yet another system in order to have a
reasonably well working system and that isn't good.  While it's good
that we have pgbouncer, it'd certainly be better if we didn't need it
and it's got a bunch of downsides to it.  I strongly suspect the same
would be true of some external generic "LDAP caching" system as is
referred to above, though as there isn't anything to look at, I can't
say for sure.

Regarding 'performing well', while lots of little queries may be better
in some cases than less frequent larger queries, that's really going to
depend on the frequency of each and therefore really be rather dependent
on the environment and usage.  In any case, however, being able to
leverage change notifications instead of fully resyncing will definitely
be better.  At the same time, however, if we have the external generic
LDAP caching system that's being claimed ... why wouldn't we simply use
that with the cron-based system today to offload those from the main
LDAP systems?

> > > I guess I'd have to see an implementation -- I was under the impression
> > > that persistent search wasn't widely implemented?
> >
> > I mean … let’s talk about the one that really matters here:
> >
> > https://docs.microsoft.com/en-us/windows/win32/ad/change-notifications-in-active-directory-domain-services
>
> That would certainly be a useful thing to implement for deployments
> that can use it. But my personal interest in writing "LDAP" code that
> only works with AD is nil, at least in the short term.
>
> (The continued attitude that Microsoft Active Directory is "the one
> that really matters" is really frustrating. I have users on LDAP
> without Active Directory. Postgres tests are written against OpenLDAP.)

What would you consider the important directories to worry about beyond
AD?  I don't consider the PG testing framework to be particularly
indicative of what enterprises are actually running.

> > OpenLDAP has an audit log system which can be used though it’s
> > certainly not as nice and would require code specific to it.
> >
> > This talks a bit about other directories:
> >
> > https://docs.informatica.com/data-integration/powerexchange-adapters-for-powercenter/10-1/powerexchange-for-ldap-user-guide-for-powercenter/ldap-sessions/configuring-change-data-capture/methods-for-tracking-changes-in-different-directories.html
> >
> > I do wish they all supported it cleanly in the same way.
>
> Okay. But the answer to "is persistent search widely implemented?"
> appears to be "No."

I'm curious as to how the large environments that you've worked with
have generally solved this issue.  Is there a generic LDAP caching
system that's been used?  What?

> > > The current patchset here has pieces of what is usually contained in
> > > HBA (the LDAP host/port/base/filter/etc.) effectively moved into
> > > pg_ident, while other pieces (TLS settings) remain in the HBA and the
> > > environment. That's what I'm referring to. If that is workable for you
> > > in the end, that's fine, but for me it'd be much easier to maintain if
> > > the mapping query and the LDAP connection settings for that mapping
> > > query were next to each other.
> >
> > I can agree with the point that it would be nicer to have the ldap
> > host/port/base/filter be in the hba instead, if there is a way to
> > accomplish that reasonably. Did you have a suggestion in mind for how
> > to do that..?  If there’s an alternative approach to consider, it’d
> > be useful to see them next to each other and then we could all
> > contemplate which is better.
>
> I didn't say I necessarily wanted it all in the HBA, just that I wanted
> it all in the same spot.
>
> I don't see a good way to push the filter back into the HBA, because it
> may very well depend on the users being mapped (i.e. there may need to
> be multiple lines in the map). Same for the query attributes. In fact
> if I'm already using AD Kerberos or SSPI and I want to be able to
> handle users coming from multiple domains, couldn't I be querying
> entirely different servers depending on the username presented?

Yeah, that's a good point, and one that argues for putting everything into
the ident.  In such a situation as you describe above, we wouldn't
actually have any LDAP configuration in the HBA and I'm entirely fine
with that- we'd just have it all in ident.  I don't see how you'd make
that work with, as you suggest above, LDAP-based authentication and the
idea of having only one connection be used for the LDAP-based auth and
the mapping lookup, but I'm also not generally worried about LDAP-based
auth and would rather we rip it out entirely. :)

As such, I'd say that you've largely convinced me that we should just
move all of the LDAP configuration for the lookup into the ident and
discourage people from using LDAP-based authentication and from putting
LDAP configuration into the hba.  I'm still a fan of the general idea of
having a way to configure such ldap parameters in one place in whatever
file they go into and then re-using that multiple times on the general
assumption that folks are likely to need to reference a particular LDAP
configuration more than once, wherever it's configured.

> > > > When it comes to the question of "how to connect to an LDAP server for
> > > > $whatever", it seems like it'd be nice to be able to configure that once
> > > > and reuse that configuration.  Not sure I have a great suggestion for
> > > > how to do that. The approach this patch takes of adding options to
> > > > pg_hba for that, just like other options in pg_hba do, strikes me as
> > > > pretty reasonable.
> > >
> > > Right. That part seems less reasonable to me, given the current format
> > > of the HBA. YMMV.
> >
> > If the ldap connection info and filters and such could all exist in
> > the hba, then perhaps a way to define those credentials in one place
> > in the hba file and then use them on other lines would be
> > possible..?  Seems like that would be easier than having them also in
> > the ident or having the ident refer to something defined elsewhere.
> >
> > Consider in the hba having:
> >
> > LDAPSERVER[ldap1]="ldaps://whatever other options go here"
> >
> > Then later:
> >
> > hostssl all all ::0/0 ldap ldapserver=ldap1 ldapmapserver=ldap1 map=myldapmap
> >
> > Clearly more thought is needed due to different requirements for
> > ldap authentication vs. the map, but still, the general idea being to
> > have all of it in the hba and then a way to define ldap server
> > configuration in the hba once and then reused.
>
> You're open to the idea of bolting a new key/value grammar onto the HBA
> parser, but not to the idea of brainstorming a different configuration
> DSL?

Short answer- yes (or, as mentioned just above, into the ident file vs.
the hba).  I'd rather we build on the existing configuration systems
that we have rather than invent something new that will then have to
work with the others, as I don't see it as likely that we could just
replace the existing ones with something new and make everyone
change.  Having yet another one strikes me as worse than making
improvements to the existing ones (be those 'bolted on' or otherwise).
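
To make the "define it once, reuse it by name" idea a little more concrete, here is a rough Python sketch of how such named definitions could be resolved. It is only an illustration of the strawman quoted above: the `LDAPSERVER[name]="..."` line and the `ldapmapserver` option do not exist in Postgres today, the sketch assumes plain double quotes, and the function name and URL are made up for the example.

```python
import re

# Hypothetical definition line: LDAPSERVER[name]="connection settings"
_DEFINITION = re.compile(r'^LDAPSERVER\[(\w+)\]="([^"]*)"$')


def resolve_ldap_servers(lines):
    """Return the key=value options of each rule line, with server names expanded.

    Definition lines are collected into a table; any later rule that says
    ldapserver=<name> or ldapmapserver=<name> gets the named settings
    substituted. An undefined name raises KeyError.
    """
    servers = {}
    resolved = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        match = _DEFINITION.match(line)
        if match:
            servers[match.group(1)] = match.group(2)
            continue
        # Only the key=value fields matter for this sketch.
        options = dict(f.split("=", 1) for f in line.split() if "=" in f)
        for key in ("ldapserver", "ldapmapserver"):
            if key in options:
                options[key] = servers[options[key]]
        resolved.append(options)
    return resolved


rules = resolve_ldap_servers([
    'LDAPSERVER[ldap1]="ldaps://ldap.example.com"',
    "hostssl all all ::0/0 ldap ldapserver=ldap1 ldapmapserver=ldap1 map=myldapmap",
])
```

The same lookup table would work whether the definitions end up in the hba or in the ident file; the sketch takes no position on that.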

Thanks,

Stephen

Re: [PoC] Delegating pg_ident to a third party

From

Jacob Champion

Date:

02 February 2022, 19:45:04

On Mon, 2022-01-10 at 15:09 -0500, Stephen Frost wrote:
> Greetings,

Sorry for the delay, the last few weeks have been insane.

> * Jacob Champion (pchampion@vmware.com) wrote:
> > On Tue, 2022-01-04 at 22:24 -0500, Stephen Frost wrote:
> > > On Tue, Jan 4, 2022 at 18:56 Jacob Champion <pchampion@vmware.com> wrote:
> > > > Could you talk more about the use cases for which having the "actual
> > > > user" is better? From an auditing perspective I don't see why
> > > > "authenticated as jacob@example.net, logged in as admin" is any worse
> > > > than "logged in as jacob".
> > > 
> > > The above case isn’t what we are talking about, as far as I
> > > understand anyway. You’re suggesting “authenticated as 
> > > jacob@example.net, logged in as sales” where the user in the database
> > > is “sales”.  Consider triggers which only have access to “sales”, or
> > > a tool like pgaudit which only has access to “sales”.
> > 
> > Okay. So an additional getter function in miscadmin.h, and surfacing
> > that function to trigger languages, are needed to make authn_id more
> > generally useful. Any other cases you can think of?
> 
> That would help but now you've got two different things that have to be
> tracked, potentially, because for some people you might not want to use
> their system auth'd-as ID.  I don't see that as a great solution and
> instead as a workaround.

There's nothing to be worked around. If you have a user mapping set up
using the features that exist today, and you want to audit who logged
in at some point in the past, then you need to log both the
authenticated ID and the authorized role. There's no getting around
that. It's not enough to say "just check the configuration" because the
config can change over time.
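
As a minimal illustration of that point (a Python sketch with hypothetical names, not anything in the patch or in pgaudit): if each login writes down both identities at the time it happens, the question "who was using this role?" can be answered from the log alone, no matter how the map is edited later.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LoginAuditRecord:
    """One audit entry; the field names are made up for this sketch."""
    when: datetime
    authn_id: str  # identity proven by the auth method, e.g. a Kerberos principal
    role: str      # database role the session was authorized to use


def record_login(log, authn_id, role):
    """Append both identities at login time."""
    entry = LoginAuditRecord(datetime.now(timezone.utc), authn_id, role)
    log.append(entry)
    return entry


def who_used_role(log, role):
    """Answer 'who logged in as this role?' without consulting any config."""
    return {entry.authn_id for entry in log if entry.role == role}


audit_log = []
record_login(audit_log, "jacob@example.net", "admin")
record_login(audit_log, "bob@example.net", "guest")
# The user map may change after this point; the log still holds the answer.
admins = who_used_role(audit_log, "admin")
```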

> > But to elaborate: *forcing* one-user-per-role is wasteful, because if I
> > have a thousand employees, and I want to give all my employees access
> > to a guest role in the database, then I have to administer a thousand
> > roles: maintaining them through dump/restores and pg_upgrades, auditing
> > them to figure out why Bob in Accounting somehow got a different
> > privilege GRANT than the rest of the users, adding new accounts,
> > purging old ones, maintaining the inevitable scripts that will result.
> 
> pg_upgrade just handles it, no?  pg_dumpall -g does too.  Having to deal
> with roles in general is a pain but the number of them isn't necessarily
> an issue.  A guest role which doesn't have any auditing requirements
> might be a decent use-case for what you're talking about here but I
> don't know that we'd implement this for just that case.  Part of this
> discussion was specifically about addressing the other challenges- like
> having automation around the account addition/removal and sync'ing role
> membership too.  As for auditing privileges, that should be done
> regardless and the case you outline isn't somehow different from others
> (the same could be as easily said for how the 'guest' account got access
> to whatever it did).

I think there's a difference between auditing a small fixed number of
roles and auditing many thousands of them that change on a weekly or
daily basis. I'd rather maintain the former, given the choice. It's
harder for things to slip through the cracks with fewer moving pieces.

> > If none of the users need to be "special" in any way, that's all wasted
> > overhead. (If they do actually need to be special, then at least some
> > of that overhead becomes necessary. Otherwise it's waste.) You may be
> > able to mitigate the cost of the waste, or absorb the mitigations into
> > Postgres so that the user can't see the waste, or decide that the waste
> > is not costly enough to care about. It's still waste.
> 
> Except the amount of 'wasted' overhead being claimed here seems to be
> hardly any.  The biggest complaint levied at this seems to really be
> just the issues around the load on the ldap systems from having to deal
> with the frequent sync queries, and that's largely a solvable issue in
> the majority of environments out there today.

As long as we're in agreement that there is waste, I don't think I'm
going to convince you about the cost. It's tangential anyway if you're
not going to remove many-to-many maps.

> > > Not sure exactly what you’re referring to here by “administer role
> > > privileges externally too”..?  Curious to hear what you are imagining
> > > specifically.
> > 
> > Just that it would be nice to centrally provision role GRANTs as well
> > as role membership, that's all. No specifics in mind, and I'm not even
> > sure if LDAP would be a helpful place to put that sort of config.
> 
> GRANTs on objects, you mean?  I agree, that would be interesting to
> consider though it would involve custom entries in the LDAP directory,
> no?  Role membership would be able to be sync'd as part of group
> membership and that was something I was thinking would be handled as
> part of this in a similar manner to what the 3rd party solutions provide
> today using the cron-based approach.

Agreed. I haven't put too much thought into those use cases yet.

> > > I’d also point out though that having to do an ldap lookup on every
> > > login to PG is *already* an issue in some environments, having to do
> > > multiple amplifies that.
> > 
> > You can't use the LDAP auth method with this patch yet, so this concern
> > is based on code that doesn't exist. It's entirely possible that you
> > could do the role query as part of the first bound connection. If that
> > proves unworkable, then yes, I agree that it's a concern.
> 
> Perhaps it could be done as part of the same connection but that then
> has an impact on what the configuration of the ident LDAP lookup would
> be, no?  That seems like an important thing to flesh out before we move
> too much farther with this patch, to make sure that, if we want that to
> work, that there's a clear way to configure it to avoid the double LDAP
> connection.  I'm guessing you already have an idea how that'll work
> though..?

It's only relevant if the other thread (which you've said you're
ignoring) progresses. The patch discussed here does not touch that code
path.

But yes, I have a general idea that as long as a user can look up (but
not modify) their own role information, this should work just fine.

> > > Not to mention that when the ldap servers can’t be reached for some
> > > reason, no one can log into the database and that’s rather
> > > unfortunate too.
> > 
> > Assuming you have no caches, then yes. That might be a pretty good
> > argument for allowing ldapmap and map to be used together, actually, so
> > that you can have some critical users who can always log in as
> > "themselves" or "admin" or etc. Or maybe it's an argument for allowing
> > HBA to handle fallback methods of authentication.
> 
> Ok, so now we're talking about a cache that needs to be implemented
> which will ... store the user's password for LDAP authentication?  Or
> what the mapping is for various LDAP IDs to PG roles?  And how will that
> cache be managed?  Would it be handled by dump/restore?  What about
> pg_upgrade?  How will entries in the cache be removed?

You keep pulling the authentication discussion, which this patch does
not touch on purpose, into this discussion about authorization. The
authz info requested by this patch seems like it can be cached.

People currently using LDAP authentication (which again, this patch
cannot use because there is no LDAP user mapping) either have existing
HA infrastructure that they're happy with, or they don't. This patch
shouldn't make that situation any better or worse -- *if* the lookup
can be done on one connection.

> And mainly- how is this different from just having all the roles in PG
> to begin with..?

This comment seems counterproductive. One major difference is that
Postgres doesn't have to duplicate the authentication info that some
other system already holds.

> > Luckily I think it's pretty easy to communicate to LDAP users that if
> > *all* your login infrastructure goes down, you will no longer be able
> > to log in. They're probably used to that idea, if they haven't set up
> > any availability infra.
> 
> Except that most of the rest of the infrastructure may continue to work
> just fine except for logging in- which is something most folks only do
> once a day.  That is, why is the SQL Server system still happily
> accepting connections while the AD is being rebooted?  Or why can I
> still log into the company website even though AD is down, but I can't
> get into PG?  Not everything in an environment is tied to LDAP being up
> and running all the time, so it's not nearly so cut and dried in many,
> many cases.

Whatever LDAP users currently deal with, this patch doesn't change
their experience, right? It seems like it's a lot easier to add caching
to a synchronous check, to make it asynchronous and a little more
fault-tolerant, than it is to do the reverse.
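
To sketch what I mean (Python, purely illustrative; the class, the exception, and the lookup callable are all hypothetical and none of this is in the patch): a synchronous lookup can be wrapped so that a recent answer is reused when the directory is unreachable, while anything older than a chosen age still fails shut.

```python
import time


class LookupUnavailable(Exception):
    """Raised by the (hypothetical) directory lookup when it cannot be reached."""


class CachedRoleLookup:
    """Wrap a synchronous role lookup with a bounded-staleness fallback."""

    def __init__(self, lookup, max_age_seconds, clock=time.monotonic):
        self._lookup = lookup          # callable: authn_id -> iterable of roles
        self._max_age = max_age_seconds
        self._clock = clock
        self._cache = {}               # authn_id -> (timestamp, frozenset of roles)

    def allowed_roles(self, authn_id):
        now = self._clock()
        try:
            roles = frozenset(self._lookup(authn_id))
        except LookupUnavailable:
            cached = self._cache.get(authn_id)
            if cached is not None and now - cached[0] <= self._max_age:
                return cached[1]  # serve a recent answer during the outage
            raise                 # nothing fresh enough: fail shut
        # The directory answered, so its answer always wins and refreshes the cache.
        self._cache[authn_id] = (now, roles)
        return roles
```

The maximum age is the knob here: it bounds how long a revocation can go unnoticed during an outage, and a very small value gets you back to something close to a purely synchronous check.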

> > >  These are, of course, arguments for moving away from methods that
> > > require checking with some other system synchronously during login-
> > > which is another reason why it’s better to have the authentication
> > > credentials easily map to the PG role, without the need for external
> > > checks at login time.  That’s done with today’s pg_ident, but this
> > > patch would change that.
> > 
> > There are arguments for moving towards synchronous checks as well.
> > Central revocation of credentials (in timeframes shorter than ticket
> > expiration) is what comes to mind. Revocation is hard and usually
> > conflicts with the desire for availability.
> 
> Revocation in less time than ticket lifetime and everything falling over
> due to the AD being restarted are very different.  The approaches being
> discussed are all much shorter than ticket lifetime and so that's hardly
> an appropriate comparison to be making.  I didn't suggest that waiting
> for ticket expiration would be appropriate when it comes to syncing
> accounts between AD and PG or that it would be appropriate for
> revocation.  Regarding the caching proposed above- in such a case,
> clearly, revocation wouldn't be synchronous either.  Certainly in the
> cases today where cronjobs are being used to perform the sync,
> revocation also isn't synchronous (unless also using LDAP for
> authentication, of course, though that wouldn't do anything for existing
> sessions, while removing role memberships does...).

Sure. Again: tradeoffs.

> > The two systems have different architectures, and different security
> > properties, and you have me at a disadvantage in that you can see the
> > experimental code I have written and I cannot see the hypothetical code
> > in your head.
> 
> I've barely glanced at the code you've written <snip>

This is frustrating to read. I think we're talking past each other,
because I'm trying to talk about this patch and you're talking about
other things.

> The only reference to hypothetical code
> is the idea of a background or other worker that subscribes to changes
> in LDAP and implements those changes in PG instead of having something
> cron-based do it

Yes. That's what I was referring to.

> , but that doesn't really change anything about the
> architectural question of whether we cache (either with an explicit cache,
> as you've suggested adding above, though there is no code for that today,

LDAP caches exist... I'm not suggesting we implement a Postgres-branded 
LDAP cache.

> or just by using PG's existing role/membership system) or call out to
> LDAP for every login.
> 
> > It sounds like I'm more concerned with the ability to have an online
> > central source of truth for access control, accepting that denial of
> > service may cause the system to fail shut; and you're more concerned
> > with availability in the face of network failure, accepting that denial
> > of service may cause the system to fail open. I think that's a design
> > decision that belongs to an end user.
> 
> There is more to it than just failing shut/closed.  Part of the argument
> being used to drive this change was that it would help to reduce the
> load on the LDAP servers because there wouldn't be a need to run large
> queries on them frequently out of cron to keep PG's understanding of
> the roles and their mappings in sync with what's in LDAP.

Yes.

> > The distributed availability problems you're describing are, in my
> > experience, typically solved by caching. With your not-yet-written
> > solution, the caching is built into Postgres, and it's on all of the
> > time, but may (see below) only actually perform well with Active
> > Directory. With my solution, any caching is optional, because it has to
> > be implemented/maintained external to Postgres, but because it's just
> > generic "LDAP caching" then it should be broadly compatible and we
> > don't have to maintain it. I can see arguments for and against both
> > approaches.
> 
> I'm a bit confused by this- either you're referring to the cache
> being PG's existing system, which certainly has already been written,
> and has existed since it was committed and released as part of 8.1, and
> is, indeed, on all the time ... or you're talking about something else
> which hasn't been written and could therefore be anything, though I'm
> generally against the idea of having an independent cache for this, as
> described above.

You just proposed an internal caching system, immediately upthread:
"I'd go a step further and suggest that the way to do this is with a
background worker that's started up and connects to an LDAP
infrastructure and listens for changes, allowing the system to pick up
on new roles/memberships as soon as they're created in the LDAP
environment." That proposal is what I was referring to by "your not-
yet-written solution".

> As for optional caching with some generic LDAP caching system, that
> strikes me as clearly even worse than building something into PG for
> this as it requires maintaining yet another system in order to have a
> reasonably well working system and that isn't good.

A choice for the end user. If they don't want to deal with LDAP
infrastructure, they don't have to use it.

> While it's good
> that we have pgbouncer, it'd certainly be better if we didn't need it
> and it's got a bunch of downsides to it.  I strongly suspect the same
> would be true of some external generic "LDAP caching" system as is
> referred to above, though as there isn't anything to look at, I can't
> say for sure.

We can take a look at OpenLDAP's proxy caching for some info. That
won't be perfectly representative but I don't think there's "nothing to
look at".

> Regarding 'performing well', while lots of little queries may be better
> in some cases than less frequent larger queries, that's really going to
> depend on the frequency of each and therefore really be rather dependent
> on the environment and usage.  In any case, however, being able to
> leverage change notifications instead of fully resyncing will definitely
> be better.  At the same time, however, if we have the external generic
> LDAP caching system that's being claimed ... why wouldn't we simply use
> that with the cron-based system today to offload those from the main
> LDAP systems?

I think there's an architectural difference between a proxy cache that
is set up to reduce load on a central server, and one that is set up to
handle network partitions while ensuring liveness. To be fair, I don't
know which use cases existing solutions can handle. But those two don't
seem to be the same to me.

I know that I have users who are okay with the query load from logins,
but not with the query load of their role-sync scripts. That's a good
enough datapoint for me.

> > That would certainly be a useful thing to implement for deployments
> > that can use it. But my personal interest in writing "LDAP" code that
> > only works with AD is nil, at least in the short term.
> > 
> > (The continued attitude that Microsoft Active Directory is "the one
> > that really matters" is really frustrating. I have users on LDAP
> > without Active Directory. Postgres tests are written against OpenLDAP.)
> 
> What would you consider the important directories to worry about beyond
> AD?  I don't consider the PG testing framework to be particularly
> indicative of what enterprises are actually running.

I have end users running
- NetIQ/Novell eDirectory
- Oracle Directory Server
- Red Hat IdM
in addition to AD.

> > > OpenLDAP has an audit log system which can be used though it’s
> > > certainly not as nice and would require code specific to it.
> > > 
> > > This talks a bit about other directories: 
> > >
> > > https://docs.informatica.com/data-integration/powerexchange-adapters-for-powercenter/10-1/powerexchange-for-ldap-user-guide-for-powercenter/ldap-sessions/configuring-change-data-capture/methods-for-tracking-changes-in-different-directories.html
> > > 
> > > I do wish they all supported it cleanly in the same way.
> > 
> > Okay. But the answer to "is persistent search widely implemented?"
> > appears to be "No."
> 
> I'm curious as to how the large environments that you've worked with
> have generally solved this issue.  Is there a generic LDAP caching
> system that's been used?  What?

They haven't solved the issue; that's why I'm poking at it. Several
users have to cobble together scripts because of poor interaction with
their existing LDAP deployments (or complete lack of support, in the
case of pgbouncer).

> > I don't see a good way to push the filter back into the HBA, because it
> > may very well depend on the users being mapped (i.e. there may need to
> > be multiple lines in the map). Same for the query attributes. In fact
> > if I'm already using AD Kerberos or SSPI and I want to be able to
> > handle users coming from multiple domains, couldn't I be querying
> > entirely different servers depending on the username presented?
> 
> Yeah, that's a good point, and one that argues for putting everything into
> the ident.  In such a situation as you describe above, we wouldn't
> actually have any LDAP configuration in the HBA and I'm entirely fine
> with that- we'd just have it all in ident.  I don't see how you'd make
> that work with, as you suggest above, LDAP-based authentication and the
> idea of having only one connection be used for the LDAP-based auth and
> the mapping lookup, but I'm also not generally worried about LDAP-based
> auth and would rather we rip it out entirely. :)
> 
> As such, I'd say that you've largely convinced me that we should just
> move all of the LDAP configuration for the lookup into the ident and
> discourage people from using LDAP-based authentication and from putting
> LDAP configuration into the hba. 

I'm willing to bet that Postgres dropping support will not result in my
end users abandoning their LDAP infrastructure. Either I and others in
my position will need to maintain forks, or my end users will find a
different database.

If there's widespread agreement that the project doesn't want to
maintain an LDAP auth method -- so far I think you've provided the only
such opinion, that I've seen at least -- that might be a good argument
for introducing pluggable auth so that the community can maintain the
methods that are important to them.

> I'm still a fan of the general idea of
> having a way to configure such ldap parameters in one place in whatever
> file they go into and then re-using that multiple times on the general
> assumption that folks are likely to need to reference a particular LDAP
> configuration more than once, wherever it's configured.

Sure.

> > You're open to the idea of bolting a new key/value grammar onto the HBA
> > parser, but not to the idea of brainstorming a different configuration
> > DSL?
> 
> Short answer- yes (or, as mentioned just above, into the ident file vs.
> the hba).  I'd rather we build on the existing configuration systems
> that we have rather than invent something new that will then have to
> work with the others, as I don't see it as likely that we could just
> replace the existing ones with something new and make everyone
> change.  Having yet another one strikes me as worse than making
> improvements to the existing ones (be those 'bolted on' or otherwise).

I think the key to maintaining incrementally built systems is that at
some point, eventually, you refactor the thing. There was a brief
question on what that might look like, from Peter. You stepped in with
some very strong opinions.

--Jacob