Non-superuser subscription owners

From
Mark Dilger
Date:
These patches have been split off the now deprecated monolithic "Delegating superuser tasks to new security roles"
thread at [1].

The purpose of these patches is to allow non-superuser subscription owners without risk of them overwriting tables they
lack privilege to write directly. This both allows subscriptions to be managed by non-superusers, and protects servers
with subscriptions from malicious activity on the publisher side.



[1] https://www.postgresql.org/message-id/flat/F9408A5A-B20B-42D2-9E7F-49CD3D1547BC%40enterprisedb.com
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Non-superuser subscription owners

From
Ronan Dunklau
Date:
On Wednesday, 20 October 2021, 20:40:39 CEST, Mark Dilger wrote:
> These patches have been split off the now deprecated monolithic "Delegating
> superuser tasks to new security roles" thread at [1].
>
> The purpose of these patches is to allow non-superuser subscription owners
> without risk of them overwriting tables they lack privilege to write
> directly. This both allows subscriptions to be managed by non-superusers,
> and protects servers with subscriptions from malicious activity on the
> publisher side.

Thank you Mark for splitting this.

This patch looks good to me, and provides both better security (by closing the
"dropping superuser role" loophole) and useful features.


--
Ronan Dunklau





Re: Non-superuser subscription owners

From
Andrew Dunstan
Date:
On 10/20/21 14:40, Mark Dilger wrote:
> These patches have been split off the now deprecated monolithic "Delegating superuser tasks to new security roles"
> thread at [1].
>
> The purpose of these patches is to allow non-superuser subscription owners without risk of them overwriting tables
> they lack privilege to write directly. This both allows subscriptions to be managed by non-superusers, and protects
> servers with subscriptions from malicious activity on the publisher side.
>
> [1] https://www.postgresql.org/message-id/flat/F9408A5A-B20B-42D2-9E7F-49CD3D1547BC%40enterprisedb.com


These patches look good on their face. The code changes are very
straightforward.


w.r.t. this:

+   On the subscriber, the subscription owner's privileges are re-checked for
+   each change record when applied, but beware that a change of ownership for a
+   subscription may not be noticed immediately by the replication workers.
+   Changes made on the publisher may be applied on the subscriber as
+   the old owner.  In such cases, the old owner's privileges will be the ones
+   that matter.  Worse still, it may be hard to predict when replication
+   workers will notice the new ownership.  Subscriptions created disabled and
+   only enabled after ownership has been changed will not be subject to this
+   race condition.


maybe we should disable the subscription before making such a change and
then re-enable it?


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 1, 2021, at 7:18 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> w.r.t. this:
>
> +   On the subscriber, the subscription owner's privileges are re-checked for
> +   each change record when applied, but beware that a change of ownership for a
> +   subscription may not be noticed immediately by the replication workers.
> +   Changes made on the publisher may be applied on the subscriber as
> +   the old owner.  In such cases, the old owner's privileges will be the ones
> +   that matter.  Worse still, it may be hard to predict when replication
> +   workers will notice the new ownership.  Subscriptions created disabled and
> +   only enabled after ownership has been changed will not be subject to this
> +   race condition.
>
>
> maybe we should disable the subscription before making such a change and
> then re-enable it?

Right.  I commented the code that way because there is a clear concern, but I was uncertain which way around the
problem was best.

ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop subscription workers.  The ALTER command
updates the catalog's subenabled field, but workers only lazily respond to that.  Disabling and enabling the
subscription as part of the OWNER TO would not reliably accomplish anything.

The attached patch demonstrates the race condition.  It sets up a publisher and subscriber, and toggles the
subscription on and off on the subscriber node, interleaved with inserts and deletes on the publisher node.  If the
ALTER SUBSCRIPTION commands were synchronous, the test results would be deterministic, with only the inserts performed
while the subscription is enabled being replicated, but because the ALTER commands are asynchronous, the results are
nondeterministic.
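
The nondeterminism described here can be illustrated with a toy model in C. This is only a sketch under stated assumptions: the Catalog and Worker types, the counter-based polling, and every function name are hypothetical and are not taken from the PostgreSQL sources or from the attached patch. The model only captures the idea that a worker acts on a cached copy of subenabled and refreshes it some time after the ALTER command has returned.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pg_subscription row; only the flag matters. */
typedef struct
{
    bool subenabled;
} Catalog;

/*
 * Hypothetical worker state: a cached copy of the flag that is refreshed
 * only after poll_interval changes have been handled since the last refresh.
 */
typedef struct
{
    bool cached_enabled;
    int  poll_interval;
    int  since_poll;
} Worker;

/* Handle one incoming change; returns true if the change was applied. */
bool
worker_handle_change(Worker *w, const Catalog *cat)
{
    if (w->since_poll >= w->poll_interval)
    {
        /* The worker finally looks at the catalog again. */
        w->cached_enabled = cat->subenabled;
        w->since_poll = 0;
    }
    w->since_poll++;
    return w->cached_enabled;
}

/*
 * Disable the subscription, then feed n changes to a worker that had
 * already handled already_seen changes since its last refresh.
 * Returns how many changes were still applied after the DISABLE.
 */
int
applied_after_disable(int poll_interval, int already_seen, int n)
{
    Catalog cat = { true };
    Worker  w = { true, poll_interval, already_seen };
    int     applied = 0;

    cat.subenabled = false;     /* the ALTER command has returned here */

    for (int i = 0; i < n; i++)
        if (worker_handle_change(&w, &cat))
            applied++;
    return applied;
}
```

In this model the number of changes applied after the DISABLE depends only on where the worker happened to be in its refresh cycle, which is the kind of timing dependence that makes the test results vary from run to run.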

It is unclear that I can make ALTER SUBSCRIPTION..OWNER TO synchronous without redesigning the way workers respond to
pg_subscription catalog updates generally.  That may be a good project to eventually tackle, but I don't see that it is
more important to close the race condition in an OWNER TO than for a DISABLE.

Thoughts?


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 1, 2021, at 10:58 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop subscription workers.  The ALTER command
> updates the catalog's subenabled field, but workers only lazily respond to that.  Disabling and enabling the
> subscription as part of the OWNER TO would not reliably accomplish anything.

Having discussed this with Andrew off-list, we've concluded that updating the documentation for logical replication to
make this point more clear is probably sufficient, but I wonder if anyone thinks otherwise?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Nov 1, 2021 at 6:44 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> > ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop subscription workers.  The ALTER command
> > updates the catalog's subenabled field, but workers only lazily respond to that.  Disabling and enabling the
> > subscription as part of the OWNER TO would not reliably accomplish anything.
>
> Having discussed this with Andrew off-list, we've concluded that updating the documentation for logical replication
> to make this point more clear is probably sufficient, but I wonder if anyone thinks otherwise?

The question in my mind is whether there's some reasonable amount of
time that a user should expect to have to wait for the changes to take
effect. If it could easily happen that the old permissions are still
in use a month after the change is made, I think that's probably not
good. If there's reason to think that, barring unusual circumstances,
changes will be noticed within a few minutes, I think that's fine.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 1, 2021, at 10:58 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop subscription workers.  The ALTER command
> updates the catalog's subenabled field, but workers only lazily respond to that.  Disabling and enabling the
> subscription as part of the OWNER TO would not reliably accomplish anything.

I have rethought my prior analysis.  The problem in the previous patch was that the subscription apply workers did not
check for a change in ownership the way they checked for other changes, instead only picking up the new ownership
information when the worker restarted for some other reason.  This next patch set fixes that.  The application of a
change record may continue under the old ownership permissions when a concurrent command changes the ownership of the
subscription, but the worker will pick up the new permissions before applying the next record.  I think that is
consistent enough with reasonable expectations.
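
The behavior described in this paragraph can be sketched as a small C model. It is an illustration only, not the patch's code: SubscriptionRow, apply_records, and the change_at parameter are hypothetical names, and the concurrent ALTER SUBSCRIPTION..OWNER TO is simulated by changing the row in the middle of one record's application.

```c
#include <stddef.h>

typedef unsigned int Oid;

/* Hypothetical stand-in for the pg_subscription row. */
typedef struct
{
    Oid subowner;
} SubscriptionRow;

/*
 * Apply n change records.  The owner is re-read before each record, so a
 * concurrent ownership change that lands while record change_at is being
 * applied affects only the records after it.  applied_as[i] receives the
 * owner whose privileges were in effect for record i.
 */
void
apply_records(SubscriptionRow *row, size_t n, size_t change_at,
              Oid new_owner, Oid *applied_as)
{
    for (size_t i = 0; i < n; i++)
    {
        /* Pick up the current owner before applying this record. */
        Oid effective_owner = row->subowner;

        /* Simulated concurrent OWNER TO arriving mid-application. */
        if (i == change_at)
            row->subowner = new_owner;

        /* The record in flight still finishes under the owner read above. */
        applied_as[i] = effective_owner;
    }
}
```

With four records and an ownership change arriving during the second one, the first two records are applied as the old owner and the last two as the new owner, so the window is bounded by a single record.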

The first two patches are virtually unchanged.  The third updates the behavior of the apply workers, and updates the
documentation to match.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2021-11-01 at 10:58 -0700, Mark Dilger wrote:
> It is unclear that I can make ALTER SUBSCRIPTION..OWNER TO
> synchronous without redesigning the way workers respond to
> pg_subscription catalog updates generally.  That may be a good
> project to eventually tackle, but I don't see that it is more
> important to close the race condition in an OWNER TO than for a
> DISABLE.
> 
> Thoughts?

What if we just say that OWNER TO must be done by a superuser, changing
from one superuser to another, just like today? That would preserve
backwards compatibility, but people with non-superuser subscriptions
would need to drop/recreate them.

When we eventually do tackle the problem, we can lift the restriction.

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 16, 2021, at 10:08 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Mon, 2021-11-01 at 10:58 -0700, Mark Dilger wrote:
>> It is unclear .....
>>
>> Thoughts?
>
> What if we just say that OWNER TO must be done by a superuser, changing
> from one superuser to another, just like today? That would preserve
> backwards compatibility, but people with non-superuser subscriptions
> would need to drop/recreate them.

The paragraph I wrote on 11/01 and you are responding to is no longer relevant.  The patch submission on 11/03 tackled
the problem.  Have you had a chance to take a look at the new design?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Andrew Dunstan
Date:
On 11/3/21 15:50, Mark Dilger wrote:
>> On Nov 1, 2021, at 10:58 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>>
>> ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop subscription workers.  The ALTER command
>> updates the catalog's subenabled field, but workers only lazily respond to that.  Disabling and enabling the
>> subscription as part of the OWNER TO would not reliably accomplish anything.
>
> I have rethought my prior analysis.  The problem in the previous patch was that the subscription apply workers did
> not check for a change in ownership the way they checked for other changes, instead only picking up the new ownership
> information when the worker restarted for some other reason.  This next patch set fixes that.  The application of a
> change record may continue under the old ownership permissions when a concurrent command changes the ownership of the
> subscription, but the worker will pick up the new permissions before applying the next record.  I think that is
> consistent enough with reasonable expectations.
>
> The first two patches are virtually unchanged.  The third updates the behavior of the apply workers, and updates the
> documentation to match.


I'm generally happier about this than the previous patch set. With the
exception of some slight documentation modifications I think it's
basically committable. There doesn't seem to be a CF item for it but I'm
inclined to commit it in a couple of days time.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 16, 2021, at 12:06 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>
> There doesn't seem to be a CF item for it but I'm
> inclined to commit it in a couple of days time.

https://commitfest.postgresql.org/36/3414/

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Andrew Dunstan
Date:
On 11/16/21 15:08, Mark Dilger wrote:
>
>> On Nov 16, 2021, at 12:06 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>>
>> There doesn't seem to be a CF item for it but I'm
>> inclined to commit it in a couple of days time.
> https://commitfest.postgresql.org/36/3414/
>

OK, got it, thanks.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-11-03 at 12:50 -0700, Mark Dilger wrote:
> The first two patches are virtually unchanged.  The third updates the
> behavior of the apply workers, and updates the documentation to
> match.

v2-0001 corrects some surprises, but may create others. Why is renaming
allowed, but not changing the options? What if we add new options, and
some of them seem benign for a non-superuser to change?

The commit message part of the patch says that it's to prevent non-
superusers from being able to (effectively) create subscriptions, but
don't we want privileged non-superusers to be able to create
subscriptions?

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 16, 2021, at 8:11 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Wed, 2021-11-03 at 12:50 -0700, Mark Dilger wrote:
>> The first two patches are virtually unchanged.  The third updates the
>> behavior of the apply workers, and updates the documentation to
>> match.
>
> v2-0001 corrects some surprises, but may create others. Why is renaming
> allowed, but not changing the options? What if we add new options, and
> some of them seem benign for a non-superuser to change?

The patch cannot anticipate which logical replication options may be added to the project in some later commit.  We can
let that commit adjust the behavior to allow the option if we agree it is sensible for non-superusers to do so.

> The commit message part of the patch says that it's to prevent non-
> superusers from being able to (effectively) create subscriptions, but
> don't we want privileged non-superusers to be able to create
> subscriptions?

Perhaps, but I don't think merely owning a subscription should entitle a role to create new subscriptions.
Administrators may quite intentionally create low-power users, ones without access to anything but a single table, or a
single schema, as a means of restricting the damage that a subscription might do (or more precisely, what the publisher
might do via the subscription.)  It would be surprising if that low-power user was then able to recreate the
subscription into something different.

We should probably come back to this topic in a different patch, perhaps a patch that introduces a new
pg_manage_subscriptions role or such.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-11-17 at 07:44 -0800, Mark Dilger wrote:
> Administrators may quite
> intentionally create low-power users, ones without access to anything
> but a single table, or a single schema, as a means of restricting the
> damage that a subscription might do (or more precisely, what the
> publisher might do via the subscription.)  It would be surprising if
> that low-power user was then able to recreate the subscription into
> something different.

I am still trying to understand this use case. It doesn't feel like
"ownership" to me, it feels more like some kind of delegation.

Is GRANT a better fit here? That would allow more than one user to
REFRESH, or ENABLE/DISABLE the same subscription. It wouldn't allow
RENAME, but I don't see why we'd separate privileges for
CREATE/DROP/RENAME anyway.

This would not address the weirdness of the existing code where a
superuser loses their superuser privileges but still owns a
subscription. But perhaps we can solve that a different way, like just
performing a check when someone loses their superuser privileges that
they don't own any subscriptions.

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 17, 2021, at 9:33 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> I am still trying to understand this use case. It doesn't feel like
> "ownership" to me, it feels more like some kind of delegation.
>
> Is GRANT a better fit here? That would allow more than one user to
> REFRESH, or ENABLE/DISABLE the same subscription. It wouldn't allow
> RENAME, but I don't see why we'd separate privileges for
> CREATE/DROP/RENAME anyway.

We may eventually allow non-superusers to create subscriptions, but there are lots of details to work out.  Should
there be limits on how many subscriptions they can create?  Should there be limits to the number of simultaneously open
connections they can create out to other database servers (publishers)?  Should they need to be granted USAGE on a
database publisher in order to use the connection string for that publisher in a subscription they create?  Should they
need to be granted USAGE on a publication in order to replicate it?  Yes, there may be restrictions on the publisher
side, too, but the user model on subscriber and publisher might differ, and the connection string used might not match
the subscription owner, so some restriction on the subscriber side may be needed.

The implementation of [CREATE | ALTER] SUBSCRIPTION was designed at a time when only superusers could execute them, and
as far as I can infer from the design, no effort to constrain the effects of those commands was made.  Since we're
trying to make subscriptions into things that non-superusers can use, we have to deal with some things in those
functions. For example, ALTER SUBSCRIPTION can change the database connection parameter, or the publication subscribed,
or whether synchronous_commit is used.  I don't see that a subscription owner should necessarily be allowed to mess with
that, at least not without some other privilege checks beyond mere ownership.

I think this is pretty analogous to how security definer functions work.  You might call those "delegation" also, but
the basic idea is that the function will run under the privileges of the function's owner, who might be quite privileged
if you want the function to do highly secure things for you, but who could also intentionally be limited in privilege.
It wouldn't make much sense to say the owner of a security definer function can arbitrarily escalate their privileges to
do things like open connections to other database servers, or have the transactions in which they run have a different
setting of synchronous_commit.  Yet with subscriptions, if the subscription owner can run all forms of ALTER
SUBSCRIPTION, that's what they can do.

I took a conservative position in the design of the patch to avoid giving away too much.  I suspect that we'll come
back to these design decisions and relax them at some point, but the exact way in which we relax them is not obvious.
We could just agree to remove them (as you seem to propose), or we might agree to create predefined roles and say that
the subscription owner can change certain aspects of the subscription if and only if they are members of one or more of
those roles, or we may create new grantable privileges.  Each of those debates may be long and hard fought, so I don't
want to invite that as part of this thread, or this patch will almost surely miss the cutoff for v15.

> This would not address the weirdness of the existing code where a
> superuser loses their superuser privileges but still owns a
> subscription. But perhaps we can solve that a different way, like just
> performing a check when someone loses their superuser privileges that
> they don't own any subscriptions.

I gave that a slight amount of thought during the design of this patch, but didn't think we could refuse to revoke
superuser on such a basis, and didn't see what we should do with the subscription other than have it continue to be
owned by the recently-non-superuser.  If you have a better idea, we can discuss it, but to some degree I think that is
also orthogonal to the purpose of this patch.  The only sense in which this patch depends on that issue is that this
patch proposes that non-superuser subscription owners are already an issue, and therefore that this patch isn't creating
a new issue, but rather making more sane something that already can happen.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 17, 2021, at 9:33 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> Is GRANT a better fit here? That would allow more than one user to
> REFRESH, or ENABLE/DISABLE the same subscription. It wouldn't allow
> RENAME, but I don't see why we'd separate privileges for
> CREATE/DROP/RENAME anyway.

I don't think I answered this directly in my last reply.

GRANT *might* be part of some solution, but it is unclear to me how best to do it.  The various configuration
parameters on subscriptions entail different security concerns.  We might take a fine-grained approach and create a
predefined role for each, or we might take a coarse-grained approach and create a single pg_manage_subscriptions role
which can set any parameter on any subscription, or maybe just parameters on subscriptions that the role also owns, or
we might do something else, like burn some privilege bits and define new privileges that can be granted per subscription
rather than globally.  (I think that last one is a non-starter, but just mention it as an example of another approach.)

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Andrew Dunstan
Date:
On 11/16/21 15:06, Andrew Dunstan wrote:
> On 11/3/21 15:50, Mark Dilger wrote:
>>> On Nov 1, 2021, at 10:58 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>>>
>>> ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop subscription workers.  The ALTER command
>>> updates the catalog's subenabled field, but workers only lazily respond to that.  Disabling and enabling the
>>> subscription as part of the OWNER TO would not reliably accomplish anything.
>>
>> I have rethought my prior analysis.  The problem in the previous patch was that the subscription apply workers did
>> not check for a change in ownership the way they checked for other changes, instead only picking up the new ownership
>> information when the worker restarted for some other reason.  This next patch set fixes that.  The application of a
>> change record may continue under the old ownership permissions when a concurrent command changes the ownership of the
>> subscription, but the worker will pick up the new permissions before applying the next record.  I think that is
>> consistent enough with reasonable expectations.
>>
>> The first two patches are virtually unchanged.  The third updates the behavior of the apply workers, and updates the
>> documentation to match.
>
> I'm generally happier about this than the previous patch set. With the
> exception of some slight documentation modifications I think it's
> basically committable. There doesn't seem to be a CF item for it but I'm
> inclined to commit it in a couple of days time.
>
>

Given there is some debate about the patch set I will hold off any
action for the time being.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-11-17 at 10:25 -0800, Mark Dilger wrote:
> We may eventually allow non-superusers to create subscriptions, but
> there are lots of details to work out.

I am setting aside the idea of subscriptions created by non-superusers.

My comments were about your idea for "low-power users" that can still
do things with subscriptions. And for that, GRANT seems like a better
fit than ownership.

With v2-0001, there are several things that seem weird to me:

  * Why can there be only one low-power user per subscription?
  * Why is RENAME a separate capability from CREATE/DROP?
  * What if you want to make the privileges more fine-grained, or make
changes in the future? Ownership is a single bit, so it requires that
everyone agree. Maybe some people want RENAME to be a part of that, and
others don't.

GRANT seems to provide better answers here.

> Since we're trying to make subscriptions into things that non-
> superusers can use, we have to deal with some things in those
> functions.

I understand the use case where a superuser isn't required anywhere in
the process, and some special users can create and own subscriptions. I
also understand that's not what these patches are trying to accomplish
(though v2-0003 seems like a good step in that direction).

I don't understand the use case as well where a non-superuser can
merely "use" a subscription. I'm sure such use cases exist and I'm fine
to go along with that idea, but I'd like to understand why ownership
(partial ownership?) is the right way to do this and GRANT is the wrong
way.

> For example, ALTER SUBSCRIPTION can change the database connection
> parameter, or the publication subscribed, or whether
> synchronous_commit is used.  I don't see that a subscription owner
> should necessarily be allowed to mess with that, at least not without
> some other privilege checks beyond mere ownership.

That violates my expectations of what "ownership" means.

> I think this is pretty analogous to how security definer functions
> work.

The analogy to SECURITY DEFINER functions seems to support my
suggestion for GRANT at least as much as your modified definition of
ownership.

> > This would not address the weirdness of the existing code where a
> > superuser loses their superuser privileges but still owns a
> > subscription. But perhaps we can solve that a different way, like
> > just
> > performing a check when someone loses their superuser privileges
> > that
> > they don't own any subscriptions.
> 
> I gave that a slight amount of thought during the design of this
> patch, but didn't think we could refuse to revoke superuser on such a
> basis,

I don't necessarily see a problem there, but I could be missing
something.

>  and didn't see what we should do with the subscription other than
> have it continue to be owned by the recently-non-superuser.  If you
> have a better idea, we can discuss it, but to some degree I think
> that is also orthogonal to the purpose of this patch.  The only sense
> in which this patch depends on that issue is that this patch proposes
> that non-superuser subscription owners are already an issue, and
> therefore that this patch isn't creating a new issue, but rather
> making more sane something that already can happen.

By introducing and documenting a way to get non-superusers to own a
subscription, it makes it more likely that people will do it, and
harder for us to change. That means the standard should be "this is
what we really want", rather than just "more sane than before".

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-11-17 at 10:48 -0800, Mark Dilger wrote:
> GRANT *might* be part of some solution, but it is unclear to me how
> best to do it.  The various configuration parameters on subscriptions
> entail different security concerns.  We might take a fine-grained
> approach and create a predefined role for each

I think you misunderstood the idea: not using predefined roles, just
plain old ordinary GRANT on a subscription object to ordinary roles.

   GRANT REFRESH ON SUBSCRIPTION sub1 TO nonsuper;

This should be easy enough because the subscription is a real object,
right?

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 17, 2021, at 1:10 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> I think you misunderstood the idea: not using predefined roles, just
> plain old ordinary GRANT on a subscription object to ordinary roles.
>
>   GRANT REFRESH ON SUBSCRIPTION sub1 TO nonsuper;
>
> This should be easy enough because the subscription is a real object,
> right?

/*
 * Grantable rights are encoded so that we can OR them together in a bitmask.
 * The present representation of AclItem limits us to 16 distinct rights,
 * even though AclMode is defined as uint32.  See utils/acl.h.
 *
 * Caution: changing these codes breaks stored ACLs, hence forces initdb.
 */
typedef uint32 AclMode;         /* a bitmask of privilege bits */

#define ACL_INSERT      (1<<0)  /* for relations */
#define ACL_SELECT      (1<<1)
#define ACL_UPDATE      (1<<2)
#define ACL_DELETE      (1<<3)
#define ACL_TRUNCATE    (1<<4)
#define ACL_REFERENCES  (1<<5)
#define ACL_TRIGGER     (1<<6)
#define ACL_EXECUTE     (1<<7)  /* for functions */
#define ACL_USAGE       (1<<8)  /* for languages, namespaces, FDWs, and
                                 * servers */
#define ACL_CREATE      (1<<9)  /* for namespaces and databases */
#define ACL_CREATE_TEMP (1<<10) /* for databases */
#define ACL_CONNECT     (1<<11) /* for databases */


We only have 4 values left in the bitmask, and I doubt that burning those slots for multiple new types of rights that
only have meaning for subscriptions is going to be accepted.  For full disclosure, I'm proposing adding ACL_SET and
ACL_ALTER_SYSTEM in another patch and my proposal there could get shot down for the same reasons, but I think your
argument would be even harder to defend.  Maybe others feel differently.
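
The count of remaining slots follows directly from the quoted header: twelve rights are defined and the comment gives a limit of 16. The standalone C sketch below redoes that arithmetic. The ACL_* values are copied from the excerpt above, while ACL_RIGHTS_LIMIT, ALL_DEFINED_RIGHTS, and the two helper functions are names made up for this illustration and do not exist in the PostgreSQL sources.

```c
#include <stdint.h>

typedef uint32_t AclMode;       /* a bitmask of privilege bits */

/* Values as quoted in the message above. */
#define ACL_INSERT      (1u << 0)
#define ACL_SELECT      (1u << 1)
#define ACL_UPDATE      (1u << 2)
#define ACL_DELETE      (1u << 3)
#define ACL_TRUNCATE    (1u << 4)
#define ACL_REFERENCES  (1u << 5)
#define ACL_TRIGGER     (1u << 6)
#define ACL_EXECUTE     (1u << 7)
#define ACL_USAGE       (1u << 8)
#define ACL_CREATE      (1u << 9)
#define ACL_CREATE_TEMP (1u << 10)
#define ACL_CONNECT     (1u << 11)

/* Hypothetical name for the 16-right limit stated in the quoted comment. */
#define ACL_RIGHTS_LIMIT 16

/* Hypothetical helper macro: every right defined in the excerpt. */
#define ALL_DEFINED_RIGHTS \
    (ACL_INSERT | ACL_SELECT | ACL_UPDATE | ACL_DELETE | ACL_TRUNCATE | \
     ACL_REFERENCES | ACL_TRIGGER | ACL_EXECUTE | ACL_USAGE | ACL_CREATE | \
     ACL_CREATE_TEMP | ACL_CONNECT)

/* Count the bits set in a privilege mask. */
int
count_set_bits(AclMode mask)
{
    int n = 0;

    while (mask != 0)
    {
        n += (int) (mask & 1u);
        mask >>= 1;
    }
    return n;
}

/* Number of right slots still unassigned under the 16-right limit. */
int
free_right_slots(void)
{
    return ACL_RIGHTS_LIMIT - count_set_bits(ALL_DEFINED_RIGHTS);
}
```

Under these definitions free_right_slots() returns 4. If the two rights mentioned above for the other patch were accepted, two slots would remain, which is why subscription-only rights are a hard sell.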

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-11-17 at 15:07 -0800, Mark Dilger wrote:
> We only have 4 values left in the bitmask, and I doubt that burning
> those slots for multiple new types of rights that only have meaning
> for subscriptions is going to be accepted.  For full disclosure, I'm
> proposing adding ACL_SET and ACL_ALTER_SYSTEM in another patch and my
> proposal there could get shot down for the same reasons, but I think
> your argument would be even harder to defend.  Maybe others feel
> differently.

Why not overload ACL_USAGE again, and say:

    GRANT USAGE ON SUBSCRIPTION sub1 TO nonsuper;

would allow ENABLE/DISABLE and REFRESH.
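
As a rough illustration of what overloading ACL_USAGE this way could mean, here is a minimal C sketch. It is not part of any patch in this thread: the SubAction enum and usage_permits are hypothetical, the ACL_USAGE value is taken from the header quoted earlier, and the choice to leave every other action to the owner is an assumption made only to keep the example small.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t AclMode;

#define ACL_USAGE (1u << 8)     /* value from the header quoted earlier */

/* Hypothetical list of things one might do to a subscription. */
typedef enum
{
    SUB_ENABLE,
    SUB_DISABLE,
    SUB_REFRESH,
    SUB_RENAME,
    SUB_SET_CONNECTION
} SubAction;

/*
 * Would a role holding the privileges in "granted" on the subscription be
 * allowed to perform "action" purely by virtue of GRANT USAGE?
 */
bool
usage_permits(AclMode granted, SubAction action)
{
    switch (action)
    {
        case SUB_ENABLE:
        case SUB_DISABLE:
        case SUB_REFRESH:
            return (granted & ACL_USAGE) != 0;
        default:
            /* Assumption: anything else is not covered by USAGE. */
            return false;
    }
}
```

A role granted USAGE could then refresh or toggle the subscription, while renaming it or changing its connection would still need something more than the grant.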

Again, I don't really understand the use case behind "can use a
subscription but not create one", so I'm not making a proposal. But
assuming that the use case exists, GRANT seems like a much better
approach.

(Aside: for me to commit something like this I'd want to understand the
"can use a subscription but not create one" use case better.)

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 17, 2021, at 1:06 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Wed, 2021-11-17 at 10:25 -0800, Mark Dilger wrote:
>> We may eventually allow non-superusers to create subscriptions, but
>> there are lots of details to work out.
>
> I am setting aside the idea of subscriptions created by non-superusers.

Ok, fair enough.  I think eventually we'll want that, but I'm also setting that aside for this patch.

> My comments were about your idea for "low-power users" that can still
> do things with subscriptions. And for that, GRANT seems like a better
> fit than ownership.

This patch has basically no value beyond the fact that it allows the replication to be *applied* as a user other than
superuser. Throw that out, and there isn't any point.  Everything else is window dressing.  The real security problem
with subscriptions is that they act with superuser power. That naturally means that they must be owned and operated by
superuser, too, otherwise they serve as a privilege escalation attack vector.  It really doesn't make any sense to think
of subscriptions as operating under the permissions of multiple non-superusers.  You must choose a single role you want
the subscription to run under.  What purpose would be served by GRANTing privileges on a subscription to more than one
non-superuser? It still operates as just the one user.  I agree you *could* give multiple users privileges to mess with
it, but you'd still need to assign a single role as the one whose privileges matter for the purpose of applying
replication changes.  I'm using "owner" for that purpose, and I think that is consistent with how security definer
functions work.  They run as the owner, too.  It's perfectly well-precedented to use "owner" for this.

I think the longer term plan is that non-superusers who have some privileged role will be allowed to create
subscriptions, and naturally they will own the subscriptions that they create, at least until an ALTER
SUBSCRIPTION..OWNER TO is successfully executed to transfer ownership.  Once that longer term plan is complete,
non-superusers will be able to create publications of their tables on one database, and subscriptions to those
publications on another database, all without needing the help of a superuser.  This patch doesn't get us all the way
there, but it heads directly toward that goal.

> With v2-0001, there are several things that seem weird to me:
>
>  * Why can there be only one low-power user per subscription?

Because the apply workers run as only one user.  Currently it is always superuser.  After this patch, it is always the
owner, which amounts to the same thing for legacy subscriptions created and owned by superuser prior to upgrading to
v15, but not necessarily for new ones or ones that have ownership transferred after upgrade.

We could think about subscriptions that act under multiple roles, perhaps taking role information as part of the
data-stream from the publisher, but that's a pretty complicated proposal, and it is far from clear that we want it
anyway. There is a security case to be made for *not* allowing the publisher to call all the shots, so such a proposal
would at best be an alternate mode of operation, not the one and only mode.

>  * Why is RENAME a separate capability from CREATE/DROP?

I don't care enough to argue this point.  If you want me to remove RENAME privilege from the owner, I can resubmit with
it removed.  It just doesn't seem like it's dangerous to allow a non-superuser to rename their subscriptions, so I saw
no compelling reason to disallow it.

CREATE clearly must be disallowed since it gives the creator the ability to form network connections, set fsync modes,
etc., and there is no reason to assume arbitrary non-superusers should be able to do that.

The argument against DROP is a bit weaker.  It doesn't seem like a user who cannot bring subscriptions into existence
should be able to drop them either.  I was expecting to visit that issue in a follow-on patch which deals with
non-superuser predefined roles that have some power to create and drop subscriptions.  What that patch will propose to
do is not obvious, since some of what you can do with subscriptions is so powerful we may not want non-superusers doing
it, even with a privileged role.  If you can't picture what I mean, consider that you might use a connection parameter
that connects outside and embeds data into the connection string, with a server listening on the other end, not really
to publish data, but to harvest the secret data that you are embedding in the network connection attempt.

>  * What if you want to make the privileges more fine-grained, or make
> changes in the future? Ownership is a single bit, so it requires that
> everyone agree.

We can modify the patch to have the subscription owner have zero privileges on the subscription, not even the ability
to see how it is defined, and just have "owner" mean the role under whose privileges the logical replication workers
apply changes.  Would that be better?  I would expect people to find that odd.

The problem is that we want a setuid/setgid type behavior.  Actual setuid/setgid programs act as the user/group of the
executable. There's no reason that user/group needs to be one that any real human uses to log into the system.
Likewise, we need the subscription to act under a role, and we're establishing which role that is by having that role
own the subscription.  That is like how setuid/setgid programs work by executing as the user/group that owns the
executable, except that postgres doesn't have separate user/group concepts, just roles.  Isn't this design pattern
completely commonplace?

> Maybe some people want RENAME to be a part of that, and
> others don't.

Fair enough.  Should I remove RENAME from what the patch allows the owner to do?  On this particular point, I genuinely
don't care.  I think it can be reasonably argued either way.

> GRANT seems to provide better answers here.

No, because we don't have infinite privilege bits to burn.

>> Since we're trying to make subscriptions into things that non-
>> superusers can use, we have to deal with some things in those
>> functions.
>
> I understand the use case where a superuser isn't required anywhere in
> the process, and some special users can create and own subscriptions. I
> also understand that's not what these patches are trying to accomplish
> (though v2-0003 seems like a good step in that direction).

There is a cart-before-the-horse problem here.  If I propose a patch with a privileged role for creating and owning
subscriptions *before* I tighten down how non-superuser-owned subscriptions work, that patch would surely be rejected.
So I either propose this first, and only if/when it gets accepted, propose the other, or I propose them together.
That's a damned-if-you-do--damned-if-you-dont situation, because if I propose them together, I'll get arguments that
they are clearly separable and should be proposed separately, and if I do them one before the other, I'll get the
argument that you are making now.  I fully expect the privileged role proposal to be made (possibly by me), though it is
unclear if there will be time left to do it in v15.

> I don't understand the use case as well where a non-superuser can
> merely "use" a subscription. I'm sure such use cases exist and I'm fine
> to go along with that idea, but I'd like to understand why ownership
> (partial ownership?) is the right way to do this and GRANT is the wrong
> way.

Even if we had the privilege bits to burn, no spelling of that GRANT idea sounds all that great:

    GRANT RUN AS ON subscription TO role;
    GRANT RUN AS ON role TO subscription;
    GRANT SUDO ON subscription TO role;
    GRANT SETUID ON role TO subscription;
    ...

I just don't see how that really works.  I'm not inclined to spend time being more clever, since I already know that
privilege bits are in short supply, but if you want to propose something, go ahead.  Elsewhere you proposed GRANT
REFRESH or something, not looking at that email just now, but that's not the same thing as GRANT RUN AS, and burns
another privilege bit, and still doesn't get us all the way there, because you presumably also want GRANT RENAME, GRANT
ALTER CONNECTION SETTING, GRANT ALTER FSYNC SETTING, ..., and we're out of privilege bits before we're done.

>> For example, ALTER SUBSCRIPTION can change the database connection
>> parameter, or the publication subscribed, or whether
>> synchronous_commit is used.  I don't see that a subscription owner
>> should necessarily be allowed to mess with that, at least not without
>> some other privilege checks beyond mere ownership.
>
> That violates my expectations of what "ownership" means.

I think that's because you're thinking of these settings as properties of the subscription.  You may *own* the
subscription, but the subscription doesn't *own* the right to make connections to arbitrary databases, nor *own* the
right to change buffer cache settings, nor *own* the right to bring data from a publication on some other server which,
if it existed on the local server, would violate site policy and possibly constitute a civil or criminal violation of
data privacy laws.  I may own my house, and the land it sits on, and my driveway, but that doesn't mean I own the
ability to make my driveway go across my neighbor's field, down through town, and to the waterfront.  But that's the
kind of ownership definition you seem to be defending.

Some of what I perceive as the screwiness of your argument I must admit is not your fault.  The properties of
subscriptions are defined in ways that don't make sense to me.  It would be far more sensible if connection strings were
objects in their own right, and you could grant USAGE on a connection string to a role, and USAGE on a subscription to a
role, and only if the subscription worker's role had privileges on the connection string could they use it as part of
fulfilling their task of replicating the data, and otherwise they'd error out in the attempt.  Likewise, fsync modes
could be proper objects, and only if the subscription's role had privileges on the fsync mode they wanted to use would
they be able to use it.  But we don't have these things as proper objects, with acl lists on them, so we're stuck trying
to design around that.  To my mind, that means subscription owners *do not own* properties associated with the
subscription. To your mind, that's not what "ownership" means.  What to do?

>> I think this is pretty analogous to how security definer functions
>> work.
>
> The analogy to SECURITY DEFINER functions seems to support my
> suggestion for GRANT at least as much as your modified definition of
> ownership.

I don't see how.  Can you please explain?

>>> This would not address the weirdness of the existing code where a
>>> superuser loses their superuser privileges but still owns a
>>> subscription. But perhaps we can solve that a different way, like
>>> just
>>> performing a check when someone loses their superuser privileges
>>> that
>>> they don't own any subscriptions.
>>
>> I gave that a slight amount of thought during the design of this
>> patch, but didn't think we could refuse to revoke superuser on such a
>> basis,
>
> I don't necessarily see a problem there, but I could be missing
> something.

Close your eyes and imagine that I have superuser on your database... really picture it in your mind.  Now, do you want
the revoke command you are issuing to work?

>> and didn't see what we should do with the subscription other than
>> have it continue to be owned by the recently-non-superuser.  If you
>> have a better idea, we can discuss it, but to some degree I think
>> that is also orthogonal to the purpose of this patch.  The only sense
>> in which this patch depends on that issue is that this patch proposes
>> that non-superuser subscription owners are already an issue, and
>> therefore that this patch isn't creating a new issue, but rather
>> making more sane something that already can happen.
>
> By introducing and documenting a way to get non-superusers to own a
> subscription, it makes it more likely that people will do it, and
> harder for us to change. That means the standard should be "this is
> what we really want", rather than just "more sane than before".

Ok, I'll wait to hear back from you on the points above.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Wed, Nov 17, 2021 at 11:56 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> > On Nov 17, 2021, at 9:33 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> >
>
> > This would not address the weirdness of the existing code where a
> > superuser loses their superuser privileges but still owns a
> > subscription. But perhaps we can solve that a different way, like just
> > performing a check when someone loses their superuser privileges that
> > they don't own any subscriptions.
>
> I gave that a slight amount of thought during the design of this patch, but didn't think we could refuse to revoke
> superuser on such a basis, and didn't see what we should do with the subscription other than have it continue to be
> owned by the recently-non-superuser.  If you have a better idea, we can discuss it, but to some degree I think that is
> also orthogonal to the purpose of this patch.  The only sense in which this patch depends on that issue is that this
> patch proposes that non-superuser subscription owners are already an issue, and therefore that this patch isn't creating
> a new issue, but rather making more sane something that already can happen.
>

Don't we want to close this gap irrespective of the other part of the
feature? I mean if we take out the part of your 0003 patch that checks
whether the current user has permission to perform a particular
operation on the target table then the gap related to the owner losing
superuser privileges should be addressed.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Nov 4, 2021 at 1:20 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> > On Nov 1, 2021, at 10:58 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> >
> > ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop subscription workers.  The ALTER command
> > updates the catalog's subenabled field, but workers only lazily respond to that.  Disabling and enabling the
> > subscription as part of the OWNER TO would not reliably accomplish anything.
>
> I have rethought my prior analysis.  The problem in the previous patch was that the subscription apply workers did
> not check for a change in ownership the way they checked for other changes, instead only picking up the new ownership
> information when the worker restarted for some other reason.  This next patch set fixes that.  The application of a
> change record may continue under the old ownership permissions when a concurrent command changes the ownership of the
> subscription, but the worker will pick up the new permissions before applying the next record.
>

Are you talking about the below change in the above paragraph?

@@ -2912,6 +2941,7 @@ maybe_reread_subscription(void)
  strcmp(newsub->slotname, MySubscription->slotname) != 0 ||
  newsub->binary != MySubscription->binary ||
  newsub->stream != MySubscription->stream ||
+ newsub->owner != MySubscription->owner ||
  !equal(newsub->publications, MySubscription->publications))
  {

If so, I am not sure how it will ensure that we check the ownership
change before applying each change? I think this will be invoked at
each transaction boundary, so, if there is a transaction with a large
number of changes, all the changes will be processed under the
previous owner.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 18, 2021, at 2:50 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>> I gave that a slight amount of thought during the design of this patch, but didn't think we could refuse to revoke
>> superuser on such a basis, and didn't see what we should do with the subscription other than have it continue to be
>> owned by the recently-non-superuser.  If you have a better idea, we can discuss it, but to some degree I think that is
>> also orthogonal to the purpose of this patch.  The only sense in which this patch depends on that issue is that this
>> patch proposes that non-superuser subscription owners are already an issue, and therefore that this patch isn't creating
>> a new issue, but rather making more sane something that already can happen.
>>
>
> Don't we want to close this gap irrespective of the other part of the
> feature? I mean if we take out the part of your 0003 patch that checks
> whether the current user has permission to perform a particular
> operation on the target table then the gap related to the owner losing
> superuser privileges should be addressed.

I don't think there is a gap.  The patch does the right thing, causing the subscription whose owner has had superuser
revoked to itself no longer function with superuser privileges.  Whether that causes the subscription to fail depends on
whether the previously-superuser now non-superuser owner now lacks sufficient privileges on the target relation(s).  I
think removing that part of the patch would be a regression.

Let's compare two scenarios.  In the first, we have a regular user "alice" who owns a subscription which replicates
into table "accounting.receipts" for which she has been granted privileges by the table's owner.  What would you expect
to happen after the table's owner revokes privileges from alice?  I would expect that the subscription can no longer
function, and periodic attempts to replicate into that table result in permission denied errors in the logs.

In the second, we have a superuser "alice" who owns a subscription that replicates into table "accounting.receipts",
and she only has sufficient privileges to modify "accounting.receipts" by virtue of being superuser.  I would expect
that when she has superuser revoked, the subscription can likewise no longer function.

Now, maybe I'm wrong in both cases, and both should continue to function.  But I would find it really strange if the
first situation behaved differently from the second.
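
[Editor's sketch, not part of the original message.] The two scenarios can be modelled in a few lines of Python. This is a toy model under stated assumptions, not PostgreSQL's actual ACL code: the `Role` class, the `can_apply` helper and the `(table, privilege)` encoding are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    superuser: bool = False
    # (table, privilege) pairs explicitly granted to this role (toy model).
    grants: set = field(default_factory=set)


def can_apply(owner: Role, table: str, privilege: str) -> bool:
    """True if a replicated change may be applied with the owner's privileges."""
    return owner.superuser or (table, privilege) in owner.grants


TARGET = ("accounting.receipts", "INSERT")

# Scenario 1: a regular user holds an explicit grant, which is then revoked.
regular_alice = Role("alice", grants={TARGET})
scenario1_before = can_apply(regular_alice, *TARGET)
regular_alice.grants.discard(TARGET)
scenario1_after = can_apply(regular_alice, *TARGET)

# Scenario 2: a superuser with no explicit grant has superuser revoked.
super_alice = Role("alice", superuser=True)
scenario2_before = can_apply(super_alice, *TARGET)
super_alice.superuser = False
scenario2_after = can_apply(super_alice, *TARGET)

print(scenario1_before, scenario1_after)
print(scenario2_before, scenario2_after)
```

In this model both scenarios end the same way: once the privilege that made the write possible is gone, applying as the owner is refused, which is the symmetry the message argues for.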

I think intuitions about how subscriptions behave differ depending on the reason you expect the subscription to be
owned by a particular user.  If the reason the user owns the subscription is that the user just happens to be the user
who created it, but isn't in your mind associated with the subscription, then having the subscription continue to
function regardless of what happens to the user, even the user being dropped, is probably consistent with your
expectations. In a sense, you think of the user who creates the subscription as having gifted it to the universe rather
than continuing to own it.  Or perhaps you think of the creator of the subscription as a solicitor/lawyer/agent working
on behalf of client, and once that legal transaction is completed, you don't expect the lawyer being disbarred should
impact the subscription which exists for the benefit of the client.

If instead you think about the subscription owner as continuing to be closely associated with the subscription (as I
do), then you expect changes in the owner's permissions to impact the subscription.

I think the "gifted to the universe"/"lawyer" mental model is not consistent with how the system is already designed to
work. You can't drop the subscription's owner without first running REASSIGN OWNED, or ALTER SUBSCRIPTION..OWNER TO, or
simply dropping the subscription:

  DROP ROLE regress_subscription_user;
  ERROR:  role "regress_subscription_user" cannot be dropped because some objects depend on it
  DETAIL:  owner of subscription regress_testsub


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 18, 2021, at 3:37 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>> I have rethought my prior analysis.  The problem in the previous patch was that the subscription apply workers did
>> not check for a change in ownership the way they checked for other changes, instead only picking up the new ownership
>> information when the worker restarted for some other reason.  This next patch set fixes that.  The application of a
>> change record may continue under the old ownership permissions when a concurrent command changes the ownership of the
>> subscription, but the worker will pick up the new permissions before applying the next record.
>>
>
> Are you talking about the below change in the above paragraph?
>
> @@ -2912,6 +2941,7 @@ maybe_reread_subscription(void)
>  strcmp(newsub->slotname, MySubscription->slotname) != 0 ||
>  newsub->binary != MySubscription->binary ||
>  newsub->stream != MySubscription->stream ||
> + newsub->owner != MySubscription->owner ||
>  !equal(newsub->publications, MySubscription->publications))
>  {
>
> If so, I am not sure how it will ensure that we check the ownership
> change before applying each change? I think this will be invoked at
> each transaction boundary, so, if there is a transaction with a large
> number of changes, all the changes will be processed under the
> previous owner.

Yes, your analysis appears correct.  I was sloppy to say "before applying the next record".  It will pick up the change
before applying the next transaction.
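
[Editor's sketch, not part of the original message.] The timing being conceded here can be illustrated with a small Python model of an apply loop that re-reads the owner only at transaction boundaries. It is a simplification under stated assumptions, not the worker's real code: `apply_stream`, `catalog_owner_at` and the "tick" clock are hypothetical names made up for this example.

```python
def apply_stream(transactions, catalog_owner_at):
    """Apply changes, re-reading the subscription owner once per transaction.

    transactions:     list of transactions, each a list of change labels.
    catalog_owner_at: callable mapping a logical tick to the catalog's owner.
    Returns a list of (change, owner_used) pairs.
    """
    applied = []
    tick = 0
    for txn in transactions:
        # The owner is looked up only here, at the transaction boundary.
        owner = catalog_owner_at(tick)
        for change in txn:
            applied.append((change, owner))
            tick += 1  # time passes while changes are applied
    return applied


def catalog_owner_at(tick):
    # Assumption: ownership is transferred while the first transaction
    # is still being applied (after its second change).
    return "old_owner" if tick < 2 else "new_owner"


result = apply_stream([["a1", "a2", "a3", "a4"], ["b1"]], catalog_owner_at)
for change, owner in result:
    print(change, owner)
```

Every change of the in-flight transaction is applied as the old owner, including those that arrive after the transfer; only the following transaction sees the new owner.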

The prior version of the patch only picked up the change if it happened to start a new worker, but could process
multiple transactions without noticing the change.  Now, it is limited to finishing the current transaction.  Would you
prefer that the worker noticed the change in ownership and aborted the transaction on the subscriber side?  Or should
the ALTER SUBSCRIPTION..OWNER TO block?  I don't see much advantage to either of those options, but I also don't think I
have any knock-down argument for my approach either.  What do you think?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-11-17 at 16:17 -0800, Mark Dilger wrote:
>  You must choose a single role you want the subscription to run
> under.

I think that was the source of a lot of my confusion: 

Your patches are creating the notion of a "run as" user by assigning
ownership-that-isn't-really-ownership. I got distracted wondering why
you would really care if some user could enable/disable/refresh/rename
a subscription, but the main point was to change who the subscription
runs as.

That's a more general idea: I could see how "run as" could apply to
subscriptions as well as functions (right now it can only run as the
owner or the invoker, not an arbitrary role). And I better understand
your analogy to security definer now.

But it's also not exactly a simple idea, and I think the current
patches oversimplify it and conflate it with ownership. 

> I think the longer term plan is that non-superusers who have some
> privileged role will be allowed to create subscriptions,

You earlier listed some challenges with that:


https://postgr.es/m/CF56AC0D-7495-4E8D-A48F-FF38BD8074EB@enterprisedb.com

But it seems like it's really the right direction to go. Probably the
biggest concern is connection strings that read server files, but
dblink solved that by requiring password auth.

What are the reasonable steps to get there? Do you think anything is
doable for v15?

> There is a cart-before-the-horse problem here.

I don't think we need to hold up v2-0003. It seems like a step in the
right direction, though I haven't looked closely yet.

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-11-17 at 16:17 -0800, Mark Dilger wrote:
> Some of what I perceive as the screwiness of your argument I must
> admit is not your fault.  The properties of subscriptions are defined
> in ways that don't make sense to me.  It would be far more sensible
> if connection strings were objects in their own right, and you could
> grant USAGE on a connection string to a role,

We sort of have that with CREATE SERVER, in fact dblink can use a
server instead of a string. 

Regards,
    Jeff Davis






Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Nov 18, 2021 at 9:03 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> > On Nov 18, 2021, at 2:50 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >> I gave that a slight amount of thought during the design of this patch, but didn't think we could refuse to revoke
> >> superuser on such a basis, and didn't see what we should do with the subscription other than have it continue to be
> >> owned by the recently-non-superuser.  If you have a better idea, we can discuss it, but to some degree I think that is
> >> also orthogonal to the purpose of this patch.  The only sense in which this patch depends on that issue is that this
> >> patch proposes that non-superuser subscription owners are already an issue, and therefore that this patch isn't creating
> >> a new issue, but rather making more sane something that already can happen.
> >>
> >
> > Don't we want to close this gap irrespective of the other part of the
> > feature? I mean if we take out the part of your 0003 patch that checks
> > whether the current user has permission to perform a particular
> > operation on the target table then the gap related to the owner losing
> > superuser privileges should be addressed.
>
> I don't think there is a gap.  The patch does the right thing, causing the subscription whose owner has had superuser
> revoked to itself no longer function with superuser privileges.  Whether that causes the subscription to fail depends on
> whether the previously-superuser now non-superuser owner now lacks sufficient privileges on the target relation(s).  I
> think removing that part of the patch would be a regression.
>

I think we are saying the same thing. I intend to say that your 0003*
patch closes the current gap in the code and we should consider
applying it irrespective of what we do with respect to changing the
... OWNER TO .. behavior. Is there a reason why 0003* patch (or
something on those lines) shouldn't be considered to be applied?

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Nov 18, 2021 at 9:15 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> The prior version of the patch only picked up the change if it happened to start a new worker, but could process
> multiple transactions without noticing the change.  Now, it is limited to finishing the current transaction.  Would you
> prefer that the worker noticed the change in ownership and aborted the transaction on the subscriber side?  Or should
> the ALTER SUBSCRIPTION..OWNER TO block?  I don't see much advantage to either of those options, but I also don't think I
> have any knock-down argument for my approach either.  What do you think?
>

How about allowing to change ownership only for disabled
subscriptions? Basically, users need to first disable the subscription
and then change its ownership. Now, disabling is an asynchronous
operation but we won't allow the ownership change command to proceed
unless the subscription is marked disabled and all the apply/sync
workers are not running. After the ownership is changed, users can
enable it. We already have 'slot_name' parameter's dependency on
whether the subscription is marked enabled or not.

This will add some steps in changing the ownership of a subscription
but I think it will be predictable.
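
[Editor's sketch, not part of the original message.] The proposed rule (ownership may change only when the subscription is marked disabled and no apply/sync workers remain) could look roughly like the following guard. This is a Python model of the proposal only; `Subscription`, `alter_owner`, `try_alter` and `OwnershipChangeRefused` are hypothetical names, and the error texts are invented for the example.

```python
from dataclasses import dataclass


class OwnershipChangeRefused(Exception):
    """Raised when the hypothetical guard rejects an ownership change."""


@dataclass
class Subscription:
    name: str
    owner: str
    enabled: bool = True
    running_workers: int = 0


def alter_owner(sub: Subscription, new_owner: str) -> None:
    # Both conditions must hold, because disabling is asynchronous:
    # the flag can be cleared while a worker is still running.
    if sub.enabled:
        raise OwnershipChangeRefused("disable the subscription first")
    if sub.running_workers > 0:
        raise OwnershipChangeRefused("apply/sync workers are still running")
    sub.owner = new_owner


def try_alter(sub: Subscription, new_owner: str) -> str:
    try:
        alter_owner(sub, new_owner)
        return "ok"
    except OwnershipChangeRefused as exc:
        return str(exc)


sub = Subscription("sub1", owner="old_owner", running_workers=1)
first = try_alter(sub, "new_owner")   # still enabled
sub.enabled = False                   # disable recorded, worker not yet gone
second = try_alter(sub, "new_owner")
sub.running_workers = 0               # worker has noticed and exited
third = try_alter(sub, "new_owner")
sub.enabled = True                    # re-enable under the new owner
print(first, second, third, sub.owner, sep=" | ")
```

The extra steps cost the user a disable/enable cycle, but no change is ever applied under an owner that is in the middle of being replaced.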

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Fri, Nov 19, 2021 at 12:00 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Wed, 2021-11-17 at 16:17 -0800, Mark Dilger wrote:
> >  You must choose a single role you want the subscription to run
> > under.
>
> I think that was the source of a lot of my confusion:
>
> Your patches are creating the notion of a "run as" user by assigning
> ownership-that-isn't-really-ownership. I got distracted wondering why
> you would really care if some user could enable/disable/refresh/rename
> a subscription, but the main point was to change who the subscription
> runs as.
>
> That's a more general idea: I could see how "run as" could apply to
> subscriptions as well as functions (right now it can only run as the
> owner or the invoker, not an arbitrary role). And I better understand
> your analogy to security definer now.
>

I was thinking why not separate the ownership and "run as" privileges
for the subscriptions? We can introduce a new syntax in addition to
the current syntax for "Owner" for this as:

Create Subscription sub RUNAS <role_name> ...;
Alter Subscription sub RUNAS <role_name>

Now, RUNAS role will be used to apply changes and perform the initial
table sync. The owner will be tied to changing some of the other
properties (enabling, disabling, etc.) of the subscription. Now, we
still need a superuser to create subscription and change properties
like CONNECTION for a subscription but we can solve that by having
roles specific to it as being indicated by Mark in some of his
previous emails.
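
[Editor's sketch, not part of the original message.] Separating the two concepts amounts to storing two roles per subscription and consulting a different one for each question. The Python model below only illustrates that split; the `run_as` field and the `may_manage` and `apply_role` helpers are hypothetical and do not correspond to any existing catalog column or function.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    name: str
    owner: str   # role allowed to manage the subscription (enable, disable, ...)
    run_as: str  # role whose privileges are used to apply changes and sync tables


def may_manage(sub: Subscription, role: str) -> bool:
    """Management operations are tied to ownership."""
    return role == sub.owner


def apply_role(sub: Subscription) -> str:
    """Workers apply changes under the RUNAS role, not the owner."""
    return sub.run_as


sub = Subscription("sub1", owner="admin_role", run_as="repl_role")
print(may_manage(sub, "admin_role"), may_manage(sub, "repl_role"), apply_role(sub))
```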

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 19, 2021, at 1:44 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I think we are saying the same thing. I intend to say that your 0003*
> patch closes the current gap in the code and we should consider
> applying it irrespective of what we do with respect to changing the
> ... OWNER TO .. behavior. Is there a reason why 0003* patch (or
> something on those lines) shouldn't be considered to be applied?

Jeff Davis and I had a long conversation off-list yesterday and reached the same conclusion.  I will be submitting a
version of 0003 which does not depend on the prior two patches.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 19, 2021, at 1:56 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> How about allowing to change ownership only for disabled
> subscriptions? Basically, users need to first disable the subscription
> and then change its ownership.

There are some open issues about non-superuser owners that Jeff would like to address before allowing transfers of
ownership to non-superusers.  Your proposal about requiring the subscription to be disabled seems reasonable to me, but
I'd like to see how it would interact with whatever Jeff proposes.  So I think I will change the patch as you suggest,
but consider it a WIP patch until then.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 19, 2021, at 2:23 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I was thinking why not separate the ownership and "run as" privileges
> for the subscriptions? We can introduce a new syntax in addition to
> the current syntax for "Owner" for this as:
>
> Create Subscription sub RUNAS <role_name> ...;
> Alter Subscription sub RUNAS <role_name>
>
> Now, RUNAS role will be used to apply changes and perform the initial
> table sync. The owner will be tied to changing some of the other
> properties (enabling, disabling, etc.) of the subscription. Now, we
> still need a superuser to create subscription and change properties
> like CONNECTION for a subscription but we can solve that by having
> roles specific to it as being indicated by Mark in some of his
> previous emails.

I feel disquieted about the "runas" idea.  I can't really say why yet.  Maybe it is ok, but it feels like a larger
design decision than just an implementation detail about how subscriptions work.  We should consider if we won't soon be
doing the same thing for other parts of the system.  If so, we should choose a solution that makes sense when applied
more broadly.

Security definer functions could benefit from splitting the owner from the runas role.

Event triggers might benefit from having a runas role.  Currently, event triggers are always owned by superusers, but
we've discussed allowing non-superuser owners.  That proposal still has outstanding issues to be resolved, so I can't be
sure if runas would be helpful, but it might.

Table triggers might benefit from having a runas role.  I don't have a proposal here, just an intuition that we should
think about this before designing how "runas" works.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 19, 2021, at 7:25 AM, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> Jeff Davis and I had a long conversation off-list yesterday and reached the same conclusion.  I will be submitting a
> version of 0003 which does not depend on the prior two patches.

Renamed as 0001 in version 3, as it is the only remaining patch.  For anyone who reviewed the older patch set, please
note that I made some changes to the src/test/subscription/t/026_nosuperuser.pl test case relative to the prior version.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Fri, 2021-11-19 at 16:45 -0800, Mark Dilger wrote:
> Renamed as 0001 in version 3, as it is the only remaining patch.  For
> anyone who reviewed the older patch set, please note that I made some
> changes to the src/test/subscription/t/026_nosuperuser.pl test case
> relative to the prior version.

We need to do permission checking for WITH CHECK OPTION and RLS. The
patch right now allows the subscription to write data that an RLS
policy forbids.

A couple other points:

 * We shouldn't refer to the behavior of previous versions in the docs
unless there's a compelling reason
 * Do we need to be smarter about partitioned tables, where an insert
can turn into an update?
 * Should we refactor to borrow logic from ExecInsert so that it's less
likely that we miss something in the future?

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Nov 25, 2021 at 6:00 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Fri, 2021-11-19 at 16:45 -0800, Mark Dilger wrote:
> > Renamed as 0001 in version 3, as it is the only remaining patch.  For
> > anyone who reviewed the older patch set, please note that I made some
> > changes to the src/test/subscription/t/026_nosuperuser.pl test case
> > relative to the prior version.
>
> We need to do permission checking for WITH CHECK OPTION and RLS. The
> patch right now allows the subscription to write data that an RLS
> policy forbids.
>

Won't it be better to just check if the current user is superuser
before applying each change as a matter of this first patch? Sorry, I
was under impression that first, we want to close the current gap
where we allow to proceed with replication if the user's superuser
privileges were revoked during replication. To allow non-superusers
owners, I thought it might be better to first try to detect the change
of ownership as soon as possible instead of at the transaction
boundary.

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Thu, 2021-11-25 at 09:51 +0530, Amit Kapila wrote:
> Won't it be better to just check if the current user is superuser
> before applying each change as a matter of this first patch? Sorry, I
> was under impression that first, we want to close the current gap
> where we allow to proceed with replication if the user's superuser
> privileges were revoked during replication.

That could be a first step, and I don't oppose it. But it seems like a
very small first step that would be made obsolete when v3-0001 is
ready, which I think will be very soon.

>  To allow non-superusers
> owners, I thought it might be better to first try to detect the
> change
> of ownership

In the case of revoked superuser privileges, there's no change in
ownership, just a change of privileges (SUPERUSER -> NOSUPERUSER). And
if we're detecting a change of privileges, why not just do it in
something closer to the right way, which is what v3-0001 is attempting
to do.

>  as soon as possible instead of at the transaction
> boundary.

I don't understand why it's important to detect a loss of privileges
faster than a transaction boundary. Can you elaborate?

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 24, 2021, at 4:30 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> We need to do permission checking for WITH CHECK OPTION and RLS. The
> patch right now allows the subscription to write data that an RLS
> policy forbids.

Thanks for reviewing and for this observation!  I can verify that RLS is not
being honored on the subscriber side.  I agree this is a problem for
subscriptions owned by non-superusers.

The implementation of the table sync worker uses COPY FROM, which makes this
problem hard to fix, because COPY FROM does not support row level security.
We could do some work to honor the RLS policies during the apply workers'
INSERT statements, but then some data would circumvent RLS during table sync
and other data would honor RLS during worker apply, which would make the
implementation not only wrong but inconsistently so.

I think a more sensible approach for v15 is to raise an ERROR if a
non-superuser owned subscription is trying to replicate into a table which
has RLS enabled.  We might try to be more clever and check whether the RLS
policies could possibly reject the operation (by comparing the TO and FOR
clauses of the policies against the role and operation type) but that seems
like a partial re-implementation of RLS.  It would be simpler and more likely
correct if we just unconditionally reject replicating into tables which have
RLS enabled.

What do you think?

> A couple other points:
>
> * We shouldn't refer to the behavior of previous versions in the docs
> unless there's a compelling reason

Fair enough.

> * Do we need to be smarter about partitioned tables, where an insert
> can turn into an update?

Do you mean an INSERT statement with an ON CONFLICT DO UPDATE clause that is
running against a partitioned table?  If so, I don't think we need to handle
that on the subscriber side under the current logical replication design.  I
would expect the plain INSERT or UPDATE that ultimately executes on the
publisher to be what gets replicated to the subscriber, and not the original
INSERT .. ON CONFLICT DO UPDATE statement.

> * Should we refactor to borrow logic from ExecInsert so that it's less
> likely that we miss something in the future?

Hooking into the executor at a higher level, possibly ExecInsert or
ExecModifyTable would do a lot more than what logical replication currently
does.  If we also always used INSERT/UPDATE/DELETE statements and never COPY
FROM statements, we might solve several problems at once, including honoring
RLS policies and honoring rules defined for the target table on the
subscriber side.

Doing this would clearly be a major design change, and possibly one we do not want.  Can we consider this out of scope?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Fri, Nov 26, 2021 at 1:36 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> >  as soon as possible instead of at the transaction
> > boundary.
>
> I don't understand why it's important to detect a loss of privileges
> faster than a transaction boundary. Can you elaborate?
>

The first reason is that way it would be consistent with what we can
see while doing the operations from the backend. For example, if we
revoke privileges from the user during the transaction, the results
will be reflected.
postgres=> Begin;
BEGIN
postgres=*> insert into t1 values(1);
INSERT 0 1
postgres=*> insert into t1 values(2);
ERROR:  permission denied for table t1

In this case, after the first insert, I have revoked the privileges of
the user from table t1 and the same is reflected in the very next
operation. Another reason is to make behavior predictable as users can
always expect when exactly the privilege change will be reflected and
it won't depend on the number of changes in the transaction.
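
The difference between the two checking strategies can be sketched with a small model. This is only an illustration: apply_transaction and has_privilege are hypothetical names, not anything in the apply worker, and the model assumes a privilege check is a simple yes/no lookup made just before a change is applied.

```python
# Hypothetical model of an apply loop; none of these names exist in PostgreSQL.
def apply_transaction(changes, has_privilege, check_every_change):
    """Apply changes in order and return (applied, index_of_failed_check).

    has_privilege(i) reports whether the owner still holds the needed
    privilege just before change number i would be applied.
    """
    applied = []
    for i, change in enumerate(changes):
        # Always check at the start of the transaction; check again before
        # each later change only when check_every_change is set.
        if i == 0 or check_every_change:
            if not has_privilege(i):
                return applied, i
        applied.append(change)
    return applied, None


# The privilege is revoked after the first change has been applied.
revoked_after_first = lambda i: i < 1

print(apply_transaction([1, 2, 3], revoked_after_first, True))   # ([1], 1)
print(apply_transaction([1, 2, 3], revoked_after_first, False))  # ([1, 2, 3], None)
```

With per-change checks the revoke is seen at the very next change, as in the psql session above; with a single check at the start, the rest of the transaction is applied under the old privileges.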

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Sat, Nov 27, 2021 at 11:37 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
>
>
> > On Nov 24, 2021, at 4:30 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> >
> > We need to do permission checking for WITH CHECK OPTION and RLS. The
> > patch right now allows the subscription to write data that an RLS
> > policy forbids.
>
> Thanks for reviewing and for this observation!  I can verify that RLS is
> not being honored on the subscriber side.  I agree this is a problem for
> subscriptions owned by non-superusers.
>
...
>
> > A couple other points:
> >
>
> > * Do we need to be smarter about partitioned tables, where an insert
> > can turn into an update?
>
> Do you mean an INSERT statement with an ON CONFLICT DO UPDATE clause that
> is running against a partitioned table?  If so, I don't think we need to
> handle that on the subscriber side under the current logical replication
> design.  I would expect the plain INSERT or UPDATE that ultimately executes
> on the publisher to be what gets replicated to the subscriber, and not the
> original INSERT .. ON CONFLICT DO UPDATE statement.
>

Yeah, that is correct but I think the update case is more relevant
here. In ExecUpdate(), we convert Update to DELETE+INSERT when the
partition constraint is failed whereas, on the subscriber-side, it
will simply fail in this case. It is not clear to me how that is
directly related to this patch but surely it will be a good
improvement on its own and might help if that requires us to change
some infrastructure here like hooking into executor at a higher level.

> > * Should we refactor to borrow logic from ExecInsert so that it's less
> > likely that we miss something in the future?
>
> Hooking into the executor at a higher level, possibly ExecInsert or
> ExecModifyTable would do a lot more than what logical replication currently
> does.  If we also always used INSERT/UPDATE/DELETE statements and never
> COPY FROM statements, we might solve several problems at once, including
> honoring RLS policies and honoring rules defined for the target table on
> the subscriber side.
>
> Doing this would clearly be a major design change, and possibly one we do
> not want.  Can we consider this out of scope?
>

I agree that if we want to do all of this then that would require a
lot of changes. However, giving an error for RLS-enabled tables might
also be too restrictive. The few alternatives could be that (a) we
allow subscription owners to be either have "bypassrls" attribute or
they could be superusers. (b) don't allow initial table_sync for rls
enabled tables. (c) evaluate/analyze what is required to allow Copy
From to start respecting RLS policies. (d) reject replicating any
changes to tables that have RLS enabled.

I see that you are favoring (d) which clearly has merits like lesser
code/design change but not sure if that is the best way forward or we
can do something better than that either by following one of (a), (b),
(c), or something less restrictive than (d).

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 28, 2021, at 9:56 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> In ExecUpdate(), we convert Update to DELETE+INSERT when the
> partition constraint is failed whereas, on the subscriber-side, it
> will simply fail in this case. It is not clear to me how that is
> directly related to this patch but surely it will be a good
> improvement on its own and might help if that requires us to change
> some infrastructure here like hooking into executor at a higher level.

I would rather get a fix for non-superuser subscription owners committed than
expand the scope of work and have this patch linger until the v16 development
cycle.  This particular DELETE+INSERT problem sounds important but unrelated
and out of scope.

> I agree that if we want to do all of this then that would require a
> lot of changes. However, giving an error for RLS-enabled tables might
> also be too restrictive. The few alternatives could be that (a) we
> allow subscription owners to be either have "bypassrls" attribute or
> they could be superusers. (b) don't allow initial table_sync for rls
> enabled tables. (c) evaluate/analyze what is required to allow Copy
> From to start respecting RLS policies. (d) reject replicating any
> changes to tables that have RLS enabled.
>
> I see that you are favoring (d) which clearly has merits like lesser
> code/design change but not sure if that is the best way forward or we
> can do something better than that either by following one of (a), (b),
> (c), or something less restrictive than (d).

I was favoring option (d) only when RLS policies exist for one or more of the target relations.

Skipping the table_sync step while replicating tables that have RLS policies
for subscriptions that are owned by users who lack bypassrls is interesting.
If we make that work, it will be a more complete solution than option (d).

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2021-11-29 at 08:26 -0800, Mark Dilger wrote:
> > On Nov 28, 2021, at 9:56 PM, Amit Kapila <amit.kapila16@gmail.com>
> > wrote:
> > 
> > In ExecUpdate(), we convert Update to DELETE+INSERT when the
> > partition constraint is failed whereas, on the subscriber-side, it
> > will simply fail in this case.

Thank you, yes, that's the more important case.

> This particular DELETE+INSERT problem sounds important but unrelated
> and out of scope.

+1

> > I agree that if we want to do all of this then that would require a
> > lot of changes. However, giving an error for RLS-enabled tables
> > might
> > also be too restrictive. The few alternatives could be that (a) we
> > allow subscription owners to be either have "bypassrls" attribute
> > or
> > they could be superusers. (b) don't allow initial table_sync for
> > rls
> > enabled tables. (c) evaluate/analyze what is required to allow Copy
> > From to start respecting RLS policies. (d) reject replicating any
> > changes to tables that have RLS enabled.

Maybe a combination?

Allow subscriptions with copy_data=true iff the subscription owner is
bypassrls or superuser. And then enforce RLS+WCO during
insert/update/delete.

I don't think it's a big change (correct me if I'm wrong), and it
allows good functionality now, and room to improve in the future if we
want to bring in more of ExecInsert into logical replication.
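
A minimal sketch of that combined rule, as I read it. Owner, may_use_copy_data and may_apply_change are hypothetical names for illustration, not functions in the patch, and policy_allows stands in for whatever the real RLS and WITH CHECK OPTION evaluation would return for the row.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    """Hypothetical stand-in for the subscription owner's role attributes."""
    superuser: bool = False
    bypassrls: bool = False


def exempt_from_rls(owner: Owner) -> bool:
    # Superusers and roles with bypassrls are not subject to RLS policies.
    return owner.superuser or owner.bypassrls


def may_use_copy_data(owner: Owner) -> bool:
    # The initial table sync uses COPY FROM, which does not apply RLS, so
    # under this proposal only RLS-exempt owners may use copy_data=true.
    return exempt_from_rls(owner)


def may_apply_change(owner: Owner, table_has_rls: bool, policy_allows: bool) -> bool:
    # For replicated insert/update/delete, an owner who is not exempt is
    # limited to rows that the table's policies would accept.
    if exempt_from_rls(owner) or not table_has_rls:
        return True
    return policy_allows


plain = Owner()
print(may_use_copy_data(plain))                  # False
print(may_use_copy_data(Owner(bypassrls=True)))  # True
print(may_apply_change(plain, True, False))      # False
```

The sketch takes the copy_data condition literally; whether it should apply only to subscriptions that include RLS-enabled tables is left open here.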

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2021-11-29 at 09:43 +0530, Amit Kapila wrote:
> The first reason is that way it would be consistent with what we can
> see while doing the operations from the backend.

Logical replication is not interactive, so it doesn't seem quite the
same.

If you have a long running INSERT INTO SELECT or COPY FROM, the
permission checks just happen at the beginning. As a user, it wouldn't
surprise me if logical replication was similar.

> operation. Another reason is to make behavior predictable as users
> can
> always expect when exactly the privilege change will be reflected and
> it won't depend on the number of changes in the transaction.

This patch does detect ownership changes more quickly (at the
transaction boundary) than the current code (only when it reloads for
some other reason). Transaction boundary seems like a reasonable time
to detect the change to me.

Detecting faster might be nice, but I don't have a strong opinion about
it and I don't see why it necessarily needs to happen before this patch
goes in.

Also, do you think the cost of doing maybe_reread_subscription() per-
tuple instead of per-transaction would be detectable? If we lock
ourselves into semantics that detect changes quickly, it will be harder
to optimize the per-tuple path later.

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Mon, Nov 29, 2021 at 11:52 PM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Mon, 2021-11-29 at 08:26 -0800, Mark Dilger wrote:
>
> > > I agree that if we want to do all of this then that would require a
> > > lot of changes. However, giving an error for RLS-enabled tables
> > > might
> > > also be too restrictive. The few alternatives could be that (a) we
> > > allow subscription owners to be either have "bypassrls" attribute
> > > or
> > > they could be superusers. (b) don't allow initial table_sync for
> > > rls
> > > enabled tables. (c) evaluate/analyze what is required to allow Copy
> > > From to start respecting RLS policies. (d) reject replicating any
> > > changes to tables that have RLS enabled.
>
> Maybe a combination?
>
> Allow subscriptions with copy_data=true iff the subscription owner is
> bypassrls or superuser. And then enforce RLS+WCO during
> insert/update/delete.
>

Yeah, that sounds reasonable to me.

> I don't think it's a big change (correct me if I'm wrong),
>

Yeah, I also don't think it should be a big change.

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Tue, Nov 30, 2021 at 12:56 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Mon, 2021-11-29 at 09:43 +0530, Amit Kapila wrote:
> > The first reason is that way it would be consistent with what we can
> > see while doing the operations from the backend.
>
> Logical replication is not interactive, so it doesn't seem quite the
> same.
>
> If you have a long running INSERT INTO SELECT or COPY FROM, the
> permission checks just happen at the beginning. As a user, it wouldn't
> surprise me if logical replication was similar.
>
> > operation. Another reason is to make behavior predictable as users
> > can
> > always expect when exactly the privilege change will be reflected and
> > it won't depend on the number of changes in the transaction.
>
> This patch does detect ownership changes more quickly (at the
> transaction boundary) than the current code (only when it reloads for
> some other reason). Transaction boundary seems like a reasonable time
> to detect the change to me.
>
> Detecting faster might be nice, but I don't have a strong opinion about
> it and I don't see why it necessarily needs to happen before this patch
> goes in.
>

I think it would be better to do it before we allow subscription
owners to be non-superusers.

> Also, do you think the cost of doing maybe_reread_subscription() per-
> tuple instead of per-transaction would be detectable?
>

Yeah, it is possible that is why I suggested in one of the emails
above to allow changing the owners only for disabled subscriptions.

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Tue, 2021-11-30 at 17:25 +0530, Amit Kapila wrote:
> I think it would be better to do it before we allow subscription
> owners to be non-superusers.

There are a couple other things to consider before allowing non-
superusers to create subscriptions anyway. For instance, a non-
superuser shouldn't be able to use a connection string that reads the
certificate file from the server unless they also have
pg_read_server_files privs.

> Yeah, it is possible that is why I suggested in one of the emails
> above to allow changing the owners only for disabled subscriptions.

The current patch detects the following cases at the transaction
boundary:

 * ALTER SUBSCRIPTION ... OWNER TO ...
 * ALTER ROLE ... NOSUPERUSER
 * privileges revoked one way or another (aside from the RLS/WCO
problems, which will be fixed)

If we want to detect at row boundaries we need to capture all of those
cases too, or else we're being inconsistent. The latter two cannot be
tied to whether the subscription is disabled or not, so I don't think
that's a complete solution.

How about (as a separate patch) we just do maybe_reread_subscription()
every K operations within a transaction? That would speed up
permissions errors if a revoke happens.
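
A rough model of the "every K operations" idea, to show how K bounds the detection delay. The function and parameter names are made up for illustration; still_privileged stands in for whatever maybe_reread_subscription() plus the permission checks would conclude at that point.

```python
# Hypothetical model; apply_with_periodic_reread is not a PostgreSQL function.
def apply_with_periodic_reread(num_changes, k, still_privileged):
    """Return how many changes are applied before a lost privilege is noticed.

    The check runs before change 0 and then before every k-th change.
    still_privileged(i) reports the owner's state just before change i.
    """
    applied = 0
    for i in range(num_changes):
        if i % k == 0 and not still_privileged(i):
            return applied
        applied += 1
    return applied


# The privilege is lost just before change number 5 of a 10-change transaction.
lost_at_5 = lambda i: i < 5

print(apply_with_periodic_reread(10, 1, lost_at_5))    # 5  (per-change check)
print(apply_with_periodic_reread(10, 4, lost_at_5))    # 8  (noticed at change 8)
print(apply_with_periodic_reread(10, 100, lost_at_5))  # 10 (only checked at start)
```

In this model at most K-1 further changes are applied after the revoke, while the number of rereads drops by a factor of K compared with checking per tuple.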

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Wed, Dec 1, 2021 at 2:12 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Tue, 2021-11-30 at 17:25 +0530, Amit Kapila wrote:
> > I think it would be better to do it before we allow subscription
> > owners to be non-superusers.
>
> There are a couple other things to consider before allowing non-
> superusers to create subscriptions anyway. For instance, a non-
> superuser shouldn't be able to use a connection string that reads the
> certificate file from the server unless they also have
> pg_read_server_files privs.
>

Isn't allowing to create subscriptions via non-superusers and allowing
to change the owner two different things? I am under the impression
that the latter one is more towards allowing the workers to apply
changes with a non-superuser role.

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Dec 1, 2021, at 5:36 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 1, 2021 at 2:12 AM Jeff Davis <pgsql@j-davis.com> wrote:
>>
>> On Tue, 2021-11-30 at 17:25 +0530, Amit Kapila wrote:
>>> I think it would be better to do it before we allow subscription
>>> owners to be non-superusers.
>>
>> There are a couple other things to consider before allowing non-
>> superusers to create subscriptions anyway. For instance, a non-
>> superuser shouldn't be able to use a connection string that reads the
>> certificate file from the server unless they also have
>> pg_read_server_files privs.
>>
>
> Isn't allowing to create subscriptions via non-superusers and allowing
> to change the owner two different things? I am under the impression
> that the latter one is more towards allowing the workers to apply
> changes with a non-superuser role.

The short-term goal is to have logical replication workers respect the
privileges of the role which owns the subscription.

The long-term work probably includes creating a predefined role with
permission to create subscriptions, and the ability to transfer those
subscriptions to roles who might be neither superuser nor members of any
particular predefined role; the idea being that logical replication
subscriptions can be established without any superuser involvement, and may
thereafter run without any special privilege.

The more recent patches on this thread are not as ambitious as the earlier
patch-sets.  We are no longer trying to support transferring subscriptions to
non-superusers.

Right now, on HEAD, if a subscription owner has superuser revoked, the
subscription can continue to operate as superuser in so far as its
replication actions are concerned.  That seems like a pretty big security
hole.

This patch mostly plugs that hole by adding permissions checks, so that a
subscription owned by a role who has privileges revoked cannot (for the most
part) continue to act under the old privileges.

There are two problematic edge cases that can occur after transfer of
ownership.  Remember, the new owner is required to be superuser for the
transfer of ownership to occur.

1) A subscription is transferred to a new owner, and the new owner then has privilege revoked.

2) A subscription is transferred to a new owner, and then the old owner has privileges increased.

In both cases, a currently running logical replication worker may finish a
transaction in progress acting with the current privileges of the old owner.
The clearest solution is, as you suggest, to refuse transfer of ownership of
subscriptions that are enabled.

Doing so will create a failure case for REASSIGN OWNED BY.  Will that be ok?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Dec 2, 2021 at 12:51 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
>
> > On Dec 1, 2021, at 5:36 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 1, 2021 at 2:12 AM Jeff Davis <pgsql@j-davis.com> wrote:
> >>
> >> On Tue, 2021-11-30 at 17:25 +0530, Amit Kapila wrote:
> >>> I think it would be better to do it before we allow subscription
> >>> owners to be non-superusers.
> >>
> >> There are a couple other things to consider before allowing non-
> >> superusers to create subscriptions anyway. For instance, a non-
> >> superuser shouldn't be able to use a connection string that reads the
> >> certificate file from the server unless they also have
> >> pg_read_server_files privs.
> >>
> >
> > Isn't allowing to create subscriptions via non-superusers and allowing
> > to change the owner two different things? I am under the impression
> > that the latter one is more towards allowing the workers to apply
> > changes with a non-superuser role.
>
> The short-term goal is to have logical replication workers respect the
> privileges of the role which owns the subscription.
>
> The long-term work probably includes creating a predefined role with
> permission to create subscriptions, and the ability to transfer those
> subscriptions to roles who might be neither superuser nor members of any
> particular predefined role; the idea being that logical replication
> subscriptions can be established without any superuser involvement, and may
> thereafter run without any special privilege.
>
> The more recent patches on this thread are not as ambitious as the earlier
> patch-sets.  We are no longer trying to support transferring subscriptions
> to non-superusers.
>
> Right now, on HEAD, if a subscription owner has superuser revoked, the
> subscription can continue to operate as superuser in so far as its
> replication actions are concerned.  That seems like a pretty big security
> hole.
>
> This patch mostly plugs that hole by adding permissions checks, so that a
> subscription owned by a role who has privileges revoked cannot (for the
> most part) continue to act under the old privileges.
>

If we want to maintain the property that subscriptions can only be
owned by superuser for your first version then isn't a simple check
like (!superuser()) for each of the operations sufficient?

> There are two problematic edge cases that can occur after transfer of
> ownership.  Remember, the new owner is required to be superuser for the
> transfer of ownership to occur.
>
> 1) A subscription is transferred to a new owner, and the new owner then has privilege revoked.
>
> 2) A subscription is transferred to a new owner, and then the old owner has privileges increased.
>

In (2), I am not clear what do you mean by "the old owner has
privileges increased"? If the owners can only be superusers then what
does it mean to increase the privileges.

> In both cases, a currently running logical replication worker may finish a
> transaction in progress acting with the current privileges of the old
> owner.  The clearest solution is, as you suggest, to refuse transfer of
> ownership of subscriptions that are enabled.
>
> Doing so will create a failure case for REASSIGN OWNED BY.  Will that be ok?
>

I think so. Do we see any problem with that? I think we have some
failure cases currently as well like "All Tables Publication" can only
be owned by superusers whereas ownership for others can be to
non-superusers and similarly we can't change ownership for pinned
objects. I think the case being discussed is not exactly the same but
I am not able to see a problem with it.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Fri, Dec 3, 2021 at 10:37 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> > On Dec 2, 2021, at 1:29 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > If we want to maintain the property that subscriptions can only be
> > owned by superuser for your first version then isn't a simple check
> > like ((!superuser()) for each of the operations is sufficient?
>
> As things stand today, nothing prevents a superuser subscription owner from
> having superuser revoked.  The patch does nothing to change this.
>

I understand that but won't that get verified when we look up the
information in pg_authid as part of superuser() check?

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Dec 6, 2021, at 2:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>>> If we want to maintain the property that subscriptions can only be
>>> owned by superuser

We don't want to maintain such a property, or at least, that's not what I
want.  I don't think that's what Jeff wants, either.

To clarify, I'm not entirely sure how to interpret the verb "maintain" in
your question, since before the patch the property does not exist, and after
the patch, it continues to not exist.  We could *add* such a property, of
course, though this patch does not attempt any such thing.

> I understand that but won't that get verified when we look up the
> information in pg_authid as part of superuser() check?

If we added a superuser() check, then yes, but that would take things in a direction I do not want to go.

As I perceive the roadmap:

1) Fix the current bug wherein subscription changes are applied with
superuser force after the subscription owner has superuser privileges
revoked.
2) Allow the transfer of subscriptions to non-superuser owners.
3) Allow the creation of subscriptions by non-superusers who are members of
some as yet to be created predefined role, say "pg_create_subscriptions"

I may be wrong, but it sounds like you interpret the intent of this patch as
enforcing superuserness.  That's not so.  This patch intends to correctly
handle the situation where a subscription is owned by a non-superuser (task
1, above) without going so far as creating new paths by which that situation
could arise (tasks 2 and 3, above).

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Ronan Dunklau
Date:
Le lundi 6 décembre 2021, 16:56:56 CET Mark Dilger a écrit :
> > On Dec 6, 2021, at 2:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>> If we want to maintain the property that subscriptions can only be
> >>> owned by superuser
>
> We don't want to maintain such a property, or at least, that's not what I
> want.  I don't think that's what Jeff wants, either.

That's not what I want either: the ability to run and refresh subscriptions as
a non superuser is a desirable feature.

The REFRESH part was possible before PG 14, when it was allowed to run REFRESH
in a function, which could be made to run as security definer.


> As I perceive the roadmap:
>
> 1) Fix the current bug wherein subscription changes are applied with
> superuser force after the subscription owner has superuser privileges
> revoked.
> 2) Allow the transfer of subscriptions to non-superuser owners.
> 3) Allow the creation of subscriptions by non-superusers who are members of
> some as yet to be created predefined role, say "pg_create_subscriptions"

This roadmap seems sensible.

--
Ronan Dunklau





Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Mon, Dec 6, 2021 at 9:26 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> > On Dec 6, 2021, at 2:19 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >>> If we want to maintain the property that subscriptions can only be
> >>> owned by superuser
>
> We don't want to maintain such a property, or at least, that's not what I
> want.  I don't think that's what Jeff wants, either.
>
> To clarify, I'm not entirely sure how to interpret the verb "maintain" in
> your question, since before the patch the property does not exist, and
> after the patch, it continues to not exist.  We could *add* such a
> property, of course, though this patch does not attempt any such thing.
>

Okay, let me try to explain again. Following is the text from docs
[1]: " (a) To create a subscription, the user must be a superuser. (b)
The subscription apply process will run in the local database with the
privileges of a superuser. (c) Privileges are only checked once at the
start of a replication connection. They are not re-checked as each
change record is read from the publisher, nor are they re-checked for
each change when applied."

My understanding is that we want to improve what is written as (c)
which I think is the same as what you mentioned later as "Fix the
current bug wherein subscription changes are applied with superuser
force after the subscription owner has superuser privileges revoked.".
Am I correct till here? If so, I think what I am suggesting should fix
this with the assumption that we still want to follow (b) at least for
the first patch. One possibility is that our understanding of the
first problem is the same but you want to allow apply worker running
even when superuser privileges are revoked provided the user with
which it is running has appropriate privileges on the objects being
accessed by apply worker.

We will talk about other points of the roadmap you mentioned once our
understanding for the first one matches.

[1] - https://www.postgresql.org/docs/devel/logical-replication-security.html

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Dec 7, 2021, at 2:29 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Okay, let me try to explain again. Following is the text from docs
> [1]: " (a) To create a subscription, the user must be a superuser. (b)
> The subscription apply process will run in the local database with the
> privileges of a superuser. (c) Privileges are only checked once at the
> start of a replication connection. They are not re-checked as each
> change record is read from the publisher, nor are they re-checked for
> each change when applied.
>
> My understanding is that we want to improve what is written as (c)
> which I think is the same as what you mentioned later as "Fix the
> current bug wherein subscription changes are applied with superuser
> force after the subscription owner has superuser privileges revoked.".
> Am I correct till here? If so, I think what I am suggesting should fix
> this with the assumption that we still want to follow (b) at least for
> the first patch.

Ok, that's a point of disagreement.  I was trying to fix both (b) and (c) in the first patch.

> One possibility is that our understanding of the
> first problem is the same but you want to allow apply worker running
> even when superuser privileges are revoked provided the user with
> which it is running has appropriate privileges on the objects being
> accessed by apply worker.

Correct, that's what I'm trying to make safe.

> We will talk about other points of the roadmap you mentioned once our
> understanding for the first one matches.

I am happy to have an off-list phone call with you, if you like.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Tue, Dec 7, 2021 at 8:25 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> > On Dec 7, 2021, at 2:29 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Okay, let me try to explain again. Following is the text from docs
> > [1]: " (a) To create a subscription, the user must be a superuser. (b)
> > The subscription apply process will run in the local database with the
> > privileges of a superuser. (c) Privileges are only checked once at the
> > start of a replication connection. They are not re-checked as each
> > change record is read from the publisher, nor are they re-checked for
> > each change when applied.
> >
> > My understanding is that we want to improve what is written as (c)
> > which I think is the same as what you mentioned later as "Fix the
> > current bug wherein subscription changes are applied with superuser
> > force after the subscription owner has superuser privileges revoked.".
> > Am I correct till here? If so, I think what I am suggesting should fix
> > this with the assumption that we still want to follow (b) at least for
> > the first patch.
>
> Ok, that's a point of disagreement.  I was trying to fix both (b) and (c) in the first patch.
>

But, I think as soon as we are trying to fix (b), we seem to be
allowing non-superusers to apply changes. If we want to do that then
we should be even allowed to change the owners to non-superusers. I
was thinking of the below order:
1. First fix (c) from the above description "Privileges are only
checked once at the start of a replication connection."
2A. Allow the transfer of subscriptions to non-superuser owners. This
will be allowed only on disabled subscriptions to make this action
predictable.
2B. The apply worker should be able to apply the changes provided the
user has appropriate privileges on the objects being accessed by apply
worker.
3) Allow the creation of subscriptions by non-superusers who are
members of some as yet to be created predefined role, say
"pg_create_subscriptions"

We all seem to agree that (3) can be done later as an independent
project. 2A, 2B can be developed as separate patches but they need to
be considered together for commit. After 2A, 2B, the first one (1)
won't be required so, in fact, we can just ignore (1) but the only
benefit I see is that if we stuck with some design problem during the
development of  2A, 2B, we would have at least something better than
what we have now.

You seem to be indicating let's do 2B first as that will anyway be
used later after 2A and 1 won't be required if we do that. I see that
but I personally feel either we should follow 1, 2(A, B) or just do
2(A, B).

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Tue, Nov 30, 2021 at 6:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > This patch does detect ownership changes more quickly (at the
> > transaction boundary) than the current code (only when it reloads for
> > some other reason). Transaction boundary seems like a reasonable time
> > to detect the change to me.
> >
> > Detecting faster might be nice, but I don't have a strong opinion about
> > it and I don't see why it necessarily needs to happen before this patch
> > goes in.
>
> I think it would be better to do it before we allow subscription
> owners to be non-superusers.

I think it would be better not to ever do it at any time.

It seems like a really bad idea to me to change the run-as user in the
middle of a transaction. That seems prone to producing all sorts of
confusing behavior that's hard to understand, and hard to test. So
what are we to do if a change occurs mid-transaction? I think we can
either finish replicating the current transaction and then switch to
the new owner for the next transaction, or we could abort the current
attempt to replicate the transaction and retry the whole transaction
with the new run-as user. My guess is that most users would prefer the
former behavior to the latter.
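
A minimal sketch of the first option (finish the current transaction, then switch), written as a toy Python model rather than anything resembling the real apply worker. The names Subscription, ApplyWorker, and the on_change hook are hypothetical; the hook only simulates an ownership change arriving while a transaction is being replicated.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    owner: str  # owner as recorded in the catalog; may change at any time


@dataclass
class ApplyWorker:
    sub: Subscription
    log: list = field(default_factory=list)

    def apply_transaction(self, changes, on_change=None):
        # Read the owner exactly once, at the transaction boundary.
        run_as = self.sub.owner
        for i, change in enumerate(changes):
            if on_change is not None:
                on_change(i)  # simulates a concurrent ownership change
            # Every change in this transaction is applied as the same role.
            self.log.append((run_as, change))


sub = Subscription(owner="alice")
worker = ApplyWorker(sub)


def reassign(i):
    # Ownership moves to bob while the first transaction is in flight.
    if i == 1:
        sub.owner = "bob"


worker.apply_transaction(["ins 1", "ins 2", "ins 3"], on_change=reassign)
worker.apply_transaction(["ins 4"])
```

In this model the whole first transaction is logged under alice even though the owner changed partway through, and bob is picked up only for the second transaction.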

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Dec 8, 2021 at 11:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> But, I think as soon as we are trying to fix (b), we seem to be
> allowing non-superusers to apply changes. If we want to do that then
> we should be even allowed to change the owners to non-superusers. I
> was thinking of the below order:
> 1. First fix (c) from the above description "Privileges are only
> checked once at the start of a replication connection."
> 2A. Allow the transfer of subscriptions to non-superuser owners. This
> will be allowed only on disabled subscriptions to make this action
> predictable.
> 2B. The apply worker should be able to apply the changes provided the
> user has appropriate privileges on the objects being accessed by apply
> worker.
> 3) Allow the creation of subscriptions by non-superusers who are
> members of some as yet to be created predefined role, say
> "pg_create_subscriptions"
>
> We all seem to agree that (3) can be done later as an independent
> project. 2A, 2B can be developed as separate patches but they need to
> be considered together for commit. After 2A, 2B, the first one (1)
> won't be required so, in fact, we can just ignore (1) but the only
> benefit I see is that if we stuck with some design problem during the
> development of  2A, 2B, we would have at least something better than
> what we have now.
>
> You seem to be indicating let's do 2B first as that will anyway be
> used later after 2A and 1 won't be required if we do that. I see that
> but I personally feel either we should follow 1, 2(A, B) or just do
> 2(A, B).

1 and 2B seem to require changing the same code, or related code. 2A
seems to require a completely different set of changes. If I'm right
about that, it seems like a good reason for doing 1+2B first and
leaving 2A for a separate patch.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Dec 9, 2021, at 7:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Nov 30, 2021 at 6:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> This patch does detect ownership changes more quickly (at the
>>> transaction boundary) than the current code (only when it reloads for
>>> some other reason). Transaction boundary seems like a reasonable time
>>> to detect the change to me.
>>>
>>> Detecting faster might be nice, but I don't have a strong opinion about
>>> it and I don't see why it necessarily needs to happen before this patch
>>> goes in.
>>
>> I think it would be better to do it before we allow subscription
>> owners to be non-superusers.
>
> I think it would be better not to ever do it at any time.
>
> It seems like a really bad idea to me to change the run-as user in the
> middle of a transaction.

I agree.  We allow SET ROLE inside transactions, but faking one on the subscriber seems odd.  No such role change was performed on the publisher side, nor is there a principled reason for assuming the old run-as role has membership in the new run-as role, so we'd be pretending to do something that might otherwise be impossible.

There was some discussion off-list about having the apply worker take out a lock on its subscription, thereby blocking ownership changes mid-transaction.  I coded that and it seems to work fine, but I have a hard time seeing how the lock traffic would be worth expending.  Between (a) changing roles mid-transaction, and (b) locking the subscription for each transaction, I'd prefer to do neither, but (b) seems far better than (a).  Thoughts?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Dec 9, 2021, at 7:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> 1 and 2B seem to require changing the same code, or related code. 2A
> seems to require a completely different set of changes. If I'm right
> about that, it seems like a good reason for doing 1+2B first and
> leaving 2A for a separate patch.

There are unresolved problems with 2A and 3 which were discussed upthread.  I don't want to include fixes for them in this patch, as it greatly expands the scope of this patch, and is a logically separate effort.  We can come back to those problems after this first patch is committed.


Specifically, a non-superuser owner can perform ALTER SUBSCRIPTION and do things that are morally equivalent to creating a new subscription.  This is problematic where things like the connection string are concerned, because it means the non-superuser owner can connect out to entirely different servers, without any access control checks to make sure the owner should be able to connect to these servers.

This problem already exists, right now.  I'm not fixing it in this first patch, but I'm also not making it any worse.

The solution Jeff Davis proposed seems right to me.  We change subscriptions to use a foreign server rather than a freeform connection string.  When creating or altering a subscription, the role performing the action must have privileges on any foreign server they use.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Dec 9, 2021 at 10:52 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> > On Dec 9, 2021, at 7:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Tue, Nov 30, 2021 at 6:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>> This patch does detect ownership changes more quickly (at the
> >>> transaction boundary) than the current code (only when it reloads for
> >>> some other reason). Transaction boundary seems like a reasonable time
> >>> to detect the change to me.
> >>>
> >>> Detecting faster might be nice, but I don't have a strong opinion about
> >>> it and I don't see why it necessarily needs to happen before this patch
> >>> goes in.
> >>
> >> I think it would be better to do it before we allow subscription
> >> owners to be non-superusers.
> >
> > I think it would be better not to ever do it at any time.
> >
> > It seems like a really bad idea to me to change the run-as user in the
> > middle of a transaction.
>
> I agree.  We allow SET ROLE inside transactions, but faking one on the subscriber seems odd.  No such role change was performed on the publisher side, nor is there a principled reason for assuming the old run-as role has membership in the new run-as role, so we'd be pretending to do something that might otherwise be impossible.
>
> There was some discussion off-list about having the apply worker take out a lock on its subscription, thereby blocking ownership changes mid-transaction.  I coded that and it seems to work fine, but I have a hard time seeing how the lock traffic would be worth expending.  Between (a) changing roles mid-transaction, and (b) locking the subscription for each transaction, I'd prefer to do neither, but (b) seems far better than (a).  Thoughts?
>

Yeah, to me also (b) sounds better than (a). However, a few points
that we might want to consider in that regard are as follows: 1.
locking the subscription for each transaction will add new blocking
areas considering we acquire AccessExclusiveLock to change any
property of subscription. But as Alter Subscription won't be that
frequent operation it might be acceptable. 2. It might lead to adding
some cost to small transactions but not sure if that will be
noticeable. 3. Tomorrow, if we want to make the apply-process parallel
(IIRC, we do have the patch for that somewhere in archives) especially
for large in-progress transactions then this locking will have
additional blocking w.r.t Altering Subscription. But again, this also
might be acceptable.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Dec 9, 2021 at 11:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yeah, to me also (b) sounds better than (a). However, a few points
> that we might want to consider in that regard are as follows: 1.
> locking the subscription for each transaction will add new blocking
> areas considering we acquire AccessExclusiveLock to change any
> property of subscription. But as Alter Subscription won't be that
> frequent operation it might be acceptable.

The problem isn't the cost of the locks taken by ALTER SUBSCRIPTION.
It's the cost of locking and unlocking the relation for every
transaction we apply. Suppose it's a pgbench-type workload with a
single UPDATE per transaction. You've just limited the maximum
possible apply speed to about, I think, 30,000 transactions per second
no matter how many parallel workers you use, because that's how fast
the lock manager is (or was, unless newer hardware or newer PG
versions have changed things in a way I don't know about). That seems
like a poor idea. There's nothing wrong with noticing changes at the
next transaction boundary, as long as we document it. So why would we
incur a possibly-significant performance cost to provide a stricter
guarantee?

I bet users wouldn't even like this behavior. It would mean that if
you are replicating a long-running transaction, an ALTER SUBSCRIPTION
command might block for a long time until replication of that
transaction completes. I have a hard time understanding why anyone
would consider that an improvement.
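
The throughput argument can be made concrete with a back-of-the-envelope model, assuming every applied transaction has to pass through one serialized lock acquisition. The 30,000 figure is the rough estimate quoted above; the per-worker rate of 10,000 transactions per second is a made-up number used only for illustration, and max_apply_tps is a hypothetical helper.

```python
def max_apply_tps(workers: int, per_worker_tps: float, lock_limited_tps: float) -> float:
    """Upper bound on apply throughput when each transaction takes one shared lock.

    Workers scale linearly until their combined rate reaches the rate at
    which the lock manager can serve them; past that point the lock is the
    bottleneck and adding workers does not help.
    """
    return min(workers * per_worker_tps, lock_limited_tps)


LOCK_LIMITED_TPS = 30_000  # rough estimate from the message above, not a measurement
PER_WORKER_TPS = 10_000    # hypothetical per-worker apply rate

for workers in (1, 4, 16):
    print(workers, max_apply_tps(workers, PER_WORKER_TPS, LOCK_LIMITED_TPS))
```

Under these assumptions four workers already hit the ceiling, and sixteen workers apply no faster than four.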

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andrew Dunstan
Date:
On 12/10/21 09:09, Robert Haas wrote:
> On Thu, Dec 9, 2021 at 11:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Yeah, to me also (b) sounds better than (a). However, a few points
>> that we might want to consider in that regard are as follows: 1.
>> locking the subscription for each transaction will add new blocking
>> areas considering we acquire AccessExclusiveLock to change any
>> property of subscription. But as Alter Subscription won't be that
>> frequent operation it might be acceptable.
> The problem isn't the cost of the locks taken by ALTER SUBSCRIPTION.
> It's the cost of locking and unlocking the relation for every
> transaction we apply. Suppose it's a pgbench-type workload with a
> single UPDATE per transaction. You've just limited the maximum
> possible apply speed to about, I think, 30,000 transactions per second
> no matter how many parallel workers you use, because that's how fast
> the lock manager is (or was, unless newer hardware or newer PG
> versions have changed things in a way I don't know about). That seems
> like a poor idea. There's nothing wrong with noticing changes at the
> next transaction boundary, as long as we document it. So why would we
> incur a possibly-significant performance cost to provide a stricter
> guarantee?
>
> I bet users wouldn't even like this behavior. It would mean that if
> you are replicating a long-running transaction, an ALTER SUBSCRIPTION
> command might block for a long time until replication of that
> transaction completes. I have a hard time understanding why anyone
> would consider that an improvement.
>


+1


I think noticing changes at the transaction boundary is perfectly
acceptable.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Fri, Dec 10, 2021 at 7:39 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Dec 9, 2021 at 11:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Yeah, to me also (b) sounds better than (a). However, a few points
> > that we might want to consider in that regard are as follows: 1.
> > locking the subscription for each transaction will add new blocking
> > areas considering we acquire AccessExclusiveLock to change any
> > property of subscription. But as Alter Subscription won't be that
> > frequent operation it might be acceptable.
>
> The problem isn't the cost of the locks taken by ALTER SUBSCRIPTION.
> It's the cost of locking and unlocking the relation for every
> transaction we apply.
>

This point is not clear to me as we are already locking the relation
while applying changes. I think here additional cost is to lock a
particular subscription as well in addition to the relation on which
we are going to perform apply. I agree that has a cost and that is why
I mentioned it as one of the points above and then also the
concurrency effect as you also noted could make this idea moot.

> Suppose it's a pgbench-type workload with a
> single UPDATE per transaction. You've just limited the maximum
> possible apply speed to about, I think, 30,000 transactions per second
> no matter how many parallel workers you use, because that's how fast
> the lock manager is (or was, unless newer hardware or newer PG
> versions have changed things in a way I don't know about). That seems
> like a poor idea. There's nothing wrong with noticing changes at the
> next transaction boundary, as long as we document it.
>

If we want to just document this then I think we should also keep in
mind that these could be N transactions as well if say tomorrow we
have N parallel apply workers applying the N transactions in parallel.
I think it might also be possible that RLS policies won't be applied
for initial table sync whereas those will be applied for later changes
even though the ownership has changed before both operations and one
of those happens to miss it. If that is possible, then it might be
better to avoid the same as it could appear inconsistent as mentioned
by Mark [1] as well. Now, it might be possible to avoid this by
implementation or we can say that we don't care about this or just
document it. But it seems to me that if we have some way to detect the
change of ownership at each operation level then no such possibilities
would arise.

The other alternative we discussed was to allow a change of ownership
on disabled subscriptions that way the apply behavior will always be
predictable.

There is clearly a merit in noticing the change of ownership at
transaction boundary but just wanted to consider other possibilities.
It could be that detecting at transaction-boundary is the best we can
do but I think there is no harm in considering other possibilities.

> So why would we
> incur a possibly-significant performance cost to provide a stricter
> guarantee?
>
> I bet users wouldn't even like this behavior. It would mean that if
> you are replicating a long-running transaction, an ALTER SUBSCRIPTION
> command might block for a long time until replication of that
> transaction completes.
>

Agreed and if we decide to lock the subscription during the initial
table sync phase then that could also take a long time for which again
users might not be happy.

[1] - https://www.postgresql.org/message-id/FE7D7024-6723-4ACB-82AB-94F6A194BE0D%40enterprisedb.com

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Nov 24, 2021, at 4:30 PM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> We need to do permission checking for WITH CHECK OPTION and RLS. The
> patch right now allows the subscription to write data that an RLS
> policy forbids.

Version 4 of the patch, attached, no longer allows RLS to be circumvented, but does so in a coarse-grained fashion.  If the target table has row-level security policies which are enforced against the subscription owner, the replication draws an error, much as with a permissions failure.  This seems sufficient for now, as superusers, roles with bypassrls, and target table owners should be able to replicate as before.  We may want to revisit this later, perhaps if/when we address your ExecInsert question, below.
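
As a rough illustration of the coarse-grained rule described above, here is a toy Python model, not the patch's code. Role, Table, rls_enforced, and check_apply_allowed are hypothetical names. The force_rls flag is an addition not mentioned above; it models ALTER TABLE ... FORCE ROW LEVEL SECURITY, which applies policies to the table owner as well.

```python
from dataclasses import dataclass


@dataclass
class Role:
    name: str
    superuser: bool = False
    bypassrls: bool = False


@dataclass
class Table:
    owner: str
    rls_enabled: bool = False
    force_rls: bool = False  # assumption: models FORCE ROW LEVEL SECURITY


class ReplicationError(Exception):
    pass


def rls_enforced(role: Role, table: Table) -> bool:
    """Would row-level security policies apply to this role on this table?"""
    if not table.rls_enabled:
        return False
    if role.superuser or role.bypassrls:
        return False
    if role.name == table.owner and not table.force_rls:
        return False
    return True


def check_apply_allowed(sub_owner: Role, table: Table) -> None:
    # Coarse-grained: no per-row policy evaluation, just refuse outright.
    if rls_enforced(sub_owner, table):
        raise ReplicationError(
            f"{sub_owner.name} cannot replicate into a table with "
            "row-level security enforced against it"
        )
```

The point of the sketch is that the check never evaluates a policy against an individual row; it only decides whether policies would apply at all and errors out if so.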

>
> A couple other points:
>
> * We shouldn't refer to the behavior of previous versions in the docs
> unless there's a compelling reason

Fixed.

> * Do we need to be smarter about partitioned tables, where an insert
> can turn into an update?

Indeed, the logic of apply_handle_tuple_routing() required a bit of refactoring.  Fixed in v4.

> * Should we refactor to borrow logic from ExecInsert so that it's less
> likely that we miss something in the future?

Let's just punt on this for now.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Attachment

Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Dec 16, 2021 at 1:53 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
>
> > On Nov 24, 2021, at 4:30 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> >
> > We need to do permission checking for WITH CHECK OPTION and RLS. The
> > patch right now allows the subscription to write data that an RLS
> > policy forbids.
>
> Version 4 of the patch, attached.
>

For Update/Delete, we do read the table first via
FindReplTupleInLocalRel(), so is there a need to check ACL_SELECT
before that?

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Sat, 2022-01-08 at 12:27 +0530, Amit Kapila wrote:
> For Update/Delete, we do read the table first via
> FindReplTupleInLocalRel(), so is there a need to check ACL_SELECT
> before that?

If it's logically an update/delete, then I think ACL_UPDATE/DELETE is
the right one to check. Do you have a different opinion?

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2021-12-15 at 12:23 -0800, Mark Dilger wrote:
> > On Nov 24, 2021, at 4:30 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> > 
> > We need to do permission checking for WITH CHECK OPTION and RLS.
> > The
> > patch right now allows the subscription to write data that an RLS
> > policy forbids.
> 
> Version 4 of the patch, attached, no longer allows RLS to be
> circumvented, but does so in a coarse-grained fashion.

Committed.

I tried to do some performance testing to see if there was any impact
of the extra catalog + ACL checks. Logical replication seems slow
enough -- something like 3X slower than local inserts -- that it didn't
seem to make a difference.

To test it, I did the following:
  1. sent a SIGSTOP to the logical apply worker
  2. loaded more data in publisher
  3. made the subscriber a sync replica
  4. timed the following:
    a. sent a SIGCONT to the logical apply worker
    b. insert a single tuple on the publisher side
    c. wait for the insert to return, indicating that logical
       replication is done up to that point

Does anyone have a better way to measure logical replication
performance?

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Sat, Jan 8, 2022 at 1:01 PM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Sat, 2022-01-08 at 12:27 +0530, Amit Kapila wrote:
> > For Update/Delete, we do read the table first via
> > FindReplTupleInLocalRel(), so is there a need to check ACL_SELECT
> > before that?
>
> If it's logically an update/delete, then I think ACL_UPDATE/DELETE is
> the right one to check. Do you have a different opinion?
>

But shouldn't we do it the first time before accessing the table?

-- 
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Sat, 2022-01-08 at 15:35 +0530, Amit Kapila wrote:
> On Sat, Jan 8, 2022 at 1:01 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > 
> > On Sat, 2022-01-08 at 12:27 +0530, Amit Kapila wrote:
> > > For Update/Delete, we do read the table first via
> > > FindReplTupleInLocalRel(), so is there a need to check ACL_SELECT
> > > before that?
> > 
> > If it's logically an update/delete, then I think ACL_UPDATE/DELETE
> > is
> > the right one to check. Do you have a different opinion?
> > 
> 
> But shouldn't we do it the first time before accessing the table?

I'm not sure I follow the reasoning. Are you saying that, to logically
replay a simple DELETE, the subscription owner should have SELECT
privileges on the destination table?

Is there a way that a subscription owner could somehow exploit a DELETE
privilege to see the contents of a table on which they have no SELECT
privileges? Or is it purely an internal read, which is necessary for
any ordinary local DELETE/UPDATE anyway?

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> I'm not sure I follow the reasoning. Are you saying that, to logically
> replay a simple DELETE, the subscription owner should have SELECT
> privileges on the destination table?

We consider that DELETE WHERE <condition> requires SELECT privilege
on the column(s) read by the <condition>.  I suppose that the point
here is to enforce the same privilege checks that occur in normal
SQL operation, so yes.

> Is there a way that a subscription owner could somehow exploit a DELETE
> privilege to see the contents of a table on which they have no SELECT
> privileges?

BEGIN;
DELETE FROM tab WHERE col = 'foo';
-- note deletion count
ROLLBACK;

Now you have some information about whether "col" contains 'foo'.
Admittedly, it might be a pretty low-bandwidth way to extract data,
but we still regard it as a privilege issue.
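
The probe above can be reproduced end to end with a small script. This sketch uses SQLite through Python's sqlite3 module purely as a stand-in: SQLite has no privilege system, so it demonstrates only the row-count side channel, not the permission check itself. The table contents and the probe helper are made up for the example.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN and ROLLBACK below are issued explicitly by us.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE tab (col TEXT)")
conn.executemany("INSERT INTO tab VALUES (?)", [("foo",), ("bar",), ("foo",)])


def probe(value):
    """Return how many rows match value, without reading or changing anything."""
    conn.execute("BEGIN")
    deleted = conn.execute("DELETE FROM tab WHERE col = ?", (value,)).rowcount
    conn.execute("ROLLBACK")  # undo the delete; only the count escapes
    return deleted


hits = probe("foo")
misses = probe("baz")
remaining = conn.execute("SELECT count(*) FROM tab").fetchone()[0]
```

After both probes the table is unchanged, yet the caller has learned how many rows hold 'foo' and that none hold 'baz'.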

            regards, tom lane



Re: Non-superuser subscription owners

From
Tom Lane
Date:
... btw, I'd like to complain that this new test script consumes
a completely excessive amount of time.  On my fairly-new primary
workstation:

[12:48:00] t/027_nosuperuser.pl ............... ok    22146 ms ( 0.02 usr  0.00 sys +  1.12 cusr  0.95 csys =  2.09 CPU)

The previously-slowest script in the subscription suite is

[12:48:23] t/100_bugs.pl ...................... ok     7048 ms ( 0.00 usr  0.00 sys +  2.85 cusr  0.99 csys =  3.84 CPU)

and the majority of the scripts clock in at more like 2 to 4 seconds.
So I don't think I'm out of line in saying that this test is consuming
an order of magnitude more time than is justified.  I do not wish to
see this much time added to every check-world run till kingdom come
for this one feature/issue.

            regards, tom lane



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Sat, 2022-01-08 at 12:57 -0500, Tom Lane wrote:
> ... btw, I'd like to complain that this new test script consumes
> a completely excessive amount of time.

Should be fixed now; I brought the number of tests down from 100 to 14.
It's not running in 2 seconds on my machine, but it's in line with the
other tests.

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Sat, 2022-01-08 at 12:57 -0500, Tom Lane wrote:
>> ... btw, I'd like to complain that this new test script consumes
>> a completely excessive amount of time.

> Should be fixed now; I brought the number of tests down from 100 to 14.
> It's not running in 2 seconds on my machine, but it's in line with the
> other tests.

Thanks, I appreciate that.

            regards, tom lane



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Sat, Jan 8, 2022 at 2:38 AM Jeff Davis <pgsql@j-davis.com> wrote:
> Committed.

I was just noticing that what was committed here didn't actually fix
the problem implied by the subject line. That is, non-superuser still
can't own subscriptions. To put that another way, there's no way for
the superuser to delegate the setup and administration of logical
replication to a non-superuser. That's a bummer.

Reading the thread, I'm not quite sure why we seemingly did all the
preparatory work and then didn't actually fix the problem. It was
previously proposed that we introduce a new predefined role
pg_create_subscriptions and allow users who have the privileges of
that predefined role to create and alter subscriptions. There are a
few issues with that which, however, seem fairly solvable to me:

1. Jeff pointed out that if you supply a connection string that is
going to try to access local files, you'd better have
pg_read_server_files, or else we should not let you use that
connection string. I guess that's mostly a function of which
parameters you specify, e.g. passfile, sslcert, sslkey, though maybe
for host it depends on whether the value starts with a slash. We might
need to think a bit here to make sure we get the rules right but it
seems like a pretty solvable problem.

2. There was also quite a bit of discussion of what to do if a user
who was previously eligible to own a subscription ceases to be
eligible, in particular around a superuser who is made into a
non-superuser, but the same problem would apply if you had
pg_create_subscriptions or pg_read_server_files and then lost it. My
vote is to not worry about it too much. Specifically, I think we
should certainly check whether the user has permission to create a
subscription before letting them do so, but we could handle the case
where the user already owns a subscription and tries to modify it by
either allowing or denying the operation and I think either of those
would be fine. I even think we could do one of those in some cases and
the other in other cases and as long as there is some principle to the
thing, it's fine. I argue that it's not a normal configuration and
therefore it doesn't have to work in a particularly useful way. It
shouldn't break the world in some horrendous way, but that's about as
good as it needs to be. I'd argue for example that DROP SUBSCRIPTION
could just check whether you own the object, and that ALTER
SUBSCRIPTION could check whether you own the object and, if you're
changing the connection string, also whether you would have privileges
to set that new connection string on a new subscription.

3. There was a bit of discussion of maybe wanting to allow users to
create subscriptions with some connection strings but not others,
perhaps by having some kind of intermediate object that owns the
connection string and is owned by a superuser or someone with lots of
privileges, and then letting a less-privileged user point a
subscription at that object. I agree that might be useful to somebody,
but I don't see why it's a hard requirement to get anything at all
done here. Right now, a subscription contains a connection string
directly. If in the future someone wants to introduce a CREATE
REPLICATION DESTINATION command (or whatever) and have a way to point
a subscription at a replication destination rather than a connection
string directly, cool. Or if someone wants to wire this into CREATE
SERVER somehow, also cool. But if you don't care about restricting
which IPs somebody can try to access by providing a connection string
of their choice, then you would be happy if we just did something
simple here and left this problem for another day.
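
The rules proposed in items 1 and 2 above can be sketched roughly as follows. This is only an illustration under stated assumptions: RolePrivs and the three may_* functions are hypothetical names, not PostgreSQL code, and pg_create_subscriptions is the role name as proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a role's privileges; not a PostgreSQL structure. */
typedef struct RolePrivs
{
    bool has_create_subscription;   /* has the proposed pg_create_subscriptions */
    bool has_read_server_files;     /* has pg_read_server_files */
} RolePrivs;

/*
 * CREATE: requires the predefined role, plus pg_read_server_files when the
 * connection string would cause local files to be read (item 1).
 */
bool
may_create_subscription(const RolePrivs *role, bool conninfo_reads_local_files)
{
    assert(role != NULL);

    if (!role->has_create_subscription)
        return false;
    if (conninfo_reads_local_files && !role->has_read_server_files)
        return false;
    return true;
}

/* DROP: ownership alone is enough (item 2). */
bool
may_drop_subscription(bool is_owner)
{
    return is_owner;
}

/*
 * ALTER: requires ownership; a changed connection string is re-checked as
 * if it were being set on a brand-new subscription (item 2).
 */
bool
may_alter_subscription(const RolePrivs *role, bool is_owner,
                       bool changes_conninfo,
                       bool new_conninfo_reads_local_files)
{
    assert(role != NULL);

    if (!is_owner)
        return false;
    if (changes_conninfo)
        return may_create_subscription(role, new_conninfo_reads_local_files);
    return true;
}
```

Under this sketch, an owner who has lost the predefined role can still drop the subscription or alter options other than the connection string, but cannot point it at a new connection string.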

I am very curious to know (a) why work on this was abandoned (perhaps
the answer is just lack of round tuits, in which case there is no more
to be said), and (b) what people think of (1)-(3) above, and (c)
whether anyone knows of further problems that need to be considered
here.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 18, 2023, at 11:38 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I was just noticing that what was committed here didn't actually fix
> the problem implied by the subject line. That is, non-superuser still
> can't own subscriptions.

Not so.  They can.  See src/test/subscription/027_nosuperuser.pl

> To put that another way, there's no way for
> the superuser to delegate the setup and administration of logical
> replication to a non-superuser.

True.

> That's a bummer.

Also true.

> Reading the thread, I'm not quite sure why we seemingly did all the
> preparatory work and then didn't actually fix the problem.

Prior to the patch, if a superuser created a subscription, then later was demoted to non-superuser, the subscription
apply workers still applied the changes with superuser force.  So creating a superuser Alice, letting Alice create a
subscription, then revoking superuser from Alice didn't accomplish anything interesting.  But after the patch, it does.
The superuser can now create non-superuser subscriptions.  (I'm not sure this ability is well advertised.)  But the
problem of non-superuser roles creating non-superuser subscriptions is not solved.

From a security perspective, the bit that was solved may be the more important part; from a usability perspective,
perhaps not.

> It was
> previously proposed that we introduce a new predefined role
> pg_create_subscriptions and allow users who have the privileges of
> that predefined role to create and alter subscriptions. There are a
> few issues with that which, however, seem fairly solvable to me:
>
> 1. Jeff pointed out that if you supply a connection string that is
> going to try to access local files, you'd better have
> pg_read_server_files, or else we should not let you use that
> connection string. I guess that's mostly a function of which
> parameters you specify, e.g. passfile, sslcert, sslkey, though maybe
> for host it depends on whether the value starts with a slash. We might
> need to think a bit here to make sure we get the rules right but it
> seems like a pretty solvable problem.
>
> 2. There was also quite a bit of discussion of what to do if a user
> who was previously eligible to own a subscription ceases to be
> eligible, in particular around a superuser who is made into a
> non-superuser, but the same problem would apply if you had
> pg_create_subscriptions or pg_read_server_files and then lost it. My
> vote is to not worry about it too much. Specifically, I think we
> should certainly check whether the user has permission to create a
> subscription before letting them do so, but we could handle the case
> where the user already owns a subscription and tries to modify it by
> either allowing or denying the operation and I think either of those
> would be fine. I even think we could do one of those in some cases and
> the other in other cases and as long as there is some principle to the
> thing, it's fine. I argue that it's not a normal configuration and
> therefore it doesn't have to work in a particularly useful way. It
> shouldn't break the world in some horrendous way, but that's about as
> good as it needs to be. I'd argue for example that DROP SUBSCRIPTION
> could just check whether you own the object, and that ALTER
> SUBSCRIPTION could check whether you own the object and, if you're
> changing the connection string, also whether you would have privileges
> to set that new connection string on a new subscription.
>
> 3. There was a bit of discussion of maybe wanting to allow users to
> create subscriptions with some connection strings but not others,
> perhaps by having some kind of intermediate object that owns the
> connection string and is owned by a superuser or someone with lots of
> privileges, and then letting a less-privileged user point a
> subscription at that object. I agree that might be useful to somebody,
> but I don't see why it's a hard requirement to get anything at all
> done here. Right now, a subscription contains a connection string
> directly. If in the future someone wants to introduce a CREATE
> REPLICATION DESTINATION command (or whatever) and have a way to point
> a subscription at a replication destination rather than a connection
> string directly, cool. Or if someone wants to wire this into CREATE
> SERVER somehow, also cool. But if you don't care about restricting
> which IPs somebody can try to access by providing a connection string
> of their choice, then you would be happy if we just did something
> simple here and left this problem for another day.
>
> I am very curious to know (a) why work on this was abandoned (perhaps
> the answer is just lack of round tuits, in which case there is no more
> to be said)

Mostly, it was a lack of round-tuits.  After the patch was committed, I quickly switched my focus elsewhere.

> , and (b) what people think of (1)-(3) above

There are different ways of solving (1), and Jeff and I discussed them in Dec 2021.  My recollection was that idea (3)
was the cleanest.  Other ideas might be simpler than (3), or they may just appear simpler but in truth turn into a can
of worms.  I don't know, since I never went as far as trying to implement either approach.

Idea (2) seems to contemplate non-superuser subscription owners as a theoretical thing, but it's quite real already.
Again, see 027_nosuperuser.pl.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Jan 18, 2023 at 3:26 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> Prior to the patch, if a superuser created a subscription, then later was demoted to non-superuser, the subscription
> apply workers still applied the changes with superuser force.  So creating a superuser Alice, letting Alice create a
> subscription, then revoking superuser from Alice didn't accomplish anything interesting.  But after the patch, it does.
> The superuser can now create non-superuser subscriptions.  (I'm not sure this ability is well advertised.)  But the
> problem of non-superuser roles creating non-superuser subscriptions is not solved.

Ah, OK, thanks for the clarification!

> There are different ways of solving (1), and Jeff and I discussed them in Dec 2021.  My recollection was that idea
> (3) was the cleanest.  Other ideas might be simpler than (3), or they may just appear simpler but in truth turn into a
> can of worms.  I don't know, since I never went as far as trying to implement either approach.
>
> Idea (2) seems to contemplate non-superuser subscription owners as a theoretical thing, but it's quite real already.
> Again, see 027_nosuperuser.pl.

I think the solution to the problem of a connection string trying to
access local files is to just look at the connection string, decide
whether it does that, and if yes, require the owner to have
pg_read_server_files as well as pg_create_subscription. (3) is about
creating some more sophisticated and powerful solution to that
problem, but that seems like a nice-to-have, not something essential,
and a lot more complicated to implement.

I guess what I listed as (2) is not relevant since I didn't understand
correctly what the current state of things is.

Unless I'm missing something, it seems like this could be a quite small patch.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 18, 2023, at 12:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> Unless I'm missing something, it seems like this could be a quite small patch.

I didn't like the idea of the create/alter subscription commands needing to parse the connection string and think about
what it might do, because at some point in the future we might extend what things are allowed in that string, and we
have to keep everything that contemplates that string in sync.  I may have been overly hesitant to tackle that problem.
Or maybe I just ran short of round tuits.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Jan 18, 2023 at 3:58 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> > On Jan 18, 2023, at 12:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > Unless I'm missing something, it seems like this could be a quite small patch.
>
> I didn't like the idea of the create/alter subscription commands needing to parse the connection string and think
> about what it might do, because at some point in the future we might extend what things are allowed in that string, and
> we have to keep everything that contemplates that string in sync.  I may have been overly hesitant to tackle that
> problem. Or maybe I just ran short of round tuits.

I wouldn't be OK with writing our own connection string parser for
this purpose, but using PQconninfoParse seems OK. We still have to
embed knowledge of which connection string parameters can trigger
local file access, but that doesn't seem like a massive problem to me.
If we already had (or have) that logic someplace else, it would
probably make sense to reuse it, but if we don't, writing new logic
doesn't seem prohibitively scary. I'm not 100% confident of my ability
to get those rules right on the first try, but I feel like whatever
problems are there are just bugs that can be fixed with a few lines of
code changes. The basic idea that by looking at which connection
string properties are set we can tell what kinds of things the
connection string is going to do seems sound to me.
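
A minimal sketch of that idea, assuming the connection string has already been parsed into keyword/value pairs. ConnOpt is a hypothetical stand-in for the keyword and val fields of libpq's PQconninfoOption so that the example compiles without libpq; real code would walk the array returned by PQconninfoParse. The parameter list is an assumption drawn from names mentioned in this thread and is not claimed to be complete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of PQconninfoOption. */
typedef struct ConnOpt
{
    const char *keyword;    /* a NULL keyword terminates the array */
    const char *val;        /* NULL if the option was not supplied */
} ConnOpt;

/* Assumed set of parameters whose value names a local file or directory. */
static const char *const file_params[] = {
    "passfile", "sslcert", "sslkey", "sslrootcert",
    "sslcrl", "sslcrldir", "service",
    NULL
};

/*
 * Return true if any explicitly supplied option suggests that establishing
 * the connection may touch the local filesystem.
 */
bool
conninfo_may_read_local_files(const ConnOpt *opts)
{
    assert(opts != NULL);

    for (const ConnOpt *opt = opts; opt->keyword != NULL; ++opt)
    {
        /* Ignore options that are not present. */
        if (opt->val == NULL)
            continue;

        /* A host value starting with a slash names a Unix-socket directory. */
        if (strcmp(opt->keyword, "host") == 0 && opt->val[0] == '/')
            return true;

        for (size_t i = 0; file_params[i] != NULL; ++i)
        {
            if (strcmp(opt->keyword, file_params[i]) == 0)
                return true;
        }
    }

    return false;
}
```

The sketch only looks at values that were explicitly supplied; whether defaults that libpq applies on its own should also count is a separate question it does not try to answer.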

If there's some reason that it isn't, that would be good to discover
now rather than later.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2023-01-18 at 14:38 -0500, Robert Haas wrote:
> I was just noticing that what was committed here didn't actually fix
> the problem implied by the subject line. That is, non-superuser still
> can't own subscriptions. To put that another way, there's no way for
> the superuser to delegate the setup and administration of logical
> replication to a non-superuser. That's a bummer.

Right, though as Mark pointed out, it does accomplish something even if
it's a bit unsatisfying. We could certainly do better here.

> 2. There was also quite a bit of discussion of what to do if a user
> who was previously eligible to own a subscription ceases to be
> eligible, in particular around a superuser who is made into a
> non-superuser, but the same problem would apply

Correct, that's not a new problem, but exists in only a few places now.
Our privilege system is focused on "what action can the user take right
now?", and gets weirder when it comes to object ownership, which is a
more permanent thing.

Extending that system to a subscription object, which has its own
capabilities including a long-lived process, is cause for some
hesitation. I agree it's not necessarily a blocker.

> 3. There was a bit of discussion of maybe wanting to allow users to
> create subscriptions with some connection strings but not others,

This was an alternative to trying to sanitize connection strings,
because it's a bit difficult to reason about what might be "safe"
connection strings for a non-superuser, because it's environment-
dependent. But if we do identify a reasonable set of sanitization
rules, we can proceed without 3.

> I am very curious to know (a) why work on this was abandoned (perhaps
> the answer is just lack of round tuits, in which case there is no more
> to be said), and (b) what people think of (1)-(3) above, and (c)
> whether anyone knows of further problems that need to be considered
> here.

(a) Mostly round-tuits. There are problems and questions; but there are
with any work, and they could be solved. Or, if they don't turn out to
be terribly serious, we could ignore them.

(b) When I pick this up again I would be inclined towards the
following: try to solve 4-5 (listed below) first, which are
independently useful; then look at both 1 and 3 to see which one
presents an agreeable solution faster. I'll probably ignore 2 because I
couldn't get agreement the last time around (I think Mark objected to
the idea of denying a drop in privileges).

(c) Let me add:

4. There are still differences between the subscription worker applying
a change and going through the ordinary INSERT paths, for instance with
RLS. Also solvable.

5. Andres raised in another thread the idea of switching to the table
owner when applying changes (perhaps in a
SECURITY_RESTRICTED_OPERATION?):

https://www.postgresql.org/message-id/20230112033355.u5tiyr2bmuoc4jf4@awork3.anarazel.de

That seems related, and I like the idea.


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Thu, 2023-01-19 at 10:45 -0500, Robert Haas wrote:
> I wouldn't be OK with writing our own connection string parser for
> this purpose, but using PQconninfoParse seems OK. We still have to
> embed knowledge of which connection string parameters can trigger
> local file access, but that doesn't seem like a massive problem to
> me.

Another idea (I discussed with Andres some time ago) was to have an
option to libpq to turn off file access entirely. That could be a new
API function or a new connection option.

That would be pretty valuable by itself. Though we might want to
support a way to pass SSL keys as values rather than file paths, so
that we can still do SSL.

So perhaps the answer is that it will be a small patch to get non-
superuser subscription owners, but we need three or four preliminary
patches first.


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Jan 19, 2023 at 1:40 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Thu, 2023-01-19 at 10:45 -0500, Robert Haas wrote:
> > I wouldn't be OK with writing our own connection string parser for
> > this purpose, but using PQconninfoParse seems OK. We still have to
> > embed knowledge of which connection string parameters can trigger
> > local file access, but that doesn't seem like a massive problem to
> > me.
>
> Another idea (I discussed with Andres some time ago) was to have an
> option to libpq to turn off file access entirely. That could be a new
> API function or a new connection option.
>
> That would be pretty valuable by itself. Though we might want to
> support a way to pass SSL keys as values rather than file paths, so
> that we can still do SSL.

Maybe all of that would be useful, but it doesn't seem that mandatory.

> So perhaps the answer is that it will be a small patch to get non-
> superuser subscription owners, but we need three or four preliminary
> patches first.

I guess I'm not quite seeing it. Why can't we write a small patch to
get this working right now, probably in a few hours, and deal with any
improvements that people want at a later time?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Thu, 2023-01-19 at 14:11 -0500, Robert Haas wrote:
> I guess I'm not quite seeing it. Why can't we write a small patch to
> get this working right now, probably in a few hours, and deal with any
> improvements that people want at a later time?

To me, it's worrisome when there are more than a few loose ends, and
here it seems like there are more like five. No single issue is a
blocker, but I believe we'd end up with a better user-facing solution
if we solved a couple of these lower-level issues (and think a little
more about the other ones) before we expose new functionality to the
user.

The predefined role is probably the biggest user-facing part of the
change. Does it mean that members can create any number of any kind of
subscription? If so it may be hard to tighten down later, because we
don't know what existing setups might break.

Perhaps we can just permit a superuser to "ALTER SUBSCRIPTION ... OWNER
TO <non-super>", which makes it simpler to use while still leaving the
responsibility with the superuser to get it right. Maybe we even block
the user from altering their own subscription (would be weird but not
much weirder than what we have now)? I don't know if that solves the
problem you're trying to solve, but it seems lower-risk.

--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-19 10:45:35 -0500, Robert Haas wrote:
> On Wed, Jan 18, 2023 at 3:58 PM Mark Dilger
> <mark.dilger@enterprisedb.com> wrote:
> > > On Jan 18, 2023, at 12:51 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> > >
> > > Unless I'm missing something, it seems like this could be a quite small patch.
> >
> > I didn't like the idea of the create/alter subscription commands needing to parse the connection string and think
> > about what it might do, because at some point in the future we might extend what things are allowed in that string, and
> > we have to keep everything that contemplates that string in sync.  I may have been overly hesitant to tackle that
> > problem. Or maybe I just ran short of round tuits.
>
> I wouldn't be OK with writing our own connection string parser for
> this purpose, but using PQconninfoParse seems OK. We still have to
> embed knowledge of which connection string parameters can trigger
> local file access, but that doesn't seem like a massive problem to me.

> If we already had (or have) that logic someplace else, it would
> probably make sense to reuse it

We have. See at least postgres_fdw's check_conn_params(), dblink's
dblink_connstr_check() and dblink_security_check().

As part of the fix for https://postgr.es/m/20220925232237.p6uskba2dw6fnwj2%40awork3.anarazel.de
I am planning to introduce a bunch of server side helpers for dealing with
libpq (for establishing a connection while accepting interrupts). We could try
to centralize knowledge for those checks there.

The approach of checking, after connection establishment (see
dblink_security_check()), that we did in fact use the specified password,
scares me somewhat. See also below.


> The basic idea that by looking at which connection string properties are set
> we can tell what kinds of things the connection string is going to do seems
> sound to me.

I don't think you *can* check it purely based on existing connection string
properties, unfortunately. Think of e.g. a pg_hba.conf line of "local all user
peer" (quite reasonable config) or "host all all 127.0.0.1/32 trust" (less so).

Hence the hack with dblink_security_check().
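
To illustrate why a check on the connection string alone is not enough, here is a rough model of the two-stage approach referred to above. It is a sketch under assumptions, not the dblink code: OutboundConn, precheck_connstr and postcheck_connection are hypothetical names, and used_password stands for what libpq's PQconnectionUsedPassword() would report after the connection is established.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one outbound connection attempt. */
typedef struct OutboundConn
{
    const char *password_in_connstr;    /* NULL if none was supplied */
    bool        used_password;          /* did authentication use a password? */
} OutboundConn;

/*
 * Before connecting: a non-superuser must supply a non-empty password in
 * the connection string.
 */
bool
precheck_connstr(bool is_superuser, const OutboundConn *conn)
{
    assert(conn != NULL);

    if (is_superuser)
        return true;
    return conn->password_in_connstr != NULL &&
           conn->password_in_connstr[0] != '\0';
}

/*
 * After connecting: for a non-superuser, reject the connection unless the
 * password was actually used. A "peer" or "trust" line in pg_hba.conf lets
 * the connection succeed without it, which the pre-check cannot detect.
 */
bool
postcheck_connection(bool is_superuser, const OutboundConn *conn)
{
    assert(conn != NULL);

    return is_superuser || conn->used_password;
}
```

A connection string that carries a password passes the first check even when the server never asks for it; only the second check, made after the fact, catches that case.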


I think there might be a discussion somewhere about adding an option to force
libpq to not use certain auth methods, e.g. plaintext password/md5. It's
possible this could be integrated.


Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-19 17:16:20 -0800, Jeff Davis wrote:
> The predefined role is probably the biggest user-facing part of the
> change. Does it mean that members can create any number of any kind of
> subscription?

I don't think we need to support complicated restriction schemes around this
now. I'm sure such needs exist, but I think there's more places where a simple
"allowed/not allowed" suffices.

You'd presumably just grant such a permission to "pseudo superuser"
users. They can typically do a lot of bad things already, so I don't really
see the common need to prevent them from creating many subscriptions.


> If so it may be hard to tighten down later, because we don't know what
> existing setups might break.

Presumably the unlimited number of subs case would still exist as an option
later - so I don't see the problem?


> Perhaps we can just permit a superuser to "ALTER SUBSCRIPTION ... OWNER
> TO <non-super>", which makes it simpler to use while still leaving the
> responsibility with the superuser to get it right. Maybe we even block
> the user from altering their own subscription (would be weird but not
> much weirder than what we have now)? I don't know if that solves the
> problem you're trying to solve, but it seems lower-risk.

That seems to not really get us very far. It's hard to use for users, and hard
to make secure for the hosted PG providers.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Thu, 2023-01-19 at 17:51 -0800, Andres Freund wrote:
> I don't think we need to support complicated restriction schemes around this
> now. I'm sure such needs exist, but I think there's more places where a simple
> "allowed/not allowed" suffices.

If we did follow a path like 3 (having some kind of other object
represent the connection string), then it would create two different
kinds of subscriptions that might be controlled different ways, and
there might be some rough edges. Might also be fine, or we might never
pursue 3.

I feel like my words are being interpreted as though I don't want this
feature. I do, and I'm happy Robert re-raised it. I'm just trying to
answer his questions about why I set the work down, which is that I
felt some groundwork should be done before proceeding to a documented
feature, and I still feel that's the right thing.

But (a) that's not a very strong objection; and (b) my efforts are
better spent doing some of that groundwork than arguing about the order
in which the work should be done. So, time permitting, I may be able to
put out a patch or two for the next 'fest.


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Jan 19, 2023 at 8:46 PM Andres Freund <andres@anarazel.de> wrote:
> > I wouldn't be OK with writing our own connection string parser for
> > this purpose, but using PQconninfoParse seems OK. We still have to
> > embed knowledge of which connection string parameters can trigger
> > local file access, but that doesn't seem like a massive problem to me.
>
> > If we already had (or have) that logic someplace else, it would
> > probably make sense to reuse it
>
> We have. See at least postgres_fdw's check_conn_params(), dblink's
> dblink_connstr_check() and dblink_security_check().

That's not the same thing. It doesn't know anything about other
parameters that might try to consult a local file, like sslcert,
sslkey, sslrootcert, sslca, sslcrl, sslcrldir, and maybe service.
Maybe you want to argue we don't need that, but that's what the
earlier discussion was about.

> As part of the fix for https://postgr.es/m/20220925232237.p6uskba2dw6fnwj2%40awork3.anarazel.de
> I am planning to introduce a bunch of server side helpers for dealing with
> libpq (for establishing a connection while accepting interrupts). We could try
> to centralize knowledge for those checks there.

Maybe. We could also add something into libpq, as Jeff proposed, e.g.
a new connection parameter
the_other_connection_parameters_might_try_to_trojan_the_local_host=1
blocks all that stuff from doing anything.

> The approach of checking, after connection establishment (see
> dblink_security_check()), that we did in fact use the specified password,
> scares me somewhat. See also below.

Yes, I find that extremely dubious. It blocks things that you might
want to do for legitimate reasons, including things that might be more
secure than using a password. And there's no guarantee that it
accomplishes the intended objective either. The stated motivation for
that restriction was, I believe, that we don't want the outbound
connection to rely on the privileges available from the context in
which PostgreSQL itself is running -- but for all we know the remote
side has an IP filter that only allows the PostgreSQL host and no
others. Moreover, it relies on us knowing what the behavior of the
remote server is, even though we have no way of knowing that that
server shares our security interests.

Worse still, I have always felt that the security vulnerability that
led to these controls being installed is pretty much fabricated: it's
an imaginary problem. Today I went back and found the original CVE at
https://nvd.nist.gov/vuln/detail/CVE-2007-3278 and it seems that at
least one other person agrees. The Red Hat vendor statement on that
page says: "Red Hat does not consider this do be a security issue.
dblink is disabled in default configuration of PostgreSQL packages as
shipped with Red Hat Enterprise Linux versions 2.1, 3, 4 and 5, and it
is a configuration decision whether to grant local users arbitrary
access." I think whoever wrote that has an excellent point. I'm unable
to discern any legitimate security purpose for this restriction. What
I think it mostly does is (a) inconvenience users or (b) force them to
rely on a less-secure authentication method than they would otherwise
have chosen.

> > The basic idea that by looking at which connection string properties are set
> > we can tell what kinds of things the connection string is going to do seems
> > sound to me.
>
> I don't think you *can* check it purely based on existing connection string
> properties, unfortunately. Think of e.g. a pg_hba.conf line of "local all user
> peer" (quite reasonable config) or "host all all 127.0.0.1/32 trust" (less so).
>
> Hence the hack with dblink_security_check().
>
> I think there might be a discussion somewhere about adding an option to force
> libpq to not use certain auth methods, e.g. plaintext password/md5. It's
> possible this could be integrated.

I still think you're talking about a different problem here. I'm
talking about the problem of knowing whether local files are going to
be accessed by the connection string.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Jan 20, 2023 at 8:25 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I still think you're talking about a different problem here. I'm
> talking about the problem of knowing whether local files are going to
> be accessed by the connection string.

So here's a dumb patch for this. At least in my mind, the connection
string sanitization/validation is the major design problem here, and
I'm not at all sure that what I did in the attached patch is right.
But let's talk about that. This approach is inspired by Jeff's
comments about local file access upthread, but as Andres pointed out,
that's a completely different set of things than we worry about in
other places. I'm not quite sure what the right model is here.

This patch incidentally allows ALTER SUBSCRIPTION .. SKIP for any
subscription owner, removing the existing check that limits that
operation to superusers and replacing it with nothing. I can't really
see why this needs to be any more restricted than that, and
regrettably neither the check in the existing code nor the commit that
added it have any comments explaining the logic behind that check. If,
for example, skipping a subscription could lead to a server crash,
that would be a reason to restrict the feature to superusers (or
revert it). If it's just a case of the operation being maybe not the
right thing to do, that's not a sufficient reason to restrict it to
superusers. This change is really independent of the rest of the patch
and, if we want to do this, I will separate it into its own patch. But
since this is just for discussion, I didn't worry about that right
now.

Aside from the above, I don't yet see a problem here that I would
consider to be serious enough that we couldn't proceed. I'll try to
avoid too much repetition of what's already been said on this topic,
but I do want to add that I think that creating subscriptions is
properly viewed as a *slightly* scary operation, not a *very* scary
operation. It lets you do two things that you couldn't otherwise. One
is get background processes running that take up process slots and
consume resources -- but note that your ability to consume resources
with however many normal database connections you can make is
virtually unlimited. The other thing it lets you do is poke around the
network, maybe figure out whether some ports are open or closed, and
try to replicate data from any accessible servers you can find, which
could include ports or servers that you can't access directly. I think
that the superuser will be in a good position to evaluate whether that
is a risk in a certain environment or not, and I think many superusers
will conclude that it isn't a big risk. I think that the main
motivation for NOT handing out pg_create_subscription will turn out to
be administrative rather than security-related, i.e. they'll want it to be
something that falls under their authority rather than someone else's.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-20 08:25:46 -0500, Robert Haas wrote:
> Worse still, I have always felt that the security vulnerability that
> led to these controls being installed is pretty much fabricated: it's
> an imaginary problem. Today I went back and found the original CVE at
> https://nvd.nist.gov/vuln/detail/CVE-2007-3278 and it seems that at
> least one other person agrees. The Red Hat vendor statement on that
> page says: "Red Hat does not consider this do be a security issue.
> dblink is disabled in default configuration of PostgreSQL packages as
> shipped with Red Hat Enterprise Linux versions 2.1, 3, 4 and 5, and it
> is a configuration decision whether to grant local users arbitrary
> access." I think whoever wrote that has an excellent point. I'm unable
> to discern any legitimate security purpose for this restriction. What
> I think it mostly does is (a) inconvenience users or (b) force them to
> rely on a less-secure authentication method than they would otherwise
> have chosen.

FWIW, I've certainly seen situations where having the checks prevented easy
paths to privilege escalations. That's not to say that I like the checks, but
I also don't think we can get away without them (or a better replacement, of
course).

There are good reasons to have 'peer' authentication set up for the user
running postgres, so admin scripts can connect without issues. Which
unfortunately then also means that postgres_fdw etc can connect to the current
database as superuser, without that check. Which imo clearly is an issue.

Why do you think this is a fabricated issue?


The solution we have is quite bad, of course. Just because the user isn't a
superuser "immediately" doesn't mean it doesn't have the rights to become
one somehow.


> > > The basic idea that by looking at which connection string properties are set
> > > we can tell what kinds of things the connection string is going to do seems
> > > sound to me.
> >
> > I don't think you *can* check it purely based on existing connection string
> > properties, unfortunately. Think of e.g. a pg_hba.conf line of "local all user
> > peer" (quite reasonable config) or "host all all 127.0.0.1/32 trust" (less so).
> >
> > Hence the hack with dblink_security_check().
> >
> > I think there might be a discussion somewhere about adding an option to force
> > libpq to not use certain auth methods, e.g. plaintext password/md5. It's
> > possible this could be integrated.
> 
> I still think you're talking about a different problem here. I'm
> talking about the problem of knowing whether local files are going to
> be accessed by the connection string.

Why is this only about local files, rather than e.g. also using the local
user?

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-20 11:08:54 -0500, Robert Haas wrote:
>  /*
> - * Validate connection info string (just try to parse it)
> + * Validate connection info string, and determine whether it might cause
> + * local filesystem access to be attempted.
> + *
> + * If the connection string can't be parsed, this function will raise
> + * an error and will not return. If it can, it will return true if local
> + * filesystem access may be attempted and false otherwise.
>   */
> -static void
> +static bool
>  libpqrcv_check_conninfo(const char *conninfo)
>  {
>      PQconninfoOption *opts = NULL;
> +    PQconninfoOption *opt;
>      char       *err = NULL;
> +    bool        result = false;
>  
>      opts = PQconninfoParse(conninfo, &err);
>      if (opts == NULL)
> @@ -267,7 +274,40 @@ libpqrcv_check_conninfo(const char *conninfo)
>                   errmsg("invalid connection string syntax: %s", errcopy)));
>      }
>  
> +    for (opt = opts; opt->keyword != NULL; ++opt)
> +    {
> +        /* Ignore connection options that are not present. */
> +        if (opt->val == NULL)
> +            continue;
> +
> +        /* For all these parameters, the value is a local filename. */
> +        if (strcmp(opt->keyword, "passfile") == 0 ||
> +            strcmp(opt->keyword, "sslcert") == 0 ||
> +            strcmp(opt->keyword, "sslkey") == 0 ||
> +            strcmp(opt->keyword, "sslrootcert") == 0 ||
> +            strcmp(opt->keyword, "sslcrl") == 0 ||
> +            strcmp(opt->keyword, "sslcrldir") == 0 ||
> +            strcmp(opt->keyword, "service") == 0)
> +        {
> +            result = true;
> +            break;
> +        }

Do we need to think about 'options' allowing anything bad? I don't
*immediately* see a problem, but ...


> +
> +        /*
> +         * For the host parameter, the value might be a local filename.
> +         * It might also be a reference to the local host's abstract UNIX
> +         * socket namespace, which we consider equivalent to a local pathname
> +         * for security purposes.
> +         */
> +        if (strcmp(opt->keyword, "host") == 0 && is_unixsock_path(opt->val))
> +        {
> +            result = true;
> +            break;
> +        }
> +    }

Hm, what about kerberos / gss / SSPI? Aren't those essentially also tied to
the local filesystem / user?

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Sat, 2023-01-21 at 14:01 -0800, Andres Freund wrote:
> There are good reasons to have 'peer' authentication set up for the
> user
> running postgres, so admin scripts can connect without issues. Which
> unfortunately then also means that postgres_fdw etc can connect to
> the current
> database as superuser, without that check. Which imo clearly is an
> issue.

Perhaps we should have a way to directly turn on/off authentication
methods in libpq through API functions and/or options?

This reminds me of the "channel_binding=required" option. We considered
some similar alternatives for that feature.

> Why is this only about local files, rather than e.g. also using the
> local
> user?

It's not, but we happen to already have pg_read_server_files, and it
makes sense to use that at least for files referenced directly in the
connection string. You're right that it's incomplete, and also that it
doesn't make a lot of sense for files accessed indirectly.


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-22 09:05:27 -0800, Jeff Davis wrote:
> On Sat, 2023-01-21 at 14:01 -0800, Andres Freund wrote:
> > There are good reasons to have 'peer' authentication set up for the
> > user
> > running postgres, so admin scripts can connect without issues. Which
> > unfortunately then also means that postgres_fdw etc can connect to
> > the current
> > database as superuser, without that check. Which imo clearly is an
> > issue.
> 
> Perhaps we should have a way to directly turn on/off authentication
> methods in libpq through API functions and/or options?

Yes. There's an in-progress patch adding, I think, pretty much what is
required here:
https://www.postgresql.org/message-id/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com

require_auth=a,b,c

I think an allowlist approach is the right thing for the subscription (and
postgres_fdw/dblink) use case, otherwise we'll add some auth method down the
line without updating what's disallowed in the subscription code.
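
To make the allowlist point concrete, here is a minimal sketch, not taken from
any patch in this thread. The function name and the hard-coded list are
hypothetical; the method names are simply the ones floated for require_auth.
The property being illustrated is that an unlisted method, including one added
later, is rejected by default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allowlist of auth methods acceptable for proxied connections. */
static const char *const allowed_auth_methods[] = {
    "password", "md5", "scram-sha-256", NULL
};

/*
 * Return true only if 'method' is explicitly listed. Anything not listed,
 * including methods invented after this list was written, is rejected.
 */
bool
auth_method_allowed(const char *method)
{
    assert(method != NULL);

    for (size_t i = 0; allowed_auth_methods[i] != NULL; i++)
    {
        if (strcmp(method, allowed_auth_methods[i]) == 0)
            return true;
    }
    return false;
}
```

A denylist would need the opposite default, which is exactly what goes stale
when a new method appears.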


> > Why is this only about local files, rather than e.g. also using the local
> > user?
> 
> It's not, but we happen to already have pg_read_server_files, and it
> makes sense to use that at least for files referenced directly in the
> connection string. You're right that it's incomplete, and also that it
> doesn't make a lot of sense for files accessed indirectly.

I just meant that we need to pay attention to user-based permissions as well.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Sat, Jan 21, 2023 at 5:01 PM Andres Freund <andres@anarazel.de> wrote:
> There are good reasons to have 'peer' authentication set up for the user
> running postgres, so admin scripts can connect without issues. Which
> unfortunately then also means that postgres_fdw etc can connect to the current
> database as superuser, without that check. Which imo clearly is an issue.
>
> Why do you think this is a fabricated issue?

Well, if I have a bunch of PostgreSQL machines on the network that all
allow each other to log in without requiring anything much in the way
of passwords or closely-guarded SSL certificates or anything, and then
I grant to the users on those machines the right to make connections
to the other machines using arbitrary connection strings, whose fault
is it when security is compromised? We seem to be taking the policy
that it's PostgreSQL's fault if it doesn't block something bad from
happening there, but it seems to me that if you gate incoming
PostgreSQL connections only by source IP address and then also give
unprivileged users the ability to choose their source IP address, you
should expect to have a problem.

I will admit that this is not an open-and-shut case, because a
passwordless login back to the bootstrap superuser account from the
local machine is a pretty common scenario and doesn't feel
intrinsically unreasonable to me, and I hadn't thought about that as a
potential attack vector.

However, I still think there's a problem with putting all the
responsibility on PostgreSQL. The problem, specifically, is that we're
speculating wildly as to the user's intent. If we say, as we currently
do, that we're only going to allow connections if they require a
password, then we're making a judgement that the superuser couldn't
have intended to allow the postgres_fdw to make a passwordless
connection. On the other hand, if we say, as we also currently do,
that the postgres_fdw user is free to set the sslcert parameter to
anything they like, then we're making a judgement that the superuser
is totally OK with that being set to any file on the local filesystem.
Neither of those conclusions seems sound to me. The superuser may, or
may not, have intended to allow passwordless logins, and they may, or
may not, have intended for any SSL certificates stored locally to be
usable by outbound connection attempts.

And that's what I really dislike about the you-must-use-a-password
rule that we have right now. It embeds a policy decision about what
users do or do not want to allow. We've uncritically copied that
policy decision around to more and more places, and we've added
workarounds in some places for the fact that, well, you know, it might
not actually be what everybody wants (6136e94d), but it doesn't seem
like we've ever really acknowledged that we *made* a policy decision.
And that means we haven't really had a debate about the merits of this
*particular* rule, which seems to me to be highly debatable. It looks
to me like there's both stuff you might not want to allow that this
rule does not block, and also stuff you might want to allow that this
rule does block, and also that different people can want different
things yet this rule applies uniformly to everyone.

> > I still think you're talking about a different problem here. I'm
> > talking about the problem of knowing whether local files are going to
> > be accessed by the connection string.
>
> Why is this only about local files, rather than e.g. also using the local
> user?

Because there's nothing you can do about the local-user case.

If I'm asked to attempt to connect to a PostgreSQL server, and I
choose to do that, and the connection succeeds, all I know is that the
connection actually succeeded. I do not know why the remote machine
chose to accept the connection. If I supplied a password or an SSL
certificate or some such thing, then it seems likely that the remote
machine accepted that connection because I supplied that particular
password or SSL certificate, but it could also be because the remote
machine accepts all connections from Robert, or all connections
whatsoever, or all connections on Mondays. I just don't know. If I'm
worried that the person asking me to make the connection is trying
to trick me into doing something that they can't do themselves, I
could refuse to read a password from a local password store or an SSL
certificate from a local certificate file or otherwise refuse to do
anything special to try to get their connection request accepted, and
then if it does get accepted anyway, I know that they weren't relying
on any of those resources that I refused to use. But, if I attempt a
plain vanilla, totally password-less connection and it works, I have
no way of knowing whether that happened because I'm Robert or for some
other reason.

To put that another way, if I'm making a connection on behalf of an
untrusted party, I can choose not to supply an SSL certificate, or not
to supply a password. But I cannot choose to not be myself.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Sun, Jan 22, 2023 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:
> > Perhaps we should have a way to directly turn on/off authentication
> > methods in libpq through API functions and/or options?
>
> Yes. There's an in-progress patch adding, I think, pretty much what is
> required here:
> https://www.postgresql.org/message-id/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
>
> require_auth=a,b,c
>
> I think an allowlist approach is the right thing for the subscription (and
> postgres_fdw/dblink) use case, otherwise we'll add some auth method down the
> line without updating what's disallowed in the subscription code.

So what would we do here, exactly? We could force a require_auth
parameter into the provided connection string, although I'm not quite
sure of the details there, but what value should we force? Is that
going to be something hard-coded, or something configurable? If
configurable, where does that configuration get stored?

Regardless, this only allows connection strings to be restricted along
one axis: authentication type. If you want to let people connect only
to a certain subnet or whatever, you're still out of luck.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Sat, Jan 21, 2023 at 5:11 PM Andres Freund <andres@anarazel.de> wrote:
> > +             /* For all these parameters, the value is a local filename. */
> > +             if (strcmp(opt->keyword, "passfile") == 0 ||
> > +                     strcmp(opt->keyword, "sslcert") == 0 ||
> > +                     strcmp(opt->keyword, "sslkey") == 0 ||
> > +                     strcmp(opt->keyword, "sslrootcert") == 0 ||
> > +                     strcmp(opt->keyword, "sslcrl") == 0 ||
> > +                     strcmp(opt->keyword, "sslcrldir") == 0 ||
> > +                     strcmp(opt->keyword, "service") == 0)
> > +             {
> > +                     result = true;
> > +                     break;
> > +             }
>
> Do we need to think about 'options' allowing anything bad? I don't
> *immediately* see a problem, but ...

If it is, it'd be a different kind of bad. What these parameters all
have in common is that they allow you to read some local file and
maybe benefit from that during the authentication process. options
doesn't let you do anything like that, and by definition kind of
can't, because it's just a string to be sent to the remote server. As
I noted in my other responses, the local superuser could want to
impose any arbitrary restriction on the connection strings users can
choose, and so it's just as plausible that they want to restrict
options as anything else; but this test is about something more
specific.
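
As a standalone illustration of that narrower test, here is a sketch that
mirrors the quoted hunk without depending on libpq. The conn_option struct is
a hypothetical stand-in for the keyword/val fields of PQconninfoOption, and
the socket-path check is a simplification (leading '/' or '@'), not the real
is_unixsock_path().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the keyword/val fields of PQconninfoOption. */
typedef struct conn_option
{
    const char *keyword;    /* option name; NULL terminates the array */
    const char *val;        /* option value; NULL if not set */
} conn_option;

/* Options whose value names a local file (same list as the quoted patch). */
static const char *const file_valued_keywords[] = {
    "passfile", "sslcert", "sslkey", "sslrootcert",
    "sslcrl", "sslcrldir", "service", NULL
};

/* Return true if any option that is set could cause local file access. */
bool
conninfo_may_touch_local_files(const conn_option *opts)
{
    assert(opts != NULL);

    for (const conn_option *opt = opts; opt->keyword != NULL; ++opt)
    {
        /* Ignore options that are not present. */
        if (opt->val == NULL)
            continue;

        for (size_t i = 0; file_valued_keywords[i] != NULL; i++)
        {
            if (strcmp(opt->keyword, file_valued_keywords[i]) == 0)
                return true;
        }

        /* Simplified: treat absolute paths and '@' names as UNIX sockets. */
        if (strcmp(opt->keyword, "host") == 0 &&
            (opt->val[0] == '/' || opt->val[0] == '@'))
            return true;
    }
    return false;
}
```

Note that "options" is deliberately absent from the list: its value is only
sent to the remote server and never names a local file.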

> > +             /*
> > +              * For the host parameter, the value might be a local filename.
> > +              * It might also be a reference to the local host's abstract UNIX
> > +              * socket namespace, which we consider equivalent to a local pathname
> > +              * for security purposes.
> > +              */
> > +             if (strcmp(opt->keyword, "host") == 0 && is_unixsock_path(opt->val))
> > +             {
> > +                     result = true;
> > +                     break;
> > +             }
> > +     }
>
> Hm, what about kerberos / gss / SSPI? Aren't those essentially also tied to
> the local filesystem / user?

Uh, I don't know. It doesn't seem so directly true as in these cases,
but what's your thought?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-23 11:34:32 -0500, Robert Haas wrote:
> I will admit that this is not an open-and-shut case, because a
> passwordless login back to the bootstrap superuser account from the
> local machine is a pretty common scenario and doesn't feel
> intrinsically unreasonable to me, and I hadn't thought about that as a
> potential attack vector.

I think it's 90% of the problem... There's IMO no particularly good
alternative to a passwordless login for the bootstrap superuser, and it's the
account you least want to expose...


> > > I still think you're talking about a different problem here. I'm
> > > talking about the problem of knowing whether local files are going to
> > > be accessed by the connection string.
> >
> > Why is this only about local files, rather than e.g. also using the local
> > user?
> 
> Because there's nothing you can do about the local-user case.
> 
> If I'm asked to attempt to connect to a PostgreSQL server, and I
> choose to do that, and the connection succeeds, all I know is that the
> connection actually succeeded.

Well, there is PQconnectionUsedPassword()... Not that it's a great answer.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Jacob Champion
Date:
On Mon, Jan 23, 2023 at 8:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I will admit that this is not an open-and-shut case, because a
> passwordless login back to the bootstrap superuser account from the
> local machine is a pretty common scenario and doesn't feel
> intrinsically unreasonable to me, and I hadn't thought about that as a
> potential attack vector.

It seems to me like that's _the_ primary attack vector. I think I
agree with you that the password requirement is an overly large
hammer, but I don't think it's right (or safe/helpful to DBAs reading
along) to describe it as a manufactured concern.

> If I'm asked to attempt to connect to a PostgreSQL server, and I
> choose to do that, and the connection succeeds, all I know is that the
> connection actually succeeded. I do not know why the remote machine
> chose to accept the connection. If I supplied a password or an SSL
> certificate or some such thing, then it seems likely that the remote
> machine accepted that connection because I supplied that particular
> password or SSL certificate, but it could also be because the remote
> machine accepts all connections from Robert, or all connections
> whatsoever, or all connections on Mondays. I just don't know.

As of SYSTEM_USER, I think this is no longer the case -- after
connection establishment, you can ask the server who was authenticated
and why. (It doesn't explain why you were authorized to be that
particular user, but that seems maybe less important when you're trying
to disallow ambient authentication.)

If my require_auth patchset gets in, you'd be able to improve on this
by rejecting all ambient forms of authentication at the protocol level
(require_auth=password,md5,scram-sha-256). You could even go a step
further and disable ambient transport authentication
(sslcertmode=disable gssencmode=disable), which keeps a proxied
connection from making use of a client cert or a Kerberos cache. But
for postgres_fdw, at least, that carries a risk of disabling current
use cases. Stephen and I had a discussion about one such case in the
Kerberos delegation thread [1].

It doesn't help you if you want to differentiate one form of ambient
auth (trust/peer/etc.) from another, since they look the same to the
protocol. But for e.g. postgres_fdw I'm not sure why you would want to
differentiate between those cases, because they all seem bad.

> To put that another way, if I'm making a connection on behalf of an
> untrusted party, I can choose not to supply an SSL certificate, or not
> to supply a password. But I cannot choose to not be myself.

(IMO, you're driving towards a separation of the proxy identity from
the user identity. Other protocols do that too.)

--Jacob

[1]
https://www.postgresql.org/message-id/flat/23337c51-7a48-d5a8-569d-ef3ce6fe235f%40timescale.com#38b4033256d9d95773963ce938cbe3ea



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-23 12:39:50 -0500, Robert Haas wrote:
> On Sun, Jan 22, 2023 at 8:52 PM Andres Freund <andres@anarazel.de> wrote:
> > > Perhaps we should have a way to directly turn on/off authentication
> > > methods in libpq through API functions and/or options?
> >
> > Yes. There's an in-progress patch adding, I think, pretty much what is
> > required here:
> > https://www.postgresql.org/message-id/9e5a8ccddb8355ea9fa4b75a1e3a9edc88a70cd3.camel@vmware.com
> >
> > require_auth=a,b,c
> >
> > I think an allowlist approach is the right thing for the subscription (and
> > postgres_fdw/dblink) use case, otherwise we'll add some auth method down the
> > line without updating what's disallowed in the subscription code.
> 
> So what would we do here, exactly? We could force a require_auth
> parameter into the provided connection string, although I'm not quite
> sure of the details there

If we parse the connection string first, we can ensure that our values take
precedence; that shouldn't be an issue, I think.
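
Roughly, and only as a sketch under assumptions: once the string has been
parsed into a keyword/value array, forcing a value is a matter of overwriting
the matching slot before connecting. The conn_option struct and
force_conn_option() below are hypothetical; in real code the array would come
from PQconninfoParse().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the keyword/val fields of PQconninfoOption. */
typedef struct conn_option
{
    const char *keyword;    /* option name; NULL terminates the array */
    const char *val;        /* option value; NULL if not set */
} conn_option;

/*
 * Overwrite the value of 'keyword' in a parsed option array, replacing
 * whatever the user supplied. Returns false if the keyword is unknown.
 */
bool
force_conn_option(conn_option *opts, const char *keyword, const char *forced)
{
    assert(opts != NULL && keyword != NULL);

    for (conn_option *opt = opts; opt->keyword != NULL; ++opt)
    {
        if (strcmp(opt->keyword, keyword) == 0)
        {
            opt->val = forced;
            return true;
        }
    }
    return false;
}
```

Because the override happens after parsing, it does not matter whether the
user-supplied string set the same keyword once, several times, or not at all.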


> , but what value should we force? Is that going to be something hard-coded,
> or something configurable? If configurable, where does that configuration
> get stored?

I would probably start with something hardcoded, perhaps with an adjusted
value depending on things like pg_read_server_files.

I'd say just allowing password (whichever submethod) and ssl is a good start,
with something like your existing code to prevent file access for ssl unless
pg_read_server_files is granted.


I don't think kerberos, gss, peer, sspi would be safe.


> Regardless, this only allows connection strings to be restricted along
> one axis: authentication type. If you want to let people connect only
> to a certain subnet or whatever, you're still out of luck.

True. But I think it'd get us a large percentage of the use cases.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 23, 2023 at 1:26 PM Andres Freund <andres@anarazel.de> wrote:
> > If I'm asked to attempt to connect to a PostgreSQL server, and I
> > choose to do that, and the connection succeeds, all I know is that the
> > connection actually succeeded.
>
> Well, there is PQconnectionUsedPassword()... Not that it's a great answer.

Sure, but that's making an inference about why the remote side did
what it did. It's not fantastic to have a security model that relies
on connecting to a server chosen by the user and having it tell us
truthfully whether or not it relied on the password. Granted, it won't
lie unless it's been hacked, and we're trying to protect it, not
ourselves, so the only thing that happens if it does lie is that it
gets hacked a second time, so I guess there's no real vulnerability?
But I feel like we'd be on far sounder footing if our security
policy were based on deciding what we are willing to do (are we
willing to read that file? are we willing to attempt that
authentication method?) before we actually do it, rather than on
trying to decide after-the-fact whether what we did is OK based on
what the remote side tells us about how things turned out.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-23 10:27:27 -0800, Jacob Champion wrote:
> On Mon, Jan 23, 2023 at 8:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > I will admit that this is not an open-and-shut case, because a
> > passwordless login back to the bootstrap superuser account from the
> > local machine is a pretty common scenario and doesn't feel
> > intrinsically unreasonable to me, and I hadn't thought about that as a
> > potential attack vector.
> 
> It seems to me like that's _the_ primary attack vector. I think I
> agree with you that the password requirement is an overly large
> hammer, but I don't think it's right (or safe/helpful to DBAs reading
> along) to describe it as a manufactured concern.

+1


> > If I'm asked to attempt to connect to a PostgreSQL server, and I
> > choose to do that, and the connection succeeds, all I know is that the
> > connection actually succeeded. I do not know why the remote machine
> > chose to accept the connection. If I supplied a password or an SSL
> > certificate or some such thing, then it seems likely that the remote
> > machine accepted that connection because I supplied that particular
> > password or SSL certificate, but it could also be because the remote
> > machine accepts all connections from Robert, or all connections
> > whatsoever, or all connections on Mondays. I just don't know.
> 
> As of SYSTEM_USER, I think this is no longer the case -- after
> connection establishment, you can ask the server who was authenticated
> and why. (It doesn't explain why you were authorized to be that
> particular user, but that seems maybe less important when you're trying
> to disallow ambient authentication.)

There's not enough documentation for SYSTEM_USER imo.



> You could even go a step further and disable ambient transport
> authentication (sslcertmode=disable gssencmode=disable), which keeps a
> proxied connection from making use of a client cert or a Kerberos cache. But
> for postgres_fdw, at least, that carries a risk of disabling current use
> cases. Stephen and I had a discussion about one such case in the Kerberos
> delegation thread [1].

I did not find that very convincing for today's code. The likelihood of
something useful being prevented seems far far lower than preventing privilege
leakage...


> It doesn't help you if you want to differentiate one form of ambient
> auth (trust/peer/etc.) from another, since they look the same to the
> protocol. But for e.g. postgres_fdw I'm not sure why you would want to
> differentiate between those cases, because they all seem bad.

It might be possible to teach libpq to differentiate peer from trust (by
disabling passing the current user), or we could tell the server via an option
to disable peer. But as you say, I don't think it'd buy us much.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 23, 2023 at 1:27 PM Jacob Champion <jchampion@timescale.com> wrote:
> On Mon, Jan 23, 2023 at 8:35 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > I will admit that this is not an open-and-shut case, because a
> > passwordless login back to the bootstrap superuser account from the
> > local machine is a pretty common scenario and doesn't feel
> > intrinsically unreasonable to me, and I hadn't thought about that as a
> > potential attack vector.
>
> It seems to me like that's _the_ primary attack vector. I think I
> agree with you that the password requirement is an overly large
> hammer, but I don't think it's right (or safe/helpful to DBAs reading
> along) to describe it as a manufactured concern.

First, sorry about the wording. I try to get it right, but sometimes I don't.

Second, the reason why I described it as a manufactured issue is
because it's a bit like asking someone to stand under a ladder and
then complaining when they get hit in the head by a falling object.
It's not that I think it's good for people to get a free exploit to
superuser, or to get hit in the head by falling objects. It's just
that you can't have the things that together lead to some outcome
without also getting the outcome. It seems to me that we basically let
the malicious connection to the target host succeed, and then say ...
oh, never mind, we may have made this connection under false
pretenses, so we shan't use it after all. What I was attempting to
argue is that we shouldn't let things get that far. Either the victim
should be able to protect itself from the malicious connection, or the
connection attempt shouldn't be allowed in the first place, or both.
Blocking the connection attempt after the fact feels like too little,
too late.

For instance, what if the connection string itself caused SQL to be
executed on the remote side, as in the case of target_session_attrs?
Or what if we got those logon triggers that people have been wanting
for years? Or what if the remote server speaks the PostgreSQL protocol
but isn't really PostgreSQL and does ... whatever ... when you just
connect to it?

> As of SYSTEM_USER, I think this is no longer the case -- after
> connection establishment, you can ask the server who was authenticated
> and why. (It doesn't explain why you were authorized to be that
> particular user, but that seems maybe less important when you're trying
> to disallow ambient authentication.)

I think this is too after-the-fact, as discussed above.

> If my require_auth patchset gets in, you'd be able to improve on this
> by rejecting all ambient forms of authentication at the protocol level
> (require_auth=password,md5,scram-sha-256). You could even go a step
> further and disable ambient transport authentication
> (sslcertmode=disable gssencmode=disable), which keeps a proxied
> connection from making use of a client cert or a Kerberos cache. But
> for postgres_fdw, at least, that carries a risk of disabling current
> use cases. Stephen and I had a discussion about one such case in the
> Kerberos delegation thread [1].

Yes, this is why I think that the system administrator needs to have
some control over policy, instead of just having a hard-coded rule
that applies to everyone.

I'm not completely sure that this is good enough in terms of blocking
the attack as early as I think we should. This is all happening in the
midst of a connection attempt. If the remote server says, "hey, what's
your password?" and we refuse to answer that question, well that seems
somewhat OK. But what if we're hoping to be asked for a password and
the remote server doesn't ask? Then we don't find out that things
aren't right until after we've already logged in, and that gets back
to what I talk about above.

> > To put that another way, if I'm making a connection on behalf of an
> > untrusted party, I can choose not to supply an SSL certificate, or not
> > to supply a password. But I cannot choose to not be myself.
>
> (IMO, you're driving towards a separation of the proxy identity from
> the user identity. Other protocols do that too.)

Hmm, interesting.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 23, 2023 at 2:47 PM Robert Haas <robertmhaas@gmail.com> wrote:
> Second, the reason why I described it as a manufactured issue is
> because it's a bit like asking someone to stand under a ladder and
> then complaining when they get hit in the head by a falling object.
> It's not that I think it's good for people to get a free exploit to
> superuser, or to get hit in the head by falling objects. It's just
> that you can't have the things that together lead to some outcome
> without also getting the outcome.

I left out a sentence here. What I meant to say was we can't both
allow passwordless loopback connections to the bootstrap superuser and
also allow postgres_fdw to connect to anything that the user requests
and then be surprised when that user can get into the superuser
account. The natural outcome of combining those two things is that
superuser gets hacked.

The password requirement just *barely* prevents that attack from
working, almost, maybe, while at the same time managing to block
things that people want to do for totally legitimate reasons. But
IMHO, the real problem is that combining those two things is extremely
dangerous.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Fri, 2023-01-20 at 11:08 -0500, Robert Haas wrote:
> On Fri, Jan 20, 2023 at 8:25 AM Robert Haas <robertmhaas@gmail.com>
> wrote:
> > I still think you're talking about a different problem here. I'm
> > talking about the problem of knowing whether local files are going
> > to
> > be accessed by the connection string.
>
> So here's a dumb patch for this. At least in my mind, the connection
> string sanitization/validation is the major design problem here

I believe your patch conflates two use cases:

(A) Tightly-coupled servers that are managed by the administrator. In
this case, there are a finite number of connection strings to make, and
the admin knows about all of them. Validation is a poor solution for
this use case, because we get into the weeds trying to figure out
what's safe or not, overriding the admin's better judgement in some
cases and letting through connection strings that might be unsafe. A
much better solution is to simply declare the connection strings as
some kind of object (perhaps a SERVER object), and hand out privileges
or inherit them from a predefined role. Having connection string
objects is also just a better UI: it allows changes to connection
strings over time to adapt to changing security needs, and allows a
simple name that is much easier to type and read.

(B) Loosely-coupled servers that the admin doesn't know about, but
which might be perfectly safe to access. Validation is useful here, but
it's a long road of fine-grained privileges around acceptable hosts,
IPs, authentication types, file access, password sources, password
protocols, connection options, etc. The right solution here is to
identify the sub-usecases of loosely-coupled servers, and enable them
(with the appropriate controls) one at a time. Arguably, that's already
what's happened by demanding a password (even if we don't like the
mechanism, it does seem to work for some important cases).

Is your patch targeted at use case (A), (B), or both?


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Jacob Champion
Date:
On 1/23/23 11:05, Andres Freund wrote:
> There's not enough documentation for SYSTEM_USER imo.

If we were to make use of SYSTEM_USER programmatically (and based on
what Robert wrote downthread, that's probably not what's desired), I
think we'd have to make more guarantees about how it can be parsed and
the values that you can expect. Right now it's meant mostly for human
consumption.

>> You could even go a step further and disable ambient transport
>> authentication (sslcertmode=disable gssencmode=disable), which keeps a
>> proxied connection from making use of a client cert or a Kerberos cache. But
>> for postgres_fdw, at least, that carries a risk of disabling current use
>> cases. Stephen and I had a discussion about one such case in the Kerberos
>> delegation thread [1].
> 
> I did not find that very convincing for today's code. The likelihood of
> something useful being prevented seems far far lower than preventing privilege
> leakage...

Fair enough. Preventing those credentials from being pulled in by
default would effectively neutralize my concern for the delegation
patchset, too.

--Jacob




Re: Non-superuser subscription owners

From
Jacob Champion
Date:
On 1/23/23 11:52, Robert Haas wrote:
> On Mon, Jan 23, 2023 at 2:47 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> Second, the reason why I described it as a manufactured issue is
>> because it's a bit like asking someone to stand under a ladder and
>> then complaining when they get hit in the head by a falling object.
>> It's not that I think it's good for people to get a free exploit to
>> superuser, or to get hit in the head by falling objects. It's just
>> that you can't have the things that together lead to some outcome
>> without also getting the outcome.
>
> I left out a sentence here. What I meant to say was we can't both
> allow passwordless loopback connections to the bootstrap superuser and
> also allow postgres_fdw to connect to anything that the user requests
> and then be surprised when that user can get into the superuser
> account. The natural outcome of combining those two things is that
> superuser gets hacked.
>
> The password requirement just *barely* prevents that attack from
> working, almost, maybe, while at the same time managing to block
> things that people want to do for totally legitimate reasons. But
> IMHO, the real problem is that combining those two things is extremely
> dangerous.

I don't disagree. I'm worried that the unspoken conclusion being
presented is "it's such an obvious problem that we should just leave it
to the DBAs," which I very much disagree with, but I may be reading too
much into it.

> It seems to me that we basically let
> the malicious connection to the target host succeed, and then say ...
> oh, never mind, we may have made this connection under false
> pretenses, so we shan't use it after all. What I was attempting to
> argue is that we shouldn't let things get that far. Either the victim
> should be able to protect itself from the malicious connection, or the
> connection attempt shouldn't be allowed in the first place, or both.
> Blocking the connection attempt after the fact feels like too little,
> too late.

Expanding on my previous comment, you could give the client a way to say
"I am a proxy, and I'm connecting on behalf of this user, and here are
both my credentials and their credentials. So if you were planning to,
say, authorize me as superuser based on my IP address... maybe don't do
that?"

(You can sort of implement this today, by giving the proxy a client
certificate for transport authn, having it provide the in-band authn for
the user, and requiring both at the server. It's not very flexible.)

I think this has potential overlap with Magnus' PROXY proposal [1], and
also the case where we want pgbouncer to authenticate itself and then
perform actions on behalf of someone else [2], and maybe SASL's authzid
concept. I don't think one solution will hit all of the desired use
cases, but there are directions that can be investigated.

> I'm not completely sure that this is good enough in terms of blocking
> the attack as early as I think we should. This is all happening in the
> midst of a connection attempt. If the remote server says, "hey, what's
> your password?" and we refuse to answer that question, well that seems
> somewhat OK. But what if we're hoping to be asked for a password and
> the remote server doesn't ask?

require_auth should still successfully mitigate the target_session_attrs
case (going back to the examples you provided). It looks like the SQL is
initiated from the client side, so require_auth will notice that there
was no authentication performed and bail out before we get there.

For the hypothetical logon trigger, or any case where the server does
something on behalf of a user upon connection, I agree it doesn't help you.

--Jacob

[1] https://www.postgresql.org/message-id/flat/CABUevExJ0ifpUEiX4uOREy0s2kHBrBrb=pXLEHhpMTR1vVR1XA@mail.gmail.com
[2] https://www.postgresql.org/message-id/CAMT0RQR2fxeaPLHXappBCGEjHJiPCBJMPOHoDWiaYLjuieR0sg%40mail.gmail.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 23, 2023 at 7:24 PM Jacob Champion <jchampion@timescale.com> wrote:
> > The password requirement just *barely* prevents that attack from
> > working, almost, maybe, while at the same time managing to block
> > things that people want to do for totally legitimate reasons. But
> > IMHO, the real problem is that combining those two things is extremely
> > dangerous.
>
> I don't disagree. I'm worried that the unspoken conclusion being
> presented is "it's such an obvious problem that we should just leave it
> to the DBAs," which I very much disagree with, but I may be reading too
> much into it.

To be honest, that was my first instinct here, but I see the problems
better now than I did at the beginning of this discussion.

> Expanding on my previous comment, you could give the client a way to say
> "I am a proxy, and I'm connecting on behalf of this user, and here are
> both my credentials and their credentials. So if you were planning to,
> say, authorize me as superuser based on my IP address... maybe don't do
> that?"
>
> (You can sort of implement this today, by giving the proxy a client
> certificate for transport authn, having it provide the in-band authn for
> the user, and requiring both at the server. It's not very flexible.)
>
> I think this has potential overlap with Magnus' PROXY proposal [1], and
> also the case where we want pgbouncer to authenticate itself and then
> perform actions on behalf of someone else [2], and maybe SASL's authzid
> concept. I don't think one solution will hit all of the desired use
> cases, but there are directions that can be investigated.

I think this has some potential, but it's pretty complex, seeming to
require protocol extensions and having backward-compatibility problems
and so on. What do you think about something in the spirit of a
reverse-pg_hba.conf? The idea being that PostgreSQL facilities that
make outbound connections are supposed to ask it whether those
connections are OK to initiate.  Then you could have a default
configuration that basically says "don't allow loopback connections"
or "require passwords all the time" or whatever we like, and the DBA
can change that as desired. We could teach dblink, postgres_fdw, and
CREATE SUBSCRIPTION to use this new thing, and third-party code could
adopt it if it likes.

Even if we do that, some kind of proxy protocol support might be very
desirable. I'm not against that. But I think that DBAs need better
control over what kind of outbound connections they want to permit,
too.

> For the hypothetical logon trigger, or any case where the server does
> something on behalf of a user upon connection, I agree it doesn't help you.

I don't think the logon trigger thing is all *that* hypothetical. We
don't have it yet, but there have been patches proposed repeatedly for
many years.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andrew Dunstan
Date:
On 2023-01-24 Tu 08:50, Robert Haas wrote:
>
> What do you think about something in the spirit of a
> reverse-pg_hba.conf? The idea being that PostgreSQL facilities that
> make outbound connections are supposed to ask it whether those
> connections are OK to initiate.  Then you could have a default
> configuration that basically says "don't allow loopback connections"
> or "require passwords all the time" or whatever we like, and the DBA
> can change that as desired. We could teach dblink, postgres_fdw, and
> CREATE SUBSCRIPTION to use this new thing, and third-party code could
> adopt it if it likes.
>

I kinda like this idea, especially if we could specify the context that
rules are to apply in. e.g. postgres_fdw, mysql_fdw etc. I'd certainly
give it an outing in the redis_fdw if appropriate.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Re: Non-superuser subscription owners

From
Jacob Champion
Date:
On Tue, Jan 24, 2023 at 5:50 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I think this has some potential, but it's pretty complex, seeming to
> require protocol extensions and having backward-compatibility problems
> and so on.

Yeah.

> What do you think about something in the spirit of a
> reverse-pg_hba.conf? The idea being that PostgreSQL facilities that
> make outbound connections are supposed to ask it whether those
> connections are OK to initiate.  Then you could have a default
> configuration that basically says "don't allow loopback connections"
> or "require passwords all the time" or whatever we like, and the DBA
> can change that as desired.

Well, I'll have to kick the idea around a little bit. Kneejerk reactions:

- It's completely reasonable to let a proxy operator restrict how that
proxy is used. I doubt very much that a typical DBA wants to be
operating an open proxy.

- I think the devil will be in the details of the configuration
design. Lists of allowed destination authorities (in the URI sense),
options that must be present/absent/overridden, those sound great. But
your initial examples of allow-loopback and require-passwords options
are in the "make the DBA deal with it" line of thinking, IMO. I think
it's difficult for someone to reason through those correctly the first
time, even for experts. I'd like to instead see the core problem --
that *any* ambient authentication used by a proxy is inherently risky
-- exposed as a highly visible concept in the config, so that it's
hard to make mistakes.

- I'm inherently skeptical of solutions that require all clients --
proxies, in this case -- to be configured correctly in order for a
server to be able to protect itself. (But I also have a larger
appetite for security options that break compatibility when turned on.
:D)

> > For the hypothetical logon trigger, or any case where the server does
> > something on behalf of a user upon connection, I agree it doesn't help you.
>
> I don't think the logon trigger thing is all *that* hypothetical. We
> don't have it yet, but there have been patches proposed repeatedly for
> many years.

Okay. I think this thread has applicable lessons -- if connection
establishment itself leads to side effects, all actors in the
ecosystem (bouncers, proxies) have to be hardened against making those
connections passively. I know we're very different from HTTP, but it
feels similar to their concept of method safety and the consequences
of violating it.

--Jacob



postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
[ Changing subject line to something more appropriate: This is
branched from the "Non-superuser subscription owners" thread, but the
topic has become connection security more generally for outbound
connections from a PostgreSQL instance, the inadequacies of just
trying to require that such connections always use a password, and
related problems. I proposed some kind of "reverse pg_hba.conf file"
as a way of allowing configurable limits on such outbound connections.
]

On Tue, Jan 24, 2023 at 2:18 PM Jacob Champion <jchampion@timescale.com> wrote:
> - It's completely reasonable to let a proxy operator restrict how that
> proxy is used. I doubt very much that a typical DBA wants to be
> operating an open proxy.

That's very well put. It's precisely what I was thinking, but
expressed much more clearly.

> - I think the devil will be in the details of the configuration
> design. Lists of allowed destination authorities (in the URI sense),
> options that must be present/absent/overridden, those sound great. But
> your initial examples of allow-loopback and require-passwords options
> are in the "make the DBA deal with it" line of thinking, IMO. I think
> it's difficult for someone to reason through those correctly the first
> time, even for experts. I'd like to instead see the core problem --
> that *any* ambient authentication used by a proxy is inherently risky
> -- exposed as a highly visible concept in the config, so that it's
> hard to make mistakes.

I find the concept of "ambient authentication" problematic. I don't
know exactly what you mean by it. I hope you'll tell me, but I think
that I won't like it even after I know, because as I said before, it's
difficult to know why anyone else makes a decision, and asking an
untrusted third-party why they're deciding something is sketchy at
best. I think that the problems we have in this area can be solved by
either (a) restricting the open proxy to be less open or (b)
encouraging people to authenticate users in some way that won't admit
connections from an open proxy. The former needs to be configurable by
the DBA, and the latter is also a configuration choice by the DBA. We
can provide tools here that make it less likely that people will shoot
themselves in the foot, and we can ship default configurations that
reduce the chance of inadvertent foot-shooting, and we can write
documentation that says "don't shoot yourself in the foot," but we
cannot actually prevent people from shooting themselves in the foot
except, perhaps, by massively nerfing the capabilities of the system.

What I was thinking about in terms of a "reverse pg_hba.conf" was
something in the vein of, e.g.:

SOURCE_COMPONENT SOURCE_DATABASE SOURCE_USER DESTINATION_SUBNET DESTINATION_DATABASE DESTINATION_USER OPTIONS ACTION

e.g.

all all all local all all - deny # block access through UNIX sockets
all all all 127.0.0.0/8 all all - deny # block loopback interface via IPv4

Or:

postgres_fdw all all all all all authentication=cleartext,md5,sasl allow # allow postgres_fdw with password-ish authentication
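
To make the rule format above concrete, here is a minimal Python sketch of how such a file might be evaluated. Nothing like this exists in any patch on this thread: the first-match-wins semantics, the default of "deny" when no rule matches, and all of the names (Rule, parse_rules, decide) are assumptions for illustration only. The OPTIONS column is parsed but not enforced, and destinations must be address literals or "local".

```python
import ipaddress
from dataclasses import dataclass

# The three example rules from the message above, one rule per line.
EXAMPLE_RULES = """
all all all local all all - deny            # block access through UNIX sockets
all all all 127.0.0.0/8 all all - deny      # block loopback interface via IPv4
postgres_fdw all all all all all authentication=cleartext,md5,sasl allow
"""


@dataclass
class Rule:
    component: str   # SOURCE_COMPONENT, e.g. "postgres_fdw" or "all"
    src_db: str
    src_user: str
    dest: str        # "all", "local" (UNIX socket) or a CIDR subnet
    dest_db: str
    dest_user: str
    options: str     # parsed but not enforced in this sketch
    action: str      # "allow" or "deny"


def parse_rules(text):
    """Parse whitespace-separated rules; '#' starts a comment."""
    rules = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) != 8:
            raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
        rules.append(Rule(*fields))
    return rules


def _name_ok(pattern, value):
    return pattern == "all" or pattern == value


def _dest_ok(pattern, dest):
    if pattern == "all":
        return True
    if pattern == "local" or dest == "local":
        return pattern == dest
    # Address literals only; an IPv6 address never matches an IPv4 subnet.
    return ipaddress.ip_address(dest) in ipaddress.ip_network(pattern)


def decide(rules, component, src_db, src_user, dest, dest_db, dest_user,
           default="deny"):
    """Return the action of the first matching rule (assumed semantics)."""
    for r in rules:
        if (_name_ok(r.component, component)
                and _name_ok(r.src_db, src_db)
                and _name_ok(r.src_user, src_user)
                and _dest_ok(r.dest, dest)
                and _name_ok(r.dest_db, dest_db)
                and _name_ok(r.dest_user, dest_user)):
            return r.action
    return default
```

With these three rules, a postgres_fdw connection to the literal address ::1 falls through both deny rules and reaches the allow rule, since neither "local" nor 127.0.0.0/8 covers IPv6 loopback.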

Disallowing loopback connections feels quite tricky. You could use
127.anything.anything.anything, but you could also loop back via IPv6,
or you could loop back via any interface. But you can't use
subnet-based ACLs to rule out loop backs through IP/IPv6 interfaces
unless you know what all your system's own IPs are. Maybe that's an
argument in favor of having a dedicated deny-loopback facility built
into the system instead of relying on IP ACLs. But I am not sure that
really works either: how sure are we that we can discover all of the
local IP addresses? Maybe it doesn't matter anyway, since the point is
just to disallow anything that would be likely to use "trust" or
"ident" authentication, and that's probably not going to include any
non-loopback network interfaces. But ... is that true in general? What
about on Windows?
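
The difficulty described above can be seen in a small Python sketch. It uses only the standard ipaddress module; the function names and the idea of passing in a caller-supplied list of the server's own addresses are hypothetical, and no host name resolution is attempted.

```python
import ipaddress


def is_literal_loopback(host):
    """True only for loopback address literals such as 127.x.y.z or ::1."""
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a host name; resolving it is out of scope here


def targets_this_host(host, own_addresses):
    """Loopback literal, or one of the addresses the caller says are ours."""
    if is_literal_loopback(host):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False
    return addr in {ipaddress.ip_address(a) for a in own_addresses}
```

The literal check handles both 127.anything and ::1, but a connection to the machine's own non-loopback address is caught only if that address appears in own_addresses, which is exactly the list that is hard to discover reliably.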

> - I'm inherently skeptical of solutions that require all clients --
> proxies, in this case -- to be configured correctly in order for a
> server to be able to protect itself. (But I also have a larger
> appetite for security options that break compatibility when turned on.
> :D)

I (still) don't think that restricting the proxy is required, but you
can't both not restrict the proxy and also allow passwordless loopback
superuser connections. You have to pick one or the other. The reason I
keep harping on the role of the DBA is that I don't think we can make
that choice unilaterally on behalf of everyone. We've tried doing that
with the current rules and we've discussed the weaknesses of that
approach already.

> > I don't think the logon trigger thing is all *that* hypothetical. We
> > don't have it yet, but there have been patches proposed repeatedly for
> > many years.
>
> Okay. I think this thread has applicable lessons -- if connection
> establishment itself leads to side effects, all actors in the
> ecosystem (bouncers, proxies) have to be hardened against making those
> connections passively. I know we're very different from HTTP, but it
> feels similar to their concept of method safety and the consequences
> of violating it.

I am not familiar with that concept in detail but that sounds right to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 23, 2023 at 3:50 PM Jeff Davis <pgsql@j-davis.com> wrote:
> I believe your patch conflates two use cases:
>
> (A) Tightly-coupled servers that are managed by the administrator. In
> this case, there are a finite number of connection strings to make, and
> the admin knows about all of them. Validation is a poor solution for
> this use case, because we get into the weeds trying to figure out
> what's safe or not, overriding the admin's better judgement in some
> cases and letting through connection strings that might be unsafe. A
> much better solution is to simply declare the connection strings as
> some kind of object (perhaps a SERVER object), and hand out privileges
> or inherit them from a predefined role. Having connection string
> objects is also just a better UI: it allows changes to connection
> strings over time to adapt to changing security needs, and allows a
> simple name that is much easier to type and read.
>
> (B) Loosely-coupled servers that the admin doesn't know about, but
> which might be perfectly safe to access. Validation is useful here, but
> it's a long road of fine-grained privileges around acceptable hosts,
> IPs, authentication types, file access, password sources, password
> protocols, connection options, etc. The right solution here is to
> identify the sub-usecases of loosely-coupled servers, and enable them
> (with the appropriate controls) one at a time. Arguably, that's already
> what's happened by demanding a password (even if we don't like the
> mechanism, it does seem to work for some important cases).
>
> Is your patch targeted at use case (A), (B), or both?

I suppose that I would say that the patch is a better fit for (B),
because I'm not proposing to add any kind of intermediate object of
the type you postulate in (A). However, I don't really agree with the
way you've split this up, either. It seems to me that the relevant
question isn't "are the servers tightly coupled?" but rather "could
some user make a mess if we let them use any arbitrary connection
string?".

If you're running all of the machines involved on a private network
that is well-isolated from the Internet and in which only trusted
actors operate, you could use what I'm proposing here for either (A)
or (B) and it would be totally fine. If your server is sitting out on
the public Internet and is adequately secured against malicious
loopback connections, you could also probably use it for either (A) or
(B), unless you've got users who are really shady and you're worried
that the outbound connections that they make from your machine might
get you into trouble, in which case you probably can't use it for
either (A) or (B). Basically, the patch is suitable for cases where
you don't really need to restrict what connection strings people can
use, and unsuitable for cases where you do, but that doesn't have much
to do with whether the servers involved are loosely or tightly
coupled.

I think that you're basically trying to make an argument that some
sort of complex outbound connection filtering is mandatory, and I
still don't really agree with that. We ship postgres_fdw with
something extremely minimal - just a requirement that the password get
used - and the same for dblink. I think those rules suck and are
probably bad and insecure in quite a number of cases, and overly
strict in others, but I can think of no reason why CREATE SUBSCRIPTION
should be held to a higher standard than anything else. The
connections that you can make using CREATE SUBSCRIPTION are strictly
weaker than the ones you can make with dblink, which permits arbitrary
SQL execution. It cannot be right to suppose that a less-exploitable
system needs to be held to a higher security standard than a similar
but more-exploitable system.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On 1/24/23 12:04, Robert Haas wrote:
> I find the concept of "ambient authentication" problematic. I don't
> know exactly what you mean by it. I hope you'll tell me,

Sure: Ambient authority [1] means that something is granted access based
on some aspect of its existence that it can't remove (or even
necessarily enumerate). Up above, when you said "I cannot choose not to
be myself," that's a clear marker that ambient authority is involved.
Examples of ambient authn/z factors might include an originating IP
address, the user ID of a connected peer process, the use of a loopback
interface, a GPS location, and so on. So 'peer' and 'ident' are ambient
authentication methods.

And, because I think it's useful, I'll extend the definition to include
privileges that _could_ be dropped by a proxy, but in practice are
included because there's no easy way not to. Examples for libpq include
the automatic use of the client certificate in ~/.postgresql, or any
Kerberos credentials available in the local user cache. (Or even a
PGPASSWORD set up and forgotten by a DBA.)

Ambient authority is closely related to the confused deputy problem [2],
and the proxy discussed here is a classic confused deputy. The proxy
doesn't know that a core piece of its identity has been used to
authenticate the request it's forwarding. It can't choose its IP
address, or its user ID.

I'm most familiar with this in the context of HTTP, cookie-/IP-based
authn, and cross-site request forgeries. Whenever someone runs a local
web server with no authentication and says "it's okay! we only respond
to requests from the local host!" they're probably about to be broken
open by the first person to successfully reflect a request through the
victim's (very local) web browser.

Ways to mitigate or solve this problem (that I know of) include

1) Forwarding the original ambient context along with the request, so
the server can check it too. HTTP has the Origin header, so a browser
can say, "This request is not coming from my end user; it's coming from
a page controlled by example.org. You can't necessarily treat attached
cookies like they're authoritative." The PROXY protocol lets a proxy
forward several ambient factors, including the originating IP address
(or even the use of a UNIX socket) and information about the original
TLS context.

2) Explicitly combining the request with the proof of authority needed
to make it, as in capability-based security [3]. Some web frameworks
push secret "CSRF tokens" into URLs for this purpose, to tangle the
authorization into the request itself [4]. I'd argue that the "password
requirement" implemented by postgres_fdw and discussed upthread was an
attempt at doing this, to try to ensure that the authentication comes
from the user explicitly and not from the proxy. It's just not very strong.

(require_auth would strengthen it quite a bit; a major feature of that
patchset is to explicitly name the in-band authentication factors that a
server is allowed to pull out of a client. It's still not strong enough
to make a true capability, for one because it's client-side only. But as
long as servers don't perform actions on behalf of users upon
connection, that's pretty good in practice.)

3) Dropping as many implicitly-held privileges as possible before making
a request. This doesn't solve the problem but may considerably reduce
the practical attack surface. For example, if browsers didn't attach
their user's cookies to cross-origin requests, cross-site request
forgeries would probably be considerably less dangerous (and, in the
years since I left the space, it looks like browsers have finally
stopped doing this by default). Upthread, Andres suggested disabling the
default inclusion of client certs and GSS creds, and I would extend that
to include really *anything* pulled in from the environment. Make the
DBA explicitly allow those things.
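
As a rough illustration of mitigation 3, a proxy could force "no ambient credentials" settings onto whatever connection parameters a user supplies, unless the DBA has explicitly allowed them. The Python sketch below only builds a libpq keyword/value string. The parameter names sslcertmode, gssencmode and require_auth are the ones mentioned in this thread; whether they are available depends on the libpq version in use. HARDENED_DEFAULTS, harden and the allowed_ambient argument are hypothetical names.

```python
# Settings forced onto every outbound connection unless the DBA opts out.
HARDENED_DEFAULTS = {
    "sslcertmode": "disable",         # do not send a default client certificate
    "gssencmode": "disable",          # do not use the local Kerberos cache
    "require_auth": "scram-sha-256",  # insist that the server authenticates us in-band
}


def quote_conninfo_value(value):
    """Single-quote a value, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def harden(params, allowed_ambient=()):
    """Return a conninfo string with the hardened settings applied.

    User-supplied values for a hardened key are overridden unless that
    key is listed in allowed_ambient (the DBA's explicit opt-in).
    """
    result = dict(params)
    for key, value in HARDENED_DEFAULTS.items():
        if key not in allowed_ambient:
            result[key] = value
    return " ".join(f"{k}={quote_conninfo_value(v)}" for k, v in result.items())
```

For example, harden({"host": "db.example.org", "user": "alice"}) appends all three hardened settings, while passing allowed_ambient=("sslcertmode",) leaves any user-supplied sslcertmode untouched.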

> but I think
> that I won't like it even after I know, because as I said before, it's
> difficult to know why anyone else makes a decision, and asking an
> untrusted third-party why they're deciding something is sketchy at
> best.

I think that's a red herring. Setting aside that you can, in fact, prove
that the server has authenticated you (e.g. require_auth=scram-sha-256
in my proposed patchset), I don't think "untrusted servers, that we
don't control, doing something stupid" is a very useful thing to focus
on. We're trying to secure the case where a server *is* authenticating
us, using known useful factors, but those factors have been co-opted by
an attacker via a proxy.

> I think that the problems we have in this area can be solved by
> either (a) restricting the open proxy to be less open or (b)
> encouraging people to authenticate users in some way that won't admit
> connections from an open proxy.

(a) is an excellent mitigation, and we should do it. (b) starts getting
shaky because I think peer auth is actually a very reasonable choice for
many people. So I hope we can also start solving the underlying problem
while we implement (a).

> we
> cannot actually prevent people from shooting themselves in the foot
> except, perhaps, by massively nerfing the capabilities of the system.

But I thought we already agreed that most DBAs do not want a massively
capable proxy? I don't think we have to massively nerf the system, but
let's say we did. Would that really be unacceptable for this use case?

(You're still driving hard down the "it's impossible for us to securely
handle both cases at the same time" path. I don't think that's true from
a technical standpoint, because we hold nearly total control of the
protocol. I think we're in a much easier situation than HTTP was.)

> What I was thinking about in terms of a "reverse pg_hba.conf" was
> something in the vein of, e.g.:
> 
> SOURCE_COMPONENT SOURCE_DATABASE SOURCE_USER DESTINATION_SUBNET
> DESTINATION_DATABASE DESTINATION_USER OPTIONS ACTION
> 
> e.g.
> 
> all all all local all all - deny # block access through UNIX sockets
> all all all 127.0.0.0/8 all all - deny # block loopback interface via IPv4
> 
> Or:
> 
> postgres_fdw all all all all all authentication=cleartext,md5,sasl
> allow # allow postgres_fdw with password-ish authentication

I think this style focuses on absolute configuration flexibility at the
expense of usability. It obfuscates the common use cases. (I have the
exact same complaint about our HBA and ident configs, so I may be
fighting uphill.)

How should a DBA decide what is correct, or audit a configuration they
inherited from someone else? What makes it obvious why a proxy should
require cleartext auth instead of peer auth (especially since peer auth
seems to be inherently better, until you've read this thread)?

I'd rather the configuration focus on the pieces of a proxy's identity
that can be assumed by a client. For example, if the config has an
option for "let a client steal the proxy's user ID", and it's off by
default, then we've given the problem a name. DBAs can educate
themselves on it.

And if that option is off, then the implementation knows that

1) If the client has supplied explicit credentials and we can force the
server to use them, we're safe.
2) If the DBA says they're not running an ident server, or we can force
the server not to use ident authn, or the DBA pinky-swears that that
server isn't using ident authn, all IP connections are additionally safe.
3) If we have a way to forward the client's "origin" and we know that
the server will pay attention to it, all UNIX socket connections are
additionally safe.
4) Any *future* authentication method we add later needs to be
restricted in the same way.
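
The four points above amount to a default-deny decision. A minimal Python sketch, assuming the proxy somehow already knows each fact (how it would learn them is the hard part and is not shown); Outbound, connection_permitted and the option name allow_user_id_theft are all hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Outbound:
    transport: str                       # "ip" or "unix"
    explicit_credentials_enforced: bool  # point 1
    ident_ruled_out: bool                # point 2
    origin_forwarded_and_honored: bool   # point 3


def connection_permitted(conn, allow_user_id_theft=False):
    """Decide whether the proxy may open this outbound connection."""
    if allow_user_id_theft:
        return True  # the DBA has explicitly accepted the risk
    if conn.explicit_credentials_enforced:
        return True
    if conn.transport == "ip" and conn.ident_ruled_out:
        return True
    if conn.transport == "unix" and conn.origin_forwarded_and_honored:
        return True
    # Point 4: anything not provably safe, including authentication
    # methods added in the future, is refused by default.
    return False
```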

Should we allow the use of our default client cert? The Kerberos cache?
Passwords from the environment? All these are named and off by default.
DBAs can look through those options and say "oh, yeah, that seems like a
really bad idea because we have this one server over here..." And we
(the experts) now get to make the best decisions we can, based on a
DBA's declared intent, so the implementation gets to improve over time.

> Disallowing loopback connections feels quite tricky. You could use
> 127.anything.anything.anything, but you could also loop back via IPv6,
> or you could loop back via any interface. But you can't use
> subnet-based ACLs to rule out loop backs through IP/IPv6 interfaces
> unless you know what all your system's own IPs are. Maybe that's an
> argument in favor of having a dedicated deny-loopback facility built
> into the system instead of relying on IP ACLs. But I am not sure that
> really works either: how sure are we that we can discover all of the
> local IP addresses?

Well, to follow you down that road a little bit, I think that a DBA that
has set up `samehost ... trust` in their HBA is going to expect a
corresponding concept here, and it seems important for us to use an
identical implementation of samehost and samenet.

But I don't really want to follow you down that road, because I think
you illustrated my point yourself. You're already thinking about making
Disallowing Loopback Connections a first-class concept, but then you
immediately said

> Maybe it doesn't matter anyway, since the point is
> just to disallow anything that would be likely to use "trust" or
> "ident" authentication

I'd rather we enshrine that -- the point -- in the configuration, and
have the proxy disable everything that can't provably meet that intent.

Thanks,
--Jacob

[1] https://en.wikipedia.org/wiki/Ambient_authority
[2] https://en.wikipedia.org/wiki/Confused_deputy_problem
[3] https://en.wikipedia.org/wiki/Capability-based_security
[4] https://www.rfc-editor.org/rfc/rfc6265#section-8.2



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Tue, 2023-01-24 at 17:00 -0500, Robert Haas wrote:
> It seems to me that the relevant
> question isn't "are the servers tightly coupled?" but rather "could
> some user make a mess if we let them use any arbitrary connection
> string?".

The split I created is much easier for an admin to answer: is the list
of servers finite, or can users connect to new servers the admin isn't
even aware of? If it's a finite list, I feel there's a much better
solution with both security and UI benefits.

With your question, I'm not entirely clear if that's a question that we
already have an answer for (require a password parameter), or that we
will answer in this thread, or that the admin will answer.

> unless you've got users who are really shady 

Or compromised. Unfortunately, a role that's creating subscriptions has
a lot of surface area for escalation-of-privilege attacks, because they
have to trust all the owners of all the tables the subscriptions write
to.


> I think that you're basically trying to make an argument that some
> sort of complex outbound connection filtering is mandatory

No, I'm not asking for the validation to be more complex.

I believe use case (A) is a substantial use case, and I'd like to leave
space in the user interface to solve it a much better way than
connection string validation can offer. But to solve use case (A), we
need to separate the ability to create a subscription from the ability
to create a connection string.

Right now you see those as the same because they are done at the same
time in the same command; but I don't see it that way, because I had
plans to allow a variant of CREATE SUBSCRIPTION that uses foreign
servers. That plan would be consistent with dblink and postgres_fdw,
which already allow specifying foreign servers.

I propose that we have two predefined roles: pg_create_subscription,
and pg_create_connection. If creating a subscription with a connection
string, you'd need to be a member of both roles. But to create a
subscription with a server object, you'd just need to be a member of
pg_create_subscription and have the USAGE privilege on the server
object.
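
To make the proposed rule concrete, here is a minimal sketch in C. It is not taken from any patch: the struct, the enum, and the function name are hypothetical, and the server-object variant of CREATE SUBSCRIPTION does not exist yet. It only encodes the two cases described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a role holds; not a PostgreSQL structure. */
typedef struct RolePrivs
{
    bool member_of_pg_create_subscription;
    bool member_of_pg_create_connection;
    bool has_usage_on_server;   /* USAGE on the server object being named */
} RolePrivs;

/* How the subscription says where to connect (assumed for illustration). */
typedef enum ConnSource
{
    CONN_FROM_STRING,   /* a connection string given directly */
    CONN_FROM_SERVER    /* the proposed variant that names a server object */
} ConnSource;

/*
 * pg_create_subscription is always needed.  A raw connection string also
 * needs pg_create_connection; a server object needs USAGE on it instead.
 */
static bool
may_create_subscription(const RolePrivs *role, ConnSource source)
{
    assert(role != NULL);

    if (!role->member_of_pg_create_subscription)
        return false;
    if (source == CONN_FROM_STRING)
        return role->member_of_pg_create_connection;
    return role->has_usage_on_server;
}
```

With this shape, an admin who maintains a finite list of servers grants only pg_create_subscription plus USAGE on selected server objects, and never has to hand out the ability to write connection strings.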


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Jan 25, 2023 at 10:45 PM Jeff Davis <pgsql@j-davis.com> wrote:
> I propose that we have two predefined roles: pg_create_subscription,
> and pg_create_connection. If creating a subscription with a connection
> string, you'd need to be a member of both roles. But to create a
> subscription with a server object, you'd just need to be a member of
> pg_create_subscription and have the USAGE privilege on the server
> object.

I have no issue with that as a long-term plan. However, I think that
for right now we should just introduce pg_create_subscription. It
would make sense to add pg_create_connection in the same patch that
adds a CREATE CONNECTION command (or whatever exact syntax we end up
with) -- and that patch can also change CREATE SUBSCRIPTION to require
both privileges where a connection string is specified directly.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Thu, 2023-01-26 at 09:43 -0500, Robert Haas wrote:
> I have no issue with that as a long-term plan. However, I think that
> for right now we should just introduce pg_create_subscription. It
> would make sense to add pg_create_connection in the same patch that
> adds a CREATE CONNECTION command (or whatever exact syntax we end up
> with) -- and that patch can also change CREATE SUBSCRIPTION to
> require
> both privileges where a connection string is specified directly.

I assumed it would be a problem to say that pg_create_subscription was
enough to create a subscription today, and then later require
additional privileges (e.g. pg_create_connection).

If that's not a problem, then this sounds fine with me.

--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Jan 26, 2023 at 12:36 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Thu, 2023-01-26 at 09:43 -0500, Robert Haas wrote:
> > I have no issue with that as a long-term plan. However, I think that
> > for right now we should just introduce pg_create_subscription. It
> > would make sense to add pg_create_connection in the same patch that
> > adds a CREATE CONNECTION command (or whatever exact syntax we end up
> > with) -- and that patch can also change CREATE SUBSCRIPTION to
> > require
> > both privileges where a connection string is specified directly.
>
> I assumed it would be a problem to say that pg_create_subscription was
> enough to create a subscription today, and then later require
> additional privileges (e.g. pg_create_connection).
>
> If that's not a problem, then this sounds fine with me.

Wonderful! I'm working on a patch, but due to various distractions,
it's not done yet.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Jan 19, 2023 at 8:46 PM Andres Freund <andres@anarazel.de> wrote:
> > If we already had (or have) that logic someplace else, it would
> > probably make sense to reuse it
>
> We have. See at least postgres_fdw's check_conn_params(), dblink's
> dblink_connstr_check() and dblink_security_check().

In the patch I posted previously, I had some other set of checks, more
or less along the lines suggested by Jeff. I looked into revising that
approach and making the behavior match exactly what we do in those
places instead. I find that it breaks 027_nosuperuser.pl.
Specifically, where without the patch I get "ok 6 - nosuperuser admin
with all table privileges can replicate into unpartitioned", with the
patch it goes boom, because the subscription can't connect any more
due to the password requirement.

At first, I found it a bit tempting to see this as a further
indication that the force-a-password approach is not the right idea,
because the test case clearly memorializes a desire *not* to require a
password in this situation. However, the loopback-to-superuser attack
is just as viable for subscription as it is in other cases, and my
previous patch would have done nothing to block it. So what I did
instead is add a password_required attribute, just like what
postgres_fdw has. As in the case of postgres_fdw, the actual rule is
that if the attribute is false, a password is not required, and if the
attribute is true, a password is required unless you are a superuser.
If you're a superuser, it still isn't. This is a slightly odd set of
semantics but it has precedent and practical advantages. Also, as in
the case of postgres_fdw, only a superuser can set
password_required=false, and a subscription that has that setting can
only be modified by a superuser, no matter who owns it.
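
Since the semantics are admittedly a little odd, the three rules above can be written out as a minimal sketch. The function names are made up for illustration and are not the ones used in the patch; the booleans stand in for the real catalog and role lookups.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Does the outbound connection have to authenticate with a password?
 * Only when the attribute is true and the user is not a superuser.
 */
static bool
password_needed(bool password_required, bool is_superuser)
{
    return password_required && !is_superuser;
}

/* Only a superuser may set password_required=false. */
static bool
may_set_password_required(bool new_value, bool is_superuser)
{
    return new_value || is_superuser;
}

/*
 * A subscription with password_required=false can only be modified by a
 * superuser, no matter who owns it.
 */
static bool
may_alter_subscription(bool sub_password_required, bool is_owner,
                       bool is_superuser)
{
    if (is_superuser)
        return true;
    return is_owner && sub_password_required;
}

/* A few spot checks of the rules as stated in the text. */
static void
password_rule_examples(void)
{
    assert(password_needed(true, false));             /* ordinary owner */
    assert(!password_needed(true, true));             /* superuser is exempt */
    assert(!may_alter_subscription(false, true, false)); /* owner locked out */
}
```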

Even though I hate the require-a-password stuff with the intensity of
a thousand suns, I think this is better than the previous patch,
because it's more consistent with what we do elsewhere and because it
blocks the loopback-connection-to-superuser attack. I think we
*really* need to develop a better system for restricting proxied
connections (no matter how proxied) and I hope that we do that soon.
But inventing something for this purpose that differs from what we do
elsewhere will make that task harder, not easier.

Thoughts?

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Wed, Jan 25, 2023 at 6:22 PM Jacob Champion <jchampion@timescale.com> wrote:
> Sure: Ambient authority [1] means that something is granted access based
> on some aspect of its existence that it can't remove (or even
> necessarily enumerate). Up above, when you said "I cannot choose not to
> be myself," that's a clear marker that ambient authority is involved.
> Examples of ambient authn/z factors might include an originating IP
> address, the user ID of a connected peer process, the use of a loopback
> interface, a GPS location, and so on. So 'peer' and 'ident' are ambient
> authentication methods.

OK.

> 1) Forwarding the original ambient context along with the request, so
> the server can check it too.

Right, so a protocol extension. Reasonable idea, but a big lift. Not
only do you need everyone to be running a new enough version of
PostgreSQL, but existing proxies like pgpool and pgbouncer need
updates, too.

> 2) Explicitly combining the request with the proof of authority needed
> to make it, as in capability-based security [3].

As far as I can see, that link doesn't address how you'd make this
approach work across a network.

> 3) Dropping as many implicitly-held privileges as possible before making
> a request. This doesn't solve the problem but may considerably reduce
> the practical attack surface.

Right. I definitely don't object to this kind of approach, but I don't
think it can ever be sufficient by itself.

> > e.g.
> >
> > all all all local all all - deny # block access through UNIX sockets
> > all all all 127.0.0.0/8 all all - deny # block loopback interface via IPv4
> >
> > Or:
> >
> > postgres_fdw all all all all all authentication=cleartext,md5,sasl
> > allow # allow postgres_fdw with password-ish authentication
>
> I think this style focuses on absolute configuration flexibility at the
> expense of usability. It obfuscates the common use cases. (I have the
> exact same complaint about our HBA and ident configs, so I may be
> fighting uphill.)

That's probably somewhat true, but on the other hand, it also is more
powerful than what you're describing. In your system, is there some
way the DBA can say "hey, you can connect to any of the machines on
this list of subnets, but nothing else"? Or equally, "hey, you may NOT
connect to any machine on this list of subnets, but anything else is
fine"? Or "you can connect to these subnets without SSL, but if you
want to talk to anything else, you need to use SSL"? I would feel a
bit bad saying that those are just use cases we don't care about. Most
people likely wouldn't use that kind of flexibility, so maybe it
doesn't really matter, but it seems kind of nice to have. Your idea
seems to rely on us being able to identify all of the policies that a
user is likely to want and give names to each one, and I don't feel
very confident that that's realistic. But maybe I'm misinterpreting
your idea?
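
To illustrate the kind of flexibility I mean, here is a rough sketch of first-match rule evaluation over IPv4 subnets. Everything in it is hypothetical (the struct, the function names, the default-deny choice, IPv4 only); it is not a proposed file format, just the matching logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One hypothetical outbound rule: a subnet plus what is permitted there. */
typedef struct OutboundRule
{
    uint32_t network;      /* IPv4 network, host byte order */
    int      prefix_len;   /* 0..32 */
    bool     require_ssl;  /* only meaningful when allow is true */
    bool     allow;        /* false means "deny" */
} OutboundRule;

/* Build an IPv4 address in host byte order from its four octets. */
static uint32_t
ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t) a << 24) | ((uint32_t) b << 16) |
           ((uint32_t) c << 8) | (uint32_t) d;
}

static bool
in_subnet(uint32_t addr, uint32_t network, int prefix_len)
{
    uint32_t mask;

    assert(prefix_len >= 0 && prefix_len <= 32);
    /* A /0 prefix matches everything; avoid shifting by 32 bits. */
    mask = (prefix_len == 0) ? 0 : (UINT32_MAX << (32 - prefix_len));
    return (addr & mask) == (network & mask);
}

/* First matching rule wins; no match at all means deny. */
static bool
outbound_allowed(const OutboundRule *rules, size_t nrules,
                 uint32_t addr, bool using_ssl)
{
    for (size_t i = 0; i < nrules; i++)
    {
        if (!in_subnet(addr, rules[i].network, rules[i].prefix_len))
            continue;
        if (!rules[i].allow)
            return false;
        return using_ssl || !rules[i].require_ssl;
    }
    return false;
}
```

"Not this subnet, but anything else" is a deny rule followed by an allow rule for 0.0.0.0/0; "these subnets without SSL, everything else with SSL" is an allow rule followed by a catch-all with require_ssl set.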

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-27 14:42:01 -0500, Robert Haas wrote:
> At first, I found it a bit tempting to see this as a further
> indication that the force-a-password approach is not the right idea,
> because the test case clearly memorializes a desire *not* to require a
> password in this situation. However, the loopback-to-superuser attack
> is just as viable for subscription as it is in other cases, and my
> previous patch would have done nothing to block it.

Hm, compared to postgres_fdw, the user has far less control over what's
happening using that connection. Is there a way a subscription owner can
trigger evaluation of near-arbitrary SQL on the publisher side?


> So what I did instead is add a password_required attribute, just like what
> postgres_fdw has. As in the case of postgres_fdw, the actual rule is that if
> the attribute is false, a password is not required, and if the attribute is
> true, a password is required unless you are a superuser.  If you're a
> superuser, it still isn't. This is a slightly odd set of semantics but it
> has precedent and practical advantages. Also, as in the case of
> postgres_fdw, only a superuser can set password_required=false, and a
> subscription that has that setting can only be modified by a superuser, no
> matter who owns it.

I started out asking what benefits it provides to own a subscription one
cannot modify. But I think it is a good capability to have, to restrict the
set of relations that replication could target.  Although perhaps it'd be
better to set the "replay user" as a separate property on the subscription?

Is owning a subscription one isn't allowed to modify useful outside of that?



> Even though I hate the require-a-password stuff with the intensity of
> a thousand suns, I think this is better than the previous patch,
> because it's more consistent with what we do elsewhere and because it
> blocks the loopback-connection-to-superuser attack. I think we
> *really* need to develop a better system for restricting proxied
> connections (no matter how proxied) and I hope that we do that soon.
> But inventing something for this purpose that differs from what we do
> elsewhere will make that task harder, not easier.
> 
> Thoughts?

I think it's reasonable to mirror behaviour from elsewhere, and it'd let us
have this feature relatively soon - I think it's a common need to do this as a
non-superuser. It's IMO a very good idea to not subscribe as a superuser, even
if set up by a superuser...

But I also would understand if you / somebody else chose to focus on
implementing a less nasty connection model.


> Subject: [PATCH v2] Add new predefined role pg_create_subscriptions.

Maybe a daft question:

Have we considered using a "normal grant", e.g. on the database, instead of a
role?  Could it e.g. be useful to grant a user the permission to create a
subscription in one database, but not in another?


> @@ -1039,6 +1082,16 @@ AlterSubscription(ParseState *pstate, AlterSubscriptionStmt *stmt,
>  
>      sub = GetSubscription(subid, false);
>  
> +    /*
> +     * Don't allow non-superuser modification of a subscription with
> +     * password_required=false.
> +     */
> +    if (!sub->passwordrequired && !superuser())
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                         errmsg("password_required=false is superuser-only"),
> +                         errhint("Subscriptions with the password_required option set to false may only be created or modified by the superuser.")));
 
> +
>      /* Lock the subscription so nobody else can do anything with it. */
>      LockSharedObject(SubscriptionRelationId, subid, 0, AccessExclusiveLock);

The subscription code already does ownership checks before locking, and now
there's also the passwordrequired check before it.  Is it possible that this
could open up some sort of race? Could e.g. the user change the ownership to
the superuser in one session and do an ALTER in the other?

It looks like your change won't increase the danger of that, as the
superuser() check just checks the current user's permissions.


> @@ -180,6 +180,13 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
>      if (PQstatus(conn->streamConn) != CONNECTION_OK)
>          goto bad_connection_errmsg;
>  
> +    if (must_use_password && !PQconnectionUsedPassword(conn->streamConn))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
> +                 errmsg("password is required"),
> +                 errdetail("Non-superuser cannot connect if the server does not request a password."),
> +                 errhint("Target server's authentication method must be changed.")));
> +

The documentation of libpqrcv_connect() says that:
 * Returns NULL on error and fills the err with palloc'ed error message.

and throwing an error like that will at the very least leak the connection,
fd, fd reservation. Which I just had fixed :). At the very least you'd need to
copy the stuff that "bad_connection:" does.
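
To spell out what I mean, here is a self-contained sketch of the "release, then return NULL with err filled in" shape. It does not use libpq: FakeConn, fake_connect(), fake_finish() and connect_checked() are stand-ins invented for this example, and malloc/free stand in for palloc. The point is only that the password check must clean up the connection before reporting failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a libpq connection; not the real PGconn. */
typedef struct FakeConn
{
    bool used_password;
} FakeConn;

static int open_conns = 0;      /* live connections, so a leak is visible */

static FakeConn *
fake_connect(bool used_password)
{
    FakeConn *conn = malloc(sizeof(FakeConn));

    if (conn == NULL)
        return NULL;
    conn->used_password = used_password;
    open_conns++;
    return conn;
}

static void
fake_finish(FakeConn *conn)
{
    if (conn == NULL)
        return;
    assert(open_conns > 0);
    free(conn);
    open_conns--;
}

/* Stand-in for pstrdup(); the caller frees the result. */
static char *
copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/*
 * Returns NULL on error and fills *err with an allocated message,
 * closing the connection first so nothing is leaked.
 */
static FakeConn *
connect_checked(bool server_asked_for_password, bool must_use_password,
                char **err)
{
    FakeConn *conn = fake_connect(server_asked_for_password);

    *err = NULL;
    if (conn == NULL)
    {
        *err = copy_string("out of memory");
        return NULL;
    }
    if (must_use_password && !conn->used_password)
    {
        fake_finish(conn);      /* the cleanup the error path must not skip */
        *err = copy_string("password is required");
        return NULL;
    }
    return conn;
}
```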


I did wonder whether we should make libpqrcv_connect() use errsave() to return
errors.  Or whether we should make libpqrcv register a memory context reset
callback that'd close the libpq connection.


>  /*
> - * Validate connection info string (just try to parse it)
> + * Validate connection info string, and determine whether it might cause
> + * local filesystem access to be attempted.
> + *
> + * If the connection string can't be parsed, this function will raise
> + * an error and will not return. If it can, it will return true if this
> + * connection string specifies a password and false otherwise.
>   */
> -static void
> +static bool
>  libpqrcv_check_conninfo(const char *conninfo)

That is a somewhat odd API.  Why does it throw for some things, but not
others? Seems a bit cleaner to pass in a parameter indicating whether it
should throw when not finding a password? Particularly because you already
pass that to walrcv_connect().

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Jan 27, 2023 at 4:09 PM Andres Freund <andres@anarazel.de> wrote:
> Hm, compared to postgres_fdw, the user has far less control over what's
> happening using that connection. Is there a way a subscription owner can
> trigger evaluation of near-arbitrary SQL on the publisher side?

I'm not aware of one, but what I think it would let you do is
exfiltrate data you're not entitled to see.

> I started out asking what benefits it provides to own a subscription one
> cannot modify. But I think it is a good capability to have, to restrict the
> set of relations that replication could target.  Although perhaps it'd be
> better to set the "replay user" as a separate property on the subscription?

That's been proposed previously, but for reasons I don't quite
remember it seems not to have happened. I don't think it achieved
consensus.

> Is owning a subscription one isn't allowed to modify useful outside of that?

Uh, possibly that's a question for Mark or Jeff. I don't know. I can't
see what they would be, but I just work here.

> Maybe a daft question:
>
> Have we considered using a "normal grant", e.g. on the database, instead of a
> role?  Could it e.g. be useful to grant a user the permission to create a
> subscription in one database, but not in another?

Potentially, but I didn't think we'd want to burn through permissions
bits that fast, even given 7b378237aa805711353075de142021b1d40ff3b0.
Still, if the consensus is otherwise, I can change it. Then I guess
we'd end up with GRANT CREATE ON DATABASE and GRANT CREATE
SUBSCRIPTION ON DATABASE, which I'm sure wouldn't be confusing at all.

Or, another thought, maybe this should be checking for CREATE on the
current database + also pg_create_subscription. That seems like it
might be the right idea, actually.

> The subscription code already does ownership checks before locking and now
> there's also the passwordrequired before.  Is it possible that this could open
> up some sort of race? Could e.g. the user change the ownership to the
> superuser in one session, do an ALTER in the other?
>
> It looks like your change won't increase the danger of that, as the
> superuser() check just checks the current user's permissions.

I'm not entirely clear whether there's a hazard there. If there is, I
think we could fix it by moving the LockSharedObject call up higher,
above object_ownercheck. The only problem with that is it lets you
lock an object on which you have no permissions: see
2ad36c4e44c8b513f6155656e1b7a8d26715bb94. To really fix that, we'd
need an analogue of RangeVarGetRelidExtended.

> and throwing an error like that will at the very least leak the connection,
> fd, fd reservation. Which I just had fixed :). At the very least you'd need to
> copy the stuff that "bad_connection:" does.

OK.

> I did wonder whether we should make libpqrcv_connect() use errsave() to return
> errors.  Or whether we should make libpqrcv register a memory context reset
> callback that'd close the libpq connection.

Yeah. Using errsave() might be better, but not sure I want to tackle
that just now.

> That is a somewhat odd API.  Why does it throw for some things, but not
> others? Seems a bit cleaner to pass in a parameter indicating whether it
> should throw when not finding a password? Particularly because you already
> pass that to walrcv_connect().

Will look into that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-27 16:35:11 -0500, Robert Haas wrote:
> > Maybe a daft question:
> >
> > Have we considered using a "normal grant", e.g. on the database, instead of a
> > role?  Could it e.g. be useful to grant a user the permission to create a
> > subscription in one database, but not in another?
> 
> Potentially, but I didn't think we'd want to burn through permissions
> bits that fast, even given 7b378237aa805711353075de142021b1d40ff3b0.
> Still, if the consensus is otherwise, I can change it.

I don't really have an opinion on what's better. I looked briefly whether
there was discussion around this but I didn't see anything.

pg_create_subscription feels a bit different than most of the other pg_*
roles. For most of those there is no schema object to tie permissions to. But
here there is.

But I think there are good arguments against a GRANT approach, too. GRANT ALL ON
DATABASE would suddenly be dangerous. How does it interact with database
ownership? Etc.


> Then I guess we'd end up with GRANT CREATE ON DATABASE and GRANT CREATE
> SUBSCRIPTION ON DATABASE, which I'm sure wouldn't be confusing at all.

Heh. I guess it could just be GRANT SUBSCRIBE.



> Or, another thought, maybe this should be checking for CREATE on the
> current database + also pg_create_subscription. That seems like it
> might be the right idea, actually.

Yes, that seems like a good idea.



> > The subscription code already does ownership checks before locking and now
> > there's also the passwordrequired before.  Is it possible that this could open
> > up some sort of race? Could e.g. the user change the ownership to the
> > superuser in one session, do an ALTER in the other?
> >
> > It looks like your change won't increase the danger of that, as the
> > superuser() check just checks the current user's permissions.
> 
> I'm not entirely clear whether there's a hazard there.

I'm not at all either. It's just a code pattern that makes me anxious - I
suspect there's a few places it makes us more vulnerable.


> If there is, I think we could fix it by moving the LockSharedObject call up
> higher, above object_ownercheck. The only problem with that is it lets you
> lock an object on which you have no permissions: see
> 2ad36c4e44c8b513f6155656e1b7a8d26715bb94. To really fix that, we'd need an
> analogue of RangeVarGetRelidExtended.

Yea, we really should have something like RangeVarGetRelidExtended() for other
kinds of objects. It'd take a fair bit of work / time to use it widely, but
it'll take even longer if we start in 5 years ;)

Perhaps the bulk of RangeVarGetRelidExtended() could be generalized by having
a separate name->oid lookup callback?

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 27, 2023, at 1:35 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> I started out asking what benefits it provides to own a subscription one
>> cannot modify. But I think it is a good capability to have, to restrict the
>> set of relations that replication could target.  Although perhaps it'd be
>> better to set the "replay user" as a separate property on the subscription?
>
> That's been proposed previously, but for reasons I don't quite
> remember it seems not to have happened. I don't think it achieved
> consensus.
>
>> Is owning a subscription one isn't allowed to modify useful outside of that?
>
> Uh, possibly that's a question for Mark or Jeff. I don't know. I can't
> see what they would be, but I just work here.

If the owner cannot modify the subscription, then the owner degenerates into
a mere "run-as" user.  Note that this isn't how things work now, and even if
we disallowed owners from modifying the connection string, there would still
be other attributes the owner could modify, such as the set of publications
subscribed.


More generally, my thinking on this thread is that there needs to be two
nosuperuser roles:  A higher privileged role which can create a subscription,
and a lower privileged role serving the "run-as" function.  Those shouldn't be
the same, because the "run-as" concept doesn't logically need to have
subscription creation power, and likely *shouldn't* have that power.
Depending on which sorts of attributes a subscription object has, such as the
connection string, the answer differs for whether the owner/"run-as" user
should get to change those attributes.  One advantage of Jeff's idea of using
a server object rather than a string is that it becomes more plausibly safe to
allow the subscription owner to make changes to that attribute of the
subscription.



—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Jan 27, 2023 at 5:56 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> If the owner cannot modify the subscription, then the owner degenerates into
> a mere "run-as" user.  Note that this isn't how things work now, and even if
> we disallowed owners from modifying the connection string, there would still
> be other attributes the owner could modify, such as the set of publications
> subscribed.

The proposed patch blocks every form of ALTER SUBSCRIPTION if
password_required=false is set and you aren't a superuser. Is there
some other DML command that could be used to modify the set of
publications subscribed?

> More generally, my thinking on this thread is that there needs to be two
> nosuperuser roles:  A higher privileged role which can create a
> subscription, and a lower privileged role serving the "run-as" function.
> Those shouldn't be the same, because the "run-as" concept doesn't logically
> need to have subscription creation power, and likely *shouldn't* have that
> power.  Depending on which sorts of attributes a subscription object has,
> such as the connection string, the answer differs for whether the
> owner/"run-as" user should get to change those attributes.  One advantage of
> Jeff's idea of using a server object rather than a string is that it becomes
> more plausibly safe to allow the subscription owner to make changes to that
> attribute of the subscription.

There's some question in my mind about what these different mechanisms
are intended to accomplish.

On a technical level, I think that the idea of having a separate
object for the connection string vs. the subscription itself is
perfectly sound, and to repeat what I said earlier, if someone wants
to implement that, cool. I also agree that it has the advantage that
you specify, namely, that someone can have rights to modify one of
those objects but not the other. What that lets you do is define a
short list of known systems and say, hey, you can replicate whatever
tables you want with whatever options you want, but only between these
systems. I'm not quite sure what problem that solves, though.

From my point of view, the two things that the superuser is most
likely to want to do are (1) control the replication setup themselves
and delegate nothing to any non-superuser or (2) give a non-superuser
pretty much complete control over replication with just enough
restrictions to avoid letting them do things that would compromise
security, such as hacking the local superuser account. In other words,
I expect that delegation of the logical replication configuration is
usually going to be all or nothing. Jeff's system allows for a
situation where you want to delegate some stuff but not everything,
and specifically where you want to delegate control over the
subscription options and the tables being replicated, but not the
connection strings. To me, that feels like a bit of an awkward
configuration; I don't really understand in what situation that
division of responsibility would be particularly useful. I trust that
Jeff is proposing it because he knows of such a situation, but I don't
know what it is. I feel like, even if I wanted to let people use some
connection strings and not others, I'd probably want that control in
some form other than listing a specific list of allowable connection
strings -- I'd want to say things like "you have to use SSL" or "no
connecting back to the local host," because that lets me enforce some
general organizational policy without having to care specifically
about how each subscription is being set up.

Unfortunately, I have even less of an idea about what the run-as
concept is supposed to accomplish. I mean, at one level, I see it
quite clearly: the user creating the subscription wants replication to
have restricted privileges when it's running, and so they make the
run-as user some role with fewer privileges than their own. Brilliant.
But then I get stuck: against what kind of attack does that actually
protect us? If I'm a high privilege user, perhaps even a superuser,
and it's not safe to have logical replication running as me, then it
seems like the security model of logical replication is fundamentally
busted and we need to fix that. It can't be right to say that if you
have 263 users in a database and you want to replicate the whole
database to some other node, you need 263 different subscriptions with
a different run-as user for each. You need to be able to run all of
that logical replication as the superuser or some other high-privilege
user and not end up with a security compromise. And if we suppose that
that already works and is safe, well then what's the case where I do
need a run-as user?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 30, 2023, at 7:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> And if we suppose that
> that already works and is safe, well then what's the case where I do
> need a run-as user?

A) Alice publishes tables, and occasionally adds new tables to existing publications.

B) Bob manages subscriptions, and periodically runs "refresh publication".
Bob also creates new subscriptions for people when a row is inserted into the
"please create a subscription for me" table which Bob owns, using a trigger
that Bob created on that table.

C) Alice creates a "please create a subscription for me" table on the
publishing database, adds lots of malicious requests, and adds that table to
the publication.

D) Bob replicates the table, fires the trigger, creates the malicious
subscriptions, and starts replicating all that stuff, too.

I think that having Charlie, not Bob, as the "run-as" user helps somewhere right around (D).

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 30, 2023 at 11:11 AM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> > On Jan 30, 2023, at 7:44 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > And if we suppose that
> > that already works and is safe, well then what's the case where I do
> > need a run-as user?
>
> A) Alice publishes tables, and occasionally adds new tables to existing publications.
>
> B) Bob manages subscriptions, and periodically runs "refresh publication".
> Bob also creates new subscriptions for people when a row is inserted into
> the "please create a subscription for me" table which Bob owns, using a
> trigger that Bob created on that table.
>
> C) Alice creates a "please create a subscription for me" table on the
> publishing database, adds lots of malicious requests, and adds that table to
> the publication.
>
> D) Bob replicates the table, fires the trigger, creates the malicious
> subscriptions, and starts replicating all that stuff, too.
>
> I think that having Charlie, not Bob, as the "run-as" user helps somewhere right around (D).

I suppose it does, but I have some complaints.

First, it doesn't seem to make a lot of sense to have one person
managing the publications and someone else managing the subscriptions,
and especially if those parties are mutually untrusting. I can't think
of any real reason to set things up that way. Sure, you could, but why
would you? You could, equally, decide that one member of your
household was going to decide what's for dinner every night, and some
other member of your household was going to decide what gets purchased
at the grocery store each week. If those two people exercise their
responsibilities without tight coordination, or with hostile intent
toward each other, things are going to go badly, but that's not an
argument for putting a combination lock on the flour canister. It's an
argument for getting along better, or not having such a dumb system in
the first place. I don't quite see how the situation you postulate in
(A) and (B) is any different. Publications and subscriptions are as
closely connected as food purchases and meals. The point of a
publication is for it to connect up to a subscription. In what
circumstances would it be reasonable to give responsibility for
those objects to different and especially mutually untrusting users?

Second, in step (B), we may ask why Bob is doing this with a trigger.
If he's willing to create any subscription for which Alice asks, we
could have just given Alice the authority to do those actions herself.
Presumably, therefore, Bob is willing to create some subscriptions for
which Alice may ask and not others. Perhaps this whole arrangement is
just a workaround for the lack of a sensible system for controlling
which connection strings Alice can use, in which case what is really
needed here might be something like the separate connection object
which Jeff postulated or my idea of a reverse pg_hba.conf. That kind
of approach would give a better user interface to Alice, who wouldn't
have to rephrase all of her CREATE SUBSCRIPTION commands as insert
statements. Conversely, if Alice and Bob are truly dedicated to this
convoluted system of creating subscriptions, then Bob needs to put
logic into his trigger that's smart enough to block any malicious
requests that Alice may make. He really brought this problem on
himself by not doing that.

Third, in step (C), it seems to me that whoever set up Alice's
permissions has really messed up. Either the schema Bob is using for
his create-me-a-subscription table exists on the primary and Alice has
permission to create tables in that schema, or else that schema does
not exist on the primary and Alice has permission to create it. Either
way, that's a bad setup. Bob's table should be located in a schema for
which Alice has only USAGE permissions and shouldn't have excess
permissions on the table, either. Then this step can't happen. This
step could also be blocked if, instead of using a table with a
trigger, Bob wrote a security definer function or procedure and
granted EXECUTE permission on that function or procedure to Alice.
He's still going to need sanity checks, though, and if the function or
procedure inserts into a logging table or something, he'd better make
sure that table is adequately secured rather than being, say, a table
owned by Alice with malicious triggers on it.

So basically this doesn't really feel like a valid scenario to me.
We're supposed to believe that Alice is hostile to Bob, but the
superuser doesn't seem to have thought very carefully about how Bob is
supposed to defend himself against Alice, and Bob doesn't even seem to
be trying. Maybe we should rename the users to Samson and Delilah? :-)

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 30, 2023, at 9:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> First, it doesn't seem to make a lot of sense to have one person
> managing the publications and someone else managing the subscriptions,
> and especially if those parties are mutually untrusting. I can't think
> of any real reason to set things up that way. Sure, you could, but why
> would you? You could, equally, decide that one member of your
> household was going to decide what's for dinner every night, and some
> other member of your household was going to decide what gets purchased
> at the grocery store each week. If those two people exercise their
> responsibilities without tight coordination, or with hostile intent
> toward each other, things are going to go badly, but that's not an
> argument for putting a combination lock on the flour canister. It's an
> argument for getting along better, or not having such a dumb system in
> the first place. I don't quite see how the situation you postulate in
> (A) and (B) is any different. Publications and subscriptions are as
> closely connected as food purchases and meals. The point of a
> publication is for it to connect up to a subscription.

I have a grim view of the requirement that publishers and subscribers trust each other.  Even when they do trust each
other, they can firewall attacks by acting as if they do not.

> In what
> circumstances would it be reasonable to give responsibility for
> those objects to different and especially mutually untrusting users?

When public repositories of data, such as the IANA whois database, publish their data via postgres publications.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 30, 2023, at 9:26 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> So basically this doesn't really feel like a valid scenario to me.
> We're supposed to believe that Alice is hostile to Bob, but the
> superuser doesn't seem to have thought very carefully about how Bob is
> supposed to defend himself against Alice, and Bob doesn't even seem to
> be trying. Maybe we should rename the users to Samson and Delilah? :-)

No, Atahualpa and Pizarro.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 30, 2023 at 1:46 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> I have a grim view of the requirement that publishers and subscribers trust each other.  Even when they do trust each
> other, they can firewall attacks by acting as if they do not.

I think it's OK if the CREATE PUBLICATION user doesn't particularly
trust the CREATE SUBSCRIPTION user, because the publication is just a
grouping of tables to which somebody can pay attention or not. The
CREATE PUBLICATION user isn't compromised either way. But, at least as
things stand, I don't see how the CREATE SUBSCRIPTION user can get away
with not trusting the CREATE PUBLICATION user. CREATE SUBSCRIPTION
provides no tools at all for filtering the data that the subscriber
chooses to send.

Now that can be changed, I suppose, and a run-as user would be one way
to make progress in that direction. But I'm not sure how viable that
is, because...

> > In what
> > circumstances would it be reasonable to give responsibility for
> > those objects to different and especially mutually untrusting users?
>
> When public repositories of data, such as the IANA whois database, publish their data via postgres publications.

... for that to work, IANA would need to set up the database so that
untrusted parties can create logical replication slots on their
PostgreSQL server. And I think that granting REPLICATION privilege on
your database to random people on the Internet is not really viable,
nor intended to be viable.  As the CREATE ROLE documentation says, "A
role having the REPLICATION attribute is a very highly privileged
role."

Concretely, this kind of setup would have the problem that you could
kill the IANA database by just creating a replication slot and then
not using it (or replicating from it only very very slowly).
Eventually, the replication slot would either hold back xmin enough
that you got a lot of bloat, or cause enough WAL to be retained that
you ran out of disk space. Maybe you could protect yourself against
that kind of problem by cutting off users who get too far behind, but
that also cuts off people who just have an outage for longer than your
cutoff.
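
The trade-off described above can be illustrated with a toy Python model. This is only a sketch, not PostgreSQL code: the names Slot, retained_wal and apply_cutoff are hypothetical, and positions are plain integers rather than real LSNs. It assumes that the server must keep all WAL back to the slowest slot, and that a lag cutoff drops any slot that has fallen too far behind.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    restart_lsn: int  # oldest WAL position this consumer still needs


def retained_wal(current_lsn, slots):
    """WAL that must be kept: everything since the slowest slot."""
    if not slots:
        return 0
    return current_lsn - min(s.restart_lsn for s in slots)


def apply_cutoff(current_lsn, slots, max_lag):
    """Split slots into (kept, dropped) based on how far behind they are."""
    kept = [s for s in slots if current_lsn - s.restart_lsn <= max_lag]
    dropped = [s for s in slots if current_lsn - s.restart_lsn > max_lag]
    return kept, dropped


current_lsn = 10_000
slots = [
    Slot("active", 9_900),    # keeps up
    Slot("outage", 4_000),    # legitimate subscriber that was down for a while
    Slot("abandoned", 0),     # created and then never used
]

print(retained_wal(current_lsn, slots))        # 10000
kept, dropped = apply_cutoff(current_lsn, slots, max_lag=5_000)
print([s.name for s in dropped])               # ['outage', 'abandoned']
print(retained_wal(current_lsn, kept))         # 100
```

In this model the cutoff bounds WAL retention, but it cannot tell the abandoned slot apart from the subscriber that merely had a long outage, so both are cut off.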

Also, anyone who can connect to a replication slot can also connect
to any other replication slot, and drop any replication slot. So if
IANA did grant REPLICATION privilege to random people on the Internet,
one of them could jump into the system and screw things up for all the
others.

This kind of setup just doesn't seem viable to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 30, 2023, at 11:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> CREATE SUBSCRIPTION
> provides no tools at all for filtering the data that the subscriber
> chooses to send.
>
> Now that can be changed, I suppose, and a run-as user would be one way
> to make progress in that direction. But I'm not sure how viable that
> is, because...
>
>>> In what
>>> circumstances would it be reasonable to give responsibility for
>>> those objects to different and especially mutually untrusting users?
>>
>> When public repositories of data, such as the IANA whois database, publish their data via postgres publications.
>
> ... for that to work, IANA would need to set up the database so that
> untrusted parties can create logical replication slots on their
> PostgreSQL server. And I think that granting REPLICATION privilege on
> your database to random people on the Internet is not really viable,
> nor intended to be viable.

That was an aspirational example in which there's infinite daylight between the publisher and subscriber.  I, too,
doubt that's ever going to be possible.  But I still think we should aspire to some extra daylight between the two.
Perhaps IANA doesn't publish to the whole world, but instead publishes only to subscribers who have a contract in place,
and have agreed to monetary penalties should they abuse the publishing server.  Whatever.  There's going to be some
amount of daylight possible if we design for it, and none otherwise.

My real argument here isn't against your goal of having non-superusers who can create subscriptions.  That part seems
fine to me.

Given that my work last year made it possible for subscriptions to run as somebody other than the subscription creator,
it annoys me that you now want the subscription creator's privileges to be what the subscription runs as.  That seems to
undo what I worked on.  In my mental model of a (superuser-creator, non-superuser-owner) pair, it seems you're logically
only touching the lefthand side, so you should then have a (nonsuperuser-creator, nonsuperuser-owner) pair.  But you
don't. You go the apparently needless extra step of just squashing them together.  I just don't see why it needs to be
like that.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Jan 27, 2023 at 5:00 PM Andres Freund <andres@anarazel.de> wrote:
> > Or, another thought, maybe this should be checking for CREATE on the
> > current database + also pg_create_subscription. That seems like it
> > might be the right idea, actually.
>
> Yes, that seems like a good idea.

Done in this version. I also changed check_conninfo to take an extra
argument instead of returning a Boolean, as per your suggestion.

I had a long think about what to do with ALTER SUBSCRIPTION ... OWNER
TO in terms of permissions checks. The previous version required that
the new owner have permissions of pg_create_subscription, but there
seems to be no particular reason for that rule except that it happens
to be what I made the code do. So I changed it to say that the current
owner must have CREATE privilege on the database, and must be able to
SET ROLE to the new owner. This matches the rule for CREATE SCHEMA.
Possibly we should *additionally* require that the person performing
the rename still have pg_create_subscription, but that shouldn't be
the only requirement. This change means that you can't just randomly
give your subscription to the superuser (with or without concurrently
attempting some other change as per your other comments) which is good
because you can't do that with other object types either.
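
As a rough illustration of the rule described above, here is a toy Python model. It is a sketch only, not the patch's code: Role, can_set_role and may_transfer_subscription are hypothetical names, role membership is modelled as direct only, and superuser bypass for the person running the command is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    has_db_create: bool = False          # CREATE privilege on the current database
    can_become: set = field(default_factory=set)  # roles this role may SET ROLE to


def can_set_role(actor, target):
    """Assumption: a role can always 'become' itself, else needs membership."""
    return actor.name == target.name or target.name in actor.can_become


def may_transfer_subscription(current_owner, new_owner):
    """Rule from the text: CREATE on the database AND able to SET ROLE to new owner."""
    return current_owner.has_db_create and can_set_role(current_owner, new_owner)


alice = Role("alice", has_db_create=True, can_become={"reporting"})
bob = Role("bob", has_db_create=False, can_become={"reporting"})
reporting = Role("reporting")
postgres = Role("postgres")  # stands in for the superuser

print(may_transfer_subscription(alice, reporting))  # True
print(may_transfer_subscription(alice, postgres))   # False
print(may_transfer_subscription(bob, reporting))    # False
```

The second call shows the property mentioned above: alice cannot hand her subscription to the superuser, because she cannot SET ROLE to it.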

There seems to be a good deal of inconsistency here. If you want to
give someone a schema, YOU need CREATE on the database. But if you
want to give someone a table, THEY need CREATE on the containing
schema. It makes sense that we check permissions on the containing
object, which could be a database or a schema depending on what you're
renaming, but it's unclear to me why we sometimes check on the person
performing the ALTER command and at other times on the recipient. It's
also somewhat unclear to me why we are checking CREATE in the first
place, especially on the donor. It might make sense to have a rule
that you can't own an object in a place where you couldn't have
created it, but there is no such rule, because you can give someone
CREATE on a schema, they can create an object, and then you can take
CREATE away and they still own an object there. So it kind of looks
to me like we made it up as we went along and that the result isn't
very consistent, but I'm inclined to follow CREATE SCHEMA here unless
there's some reason to do otherwise.

Another question around ALTER SUBSCRIPTION ... OWNER TO and also ALTER
SUBSCRIPTION .. RENAME is whether they ought to fail if you're not a
superuser and password_required false is set. They are separate code
paths from the rest of the ALTER SUBSCRIPTION cases, so if we want
that to be a rule we need dedicated code for it. I'm not quite sure
what's right. There's no comparable case for ALTER USER MAPPING
because a user mapping doesn't have an owner and so can't be
reassigned to a new owner. I don't see what the harm is, especially
for RENAME, but I might be missing something, and it certainly seems
arguable.

> > I'm not entirely clear whether there's a hazard there.
>
> I'm not at all either. It's just a code pattern that makes me anxious - I
> suspect there's a few places it makes us more vulnerable.

It looks likely to me that it was cut down from the CREATE SCHEMA code, FWIW.

> > If there is, I think we could fix it by moving the LockSharedObject call up
> > higher, above object_ownercheck. The only problem with that is it lets you
> > lock an object on which you have no permissions: see
> > 2ad36c4e44c8b513f6155656e1b7a8d26715bb94. To really fix that, we'd need an
> > analogue of RangeVarGetRelidExtended.
>
> Yea, we really should have something like RangeVarGetRelidExtended() for other
> kinds of objects. It'd take a fair bit of work / time to use it widely, but
> it'll take even longer if we start in 5 years ;)

We actually have something sort of like that in the form of
get_object_address(). It doesn't allow for a callback, but it does
have a retry loop.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On Fri, Jan 27, 2023 at 1:08 PM Robert Haas <robertmhaas@gmail.com> wrote:
> > 1) Forwarding the original ambient context along with the request, so
> > the server can check it too.
>
> Right, so a protocol extension. Reasonable idea, but a big lift. Not
> only do you need everyone to be running a new enough version of
> PostgreSQL, but existing proxies like pgpool and pgbouncer need
> updates, too.

Right.

> > 2) Explicitly combining the request with the proof of authority needed
> > to make it, as in capability-based security [3].
>
> As far as I can see, that link doesn't address how you'd make this
> approach work across a network.

The CSRF-token example I gave is one. But that's HTTP-specific
(stateless, server-driven) and probably doesn't make a lot of sense
for our case.

For our case, assuming that connections have side effects, one
solution could be for the client to signal to the server that the
connection should use in-band authentication only; i.e. fail the
connection if the credentials provided aren't good enough by
themselves to authenticate the client. (This has some overlap with
SASL negotiation, maybe.)

But that still requires server support. I don't know if there's a
clever way to tie the authentication to the request on the client side
only, using existing server implementations. (If connections don't
have side effects, require_auth should be sufficient.)

> > 3) Dropping as many implicitly-held privileges as possible before making
> > a request. This doesn't solve the problem but may considerably reduce
> > the practical attack surface.
>
> Right. I definitely don't object to this kind of approach, but I don't
> think it can ever be sufficient by itself.

I agree. (But for the record, I think that an outbound proxy filter is
also insufficient. Someone, somewhere, is going to want to safely
proxy through localhost _and_ have peer authentication set up.)

> > I think this style focuses on absolute configuration flexibility at the
> > expense of usability. It obfuscates the common use cases. (I have the
> > exact same complaint about our HBA and ident configs, so I may be
> > fighting uphill.)
>
> That's probably somewhat true, but on the other hand, it also is more
> powerful than what you're describing. In your system, is there some
> way the DBA can say "hey, you can connect to any of the machines on
> this list of subnets, but nothing else"? Or equally, "hey, you may NOT
> connect to any machine on this list of subnets, but anything else is
> fine"? Or "you can connect to these subnets without SSL, but if you
> want to talk to anything else, you need to use SSL"?

I guess I didn't call it out explicitly, so it was fair to assume that
it did not. I don't think we should ignore those cases.

But if we let the configuration focus on policies instead, and
simultaneously improve the confused-deputy problem, then any IP/host
filter functionality that we provide becomes an additional safety
measure instead of your only viable line of defense. "I screwed up our
IP filter, but we're still safe because the proxy refused to forward
its client cert to the backend." Or, "this other local application
requires peer authentication, but it's okay because the proxy
disallows those connections by default."

> Your idea
> seems to rely on us being able to identify all of the policies that a
> user is likely to want and give names to each one, and I don't feel
> very confident that that's realistic. But maybe I'm misinterpreting
> your idea?

No, that's pretty accurate. But I'm used to systems that provide a
ridiculous number of policies [1, 2] via what's basically a scoped
property bag. "Turn off option 1 and 2 globally. For host A and IP
address B, turn on option 1 as an exception." And I don't really
expect us to need as many options as those systems do.

I think that configuration style evolves well, it focuses on the right
things, and it can still handle IP lists intuitively [3], if that's
the way a DBA really wants to set up policies.
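
A minimal sketch of the "scoped property bag" idea, in Python, may make the quoted example ("turn off option 1 and 2 globally; for host A and IP address B, turn on option 1") more concrete. Everything here is hypothetical: the option names, the host name and address, and the resolve_policy function are illustrations, not a proposed PostgreSQL syntax or API.

```python
import ipaddress

# Global defaults: both options are off everywhere.
GLOBAL_DEFAULTS = {"option_1": False, "option_2": False}

# Scoped exceptions, applied in order on top of the defaults.
EXCEPTIONS = [
    ("host", "host-a.example.com", {"option_1": True}),
    ("ip", "192.0.2.10", {"option_1": True}),
]


def resolve_policy(host, ip):
    """Return the effective options for one outbound connection target."""
    policy = dict(GLOBAL_DEFAULTS)
    addr = ipaddress.ip_address(ip)
    for kind, pattern, overrides in EXCEPTIONS:
        if kind == "host":
            matches = host == pattern
        else:
            # A bare address is treated as a one-address network; CIDR also works.
            matches = addr in ipaddress.ip_network(pattern)
        if matches:
            policy.update(overrides)
    return policy


print(resolve_policy("host-a.example.com", "198.51.100.7"))
print(resolve_policy("other.example.com", "192.0.2.10"))
print(resolve_policy("other.example.com", "198.51.100.7"))
```

Because the "ip" scope is parsed as a network, the same table could also hold subnet entries such as 192.0.2.0/24, which is one way the IP-list use case could fit into this style.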

--Jacob

[1] https://httpd.apache.org/docs/2.4/mod/mod_proxy.html
[2] https://www.haproxy.com/documentation/hapee/latest/onepage/#4
[3] https://docs.nginx.com/nginx/admin-guide/security-controls/controlling-access-proxied-tcp/



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Jan 30, 2023 at 3:27 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:
> That was an aspirational example in which there's infinite daylight between the publisher and subscriber.  I, too,
> doubt that's ever going to be possible.  But I still think we should aspire to some extra daylight between the two.
> Perhaps IANA doesn't publish to the whole world, but instead publishes only to subscribers who have a contract in place,
> and have agreed to monetary penalties should they abuse the publishing server.  Whatever.  There's going to be some
> amount of daylight possible if we design for it, and none otherwise.
>
> My real argument here isn't against your goal of having non-superusers who can create subscriptions.  That part seems
> fine to me.
>
> Given that my work last year made it possible for subscriptions to run as somebody other than the subscription
> creator, it annoys me that you now want the subscription creator's privileges to be what the subscription runs as.  That
> seems to undo what I worked on.  In my mental model of a (superuser-creator, non-superuser-owner) pair, it seems you're
> logically only touching the lefthand side, so you should then have a (nonsuperuser-creator, nonsuperuser-owner) pair.
> But you don't.  You go the apparently needless extra step of just squashing them together.  I just don't see why it
> needs to be like that.

I feel like you're accusing me of removing functionality that has
never existed. A subscription doesn't run as the subscription creator.
It runs as the subscription owner. If you or anyone else had added the
capability for it to run as someone other than the subscription owner,
I certainly wouldn't be trying to back that capability out as part of
this patch, and because there isn't, I'm not proposing to add that as
part of this patch. I don't see how that makes me guilty of squashing
anything together. The current state of affairs, where the run-as user
is taken from pg_subscription.subowner, the same field that is updated
by ALTER SUBSCRIPTION ... OWNER TO, is the result of your work, not
anything that I have done or am proposing to do.

I also *emphatically* disagree with the idea that this undoes what you
worked on. My patch would be *impossible* without your work. Prior to
your work, the run-as user was always, basically, the superuser, and
so the idea of allowing anyone other than a superuser to execute
CREATE SUBSCRIPTION would be flat-out nuts. Because of your work,
that's now a thing that we may be able to reasonably allow, if we can
work through the remaining issues. So I'm grateful to you, and also
sorry to hear that you're annoyed with me. But I still don't think
that the fact that the division you want doesn't exist is somehow my
fault.

I'm kind of curious why you *didn't* make this distinction at the time
that you did the other work in this area. Maybe my memory is
playing tricks on me again, but I seem to recall talking about the
idea with you at the time, and I seem to recall thinking that it
sounded like an OK idea. I seem to vaguely recall us discussing
hazards like: well, what if replication causes code to get executed as
the subscription owner that causes something bad to happen? But I
think the only way that happens is if they put triggers on the tables
that are being replicated, which is their choice, and they can avoid
installing problematic code there if they want. I think there might
have been some other scenarios, too, but I just can't remember. In any
case, I don't think the idea is completely without merit. I think it
could very well be something that we want to have for one reason or
another. But I don't currently understand exactly what those reasons
are, and I don't see any reason why one patch should both split owner
from run-as user and also allow the owner to be a non-superuser. That
seems like two different efforts to me.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Mon, Jan 30, 2023 at 4:12 PM Jacob Champion <jchampion@timescale.com> wrote:
> For our case, assuming that connections have side effects, one
> solution could be for the client to signal to the server that the
> connection should use in-band authentication only; i.e. fail the
> connection if the credentials provided aren't good enough by
> themselves to authenticate the client. (This has some overlap with
> SASL negotiation, maybe.)

I'm not an expert on this stuff, but to me that feels like a weak and
fuzzy concept. If the client is going to tell the server something,
I'd much rather have it say something like "i'm proxying a request
from my local user rhaas, who authenticated using such and such a
method and connected from such and such an IP yadda yadda". That feels
to me like really clear communication that the server can then be
configured to do something about via pg_hba.conf or similar. Saying "use
in-band authentication only", to me, feels much murkier. As the
recipient of that message, I don't know exactly what to do about it,
and it feels like whatever heuristic I adopt might end up being wrong
and something bad happens anyway.

> I agree. (But for the record, I think that an outbound proxy filter is
> also insufficient. Someone, somewhere, is going to want to safely
> proxy through localhost _and_ have peer authentication set up.)

Well then they're indeed going to need some way to distinguish a
proxied connection from a non-proxied one. You can't send identical
connection requests in different scenarios and get different
results....

> I guess I didn't call it out explicitly, so it was fair to assume that
> it did not. I don't think we should ignore those cases.

OK, cool.

> But if we let the configuration focus on policies instead, and
> simultaneously improve the confused-deputy problem, then any IP/host
> filter functionality that we provide becomes an additional safety
> measure instead of your only viable line of defense. "I screwed up our
> IP filter, but we're still safe because the proxy refused to forward
> its client cert to the backend." Or, "this other local application
> requires peer authentication, but it's okay because the proxy
> disallows those connections by default."

Defense in depth is good.

> > Your idea
> > seems to rely on us being able to identify all of the policies that a
> > user is likely to want and give names to each one, and I don't feel
> > very confident that that's realistic. But maybe I'm misinterpreting
> > your idea?
>
> No, that's pretty accurate. But I'm used to systems that provide a
> ridiculous number of policies [1, 2] via what's basically a scoped
> property bag. "Turn off option 1 and 2 globally. For host A and IP
> address B, turn on option 1 as an exception." And I don't really
> expect us to need as many options as those systems do.
>
> I think that configuration style evolves well, it focuses on the right
> things, and it can still handle IP lists intuitively [3], if that's
> the way a DBA really wants to set up policies.

I think what we really need here is an example or three of a proposed
configuration file syntax. I think it would be good if we could pick a
syntax that doesn't require a super-complicated parser, and that maybe
has something in common with our existing configuration file syntaxes.
But if we have to invent something new, then we can do that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Jan 30, 2023, at 1:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> I feel like you're accusing me of removing functionality that has
> never existed. A subscription doesn't run as the subscription creator.
> It runs as the subscription owner. If you or anyone else had added the
> capability for it to run as someone other than the subscription owner,
> I certainly wouldn't be trying to back that capability out as part of
> this patch, and because there isn't, I'm not proposing to add that as
> part of this patch. I don't see how that makes me guilty of squashing
> anything together. The current state of affairs, where the run-as user
> is taken from pg_subscription.subowner, the same field that is updated
> by ALTER SUBSCRIPTION ... OWNER TO, is the result of your work, not
> anything that I have done or am proposing to do.
>
> I also *emphatically* disagree with the idea that this undoes what you
> worked on. My patch would be *impossible* without your work. Prior to
> your work, the run-as user was always, basically, the superuser, and
> so the idea of allowing anyone other than a superuser to execute
> CREATE SUBSCRIPTION would be flat-out nuts. Because of your work,
> that's now a thing that we may be able to reasonably allow, if we can
> work through the remaining issues. So I'm grateful to you, and also
> sorry to hear that you're annoyed with me. But I still don't think
> that the fact that the division you want doesn't exist is somehow my
> fault.
>
> I'm kind of curious why you *didn't* make this distinction at the time
> that you did the other work in this area. Maybe my memory is
> playing tricks on me again, but I seem to recall talking about the
> idea with you at the time, and I seem to recall thinking that it
> sounded like an OK idea. I seem to vaguely recall us discussing
> hazards like: well, what if replication causes code to get executed as
> the subscription owner that causes something bad to happen? But I
> think the only way that happens is if they put triggers on the tables
> that are being replicated, which is their choice, and they can avoid
> installing problematic code there if they want. I think there might
> have been some other scenarios, too, but I just can't remember. In any
> case, I don't think the idea is completely without merit. I think it
> could very well be something that we want to have for one reason or
> another. But I don't currently understand exactly what those reasons
> are, and I don't see any reason why one patch should both split owner
> from run-as user and also allow the owner to be a non-superuser. That
> seems like two different efforts to me.

I don't have a concrete problem with your patch, and wouldn't object if you committed it.  My concerns were more how
you were phrasing things, but it seems not worth any additional conversation, because it's probably a distinction
without a difference.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-30 10:44:29 -0500, Robert Haas wrote:
> On a technical level, I think that the idea of having a separate
> object for the connection string vs. the subscription itself is
> perfectly sound, and to repeat what I said earlier, if someone wants
> to implement that, cool. I also agree that it has the advantage that
> you specify, namely, that someone can have rights to modify one of
> those objects but not the other. What that lets you do is define a
> short list of known systems and say, hey, you can replicate whatever
> tables you want with whatever options you want, but only between these
> systems. I'm not quite sure what problem that solves, though.

That does seem somewhat useful, but also fairly limited, at least as
long as it's really just a single connection, rather than a "pattern" of
safe connections.


> Unfortunately, I have even less of an idea about what the run-as
> concept is supposed to accomplish. I mean, at one level, I see it
> quite clearly: the user creating the subscription wants replication to
> have restricted privileges when it's running, and so they make the
> run-as user some role with fewer privileges than their own. Brilliant.
> But then I get stuck: against what kind of attack does that actually
> protect us? If I'm a high privilege user, perhaps even a superuser,
> and it's not safe to have logical replication running as me, then it
> seems like the security model of logical replication is fundamentally
> busted and we need to fix that.

I don't really understand that - the run-as approach seems like a
necessary piece of improving the security model.

I think it's perfectly reasonable to want to replicate from one system
into another, but to not want to allow logical replication to insert into
pg_class or whatnot. So not using superuser to execute the replication
makes sense.

This is particularly the case if you're just replicating a small part of
the tables from one system to another. E.g. in a sharded setup, you may
want to replicate metadata to servers.

Even if all the systems are operated by people you trust (including
possibly even yourself, if you want to go that far), you may want to
reduce the blast radius of privilege escalation, or even just bugs, to a
smaller amount of data.


I think we'll need two things to improve upon the current situation:

1) run-as user, to reduce the scope of potential danger

2) Option to run the database inserts as the owner of the table, with a
   check that the run-as is actually allowed to perform work as the
   owning role. That prevents escalation from table owner (who could add
   default expressions etc) from getting the privs of the
   run-as/replication owner.
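
The second point can be sketched as a toy Python model. This is only an illustration under stated assumptions, not PostgreSQL code: effective_role and the may_act_as mapping are hypothetical names, and "allowed to perform work as the owning role" is modelled as a simple set lookup.

```python
def effective_role(run_as, table_owner, may_act_as):
    """Pick the role under which one replicated change is applied.

    The change runs as the table's owner, but only if the subscription's
    run-as role is allowed to act as that owner; otherwise it is refused.
    """
    if table_owner == run_as or table_owner in may_act_as.get(run_as, set()):
        return table_owner
    raise PermissionError(
        f"run-as role {run_as!r} may not act as table owner {table_owner!r}"
    )


# Hypothetical setup: the replication role may act as alice and bob only.
may_act_as = {"repl": {"alice", "bob"}}

print(effective_role("repl", "alice", may_act_as))  # alice
print(effective_role("repl", "repl", may_act_as))   # repl
```

Since each change is applied as the table owner, code attached to the table (default expressions, triggers) runs with the owner's own privileges rather than those of the run-as role, and a table owned by a role outside the allowed set is rejected instead of being written to.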


I think it makes sense for 1) to be a fairly privileged user, but I
think it's good practice for that user to not be allowed to change the
system configuration etc.


> It can't be right to say that if you have 263 users in a database and
> you want to replicate the whole database to some other node, you need
> 263 different subscriptions with a different run-as user for each. You
> need to be able to run all of that logical replication as the
> superuser or some other high-privilege user and not end up with a
> security compromise.

I'm not quite following along here - are you thinking of 263 tables
owned by 263 users? If yes, that's why I am thinking that we need the
option to perform each table modification as the owner of that table
(with the same security restrictions we use for REINDEX etc).


> And if we suppose that that already works and is safe, well then
> what's the case where I do need a run-as user?

It's not at all safe today, IMO. You need to trust that nothing bad will
be replicated, otherwise the owner of the subscription has to be
considered compromised.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Tue, Jan 31, 2023 at 7:01 PM Andres Freund <andres@anarazel.de> wrote:
> I don't really understand that - the run-as approach seems like a
> necessary piece of improving the security model.
>
> I think it's perfectly reasonable to want to replicate from one system
> in another, but to not want to allow logical replication to insert into
> pg_class or whatnot. So not using superuser to execute the replication
> makes sense.
>
> This is particularly the case if you're just replicating a small part of
> the tables from one system to another. E.g. in a sharded setup, you may
> want to replicate metadata to servers.

I don't think that a system catalog should be considered a valid
replication target, no matter who owns the subscription, so ISTM that
writing to pg_class should be blocked regardless. The thing I'm
struggling to understand is: if you only want to replicate into tables
that Alice can write, why not just make Alice own the subscription?
For a run-as user to make sense, you need a scenario where we want the
replication to target only tables that Alice can touch, but we also
don't want Alice herself to be able to touch the subscription, so you
make Alice the run-as user and yourself the owner, or something like
that. But I'm not sure what that scenario is exactly.

Mark was postulating a scenario where the publisher and subscriber
don't trust each other. I was thinking a little bit more about that. I
still maintain that the current system is poorly set up to make that
work, but suppose we wanted to do better. We could add filtering on
the subscriber side, like you list schemas or specific relations that
you are or are not willing to replicate into. Then you could, for
example, connect your subscription to a certain remote publication,
but with the restriction that you're only willing to replicate into
the "headquarters" schema. Then we'll replicate whatever tables they
send us, but if the dorks at headquarters mess up the publications on
their end (intentionally or otherwise) and add some tables from the
"locally_controlled_stuff" schema, we'll refuse to replicate that into
our eponymous schema. I don't think this kind of system is well-suited
to environments where people are totally hostile to each other,
because you still need to have replication slots on the remote side
and stuff. Also, having the remote side decode stuff and ignoring it
locally is expensive, and I bet if we add stuff like this then people
will misuse it and be sad. But it would make the system easier to
reason about: I know for sure that this subscription will only write
to these places, because that's all I've given it permission to do.
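
As a rough illustration of the subscriber-side filtering described above, the check could be as simple as an allow-list lookup on the target schema. This is only a sketch under stated assumptions: the function name schema_is_allowed and the idea of storing the list as an array of strings are hypothetical and not part of any patch in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical subscriber-side filter: apply a replicated change only if
 * its target schema is on the subscription's allow-list.
 */
static bool
schema_is_allowed(const char *target_schema,
                  const char *const *allowed_schemas,
                  size_t nallowed)
{
    assert(target_schema != NULL);

    for (size_t i = 0; i < nallowed; i++)
    {
        if (strcmp(target_schema, allowed_schemas[i]) == 0)
            return true;
    }

    /* Anything not explicitly listed is refused. */
    return false;
}
```

With an allow-list containing only "headquarters", a change aimed at "locally_controlled_stuff" would be rejected no matter what the remote publication contains.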

In the sharding scenario you mention, if you want to prevent
accidental writes to unrelated tables due to the publication being not
what we expect, you can either make the subscription owned by the same
role that owns the sharded tables, or a special-purpose role that has
permission to write to exactly the set of tables that you expect to be
touched and no others. Or, if you had something like what I posited in
the last paragraph, you could use that instead. But I don't see how a
separate run-as user helps. If I'm just being super-dense here, I hope
that one of you will explain using short words. :-)

> I think we'll need two things to improve upon the current situation:
>
> 1) run-as user, to reduce the scope of potential danger
>
> 2) Option to run the database inserts as the owner of the table, with a
>    check that the run-as is actually allowed to perform work as the
>    owning role. That prevents escalation from table owner (who could add
>    default expressions etc) from getting the privs of the
>    run-as/replication owner.

I'm not quite sure what we do here now, but I agree that trigger
firing seems like a problem. It might be that we need to worry about
the user on the origin server, too. If Alice inserts a row that causes
a replicated table owned by Bob to fire a trigger or evaluate a
default expression or whatever due to the presence of a subscription
owned by Charlie, there is a risk that Alice might try to attack
either Bob or Charlie, or that Bob might try to attack Charlie.

> > And if we suppose that that already works and is safe, well then
> > what's the case where I do need a run-as user?
>
> It's not at all safe today, IMO. You need to trust that nothing bad will
> be replicated, otherwise the owner of the subscription has to be
> considered compromised.

What kinds of things are bad to replicate?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Feb 1, 2023, at 6:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:

> The thing I'm
> struggling to understand is: if you only want to replicate into tables
> that Alice can write, why not just make Alice own the subscription?
> For a run-as user to make sense, you need a scenario where we want the
> replication to target only tables that Alice can touch, but we also
> don't want Alice herself to be able to touch the subscription, so you
> make Alice the run-as user and yourself the owner, or something like
> that. But I'm not sure what that scenario is exactly.

This "run-as" idea came about because we didn't want arbitrary roles to be able to change the subscription's connection
string. A competing idea was to have a server object rather than a string, with roles like Alice being able to use the
server object if they have been granted usage privilege, and not otherwise.  So the "run-as" and "server" ideas were
somewhat competing.

> Mark was postulating a scenario where the publisher and subscriber
> don't trust each other. I was thinking a little bit more about that. I
> still maintain that the current system is poorly set up to make that
> work, but suppose we wanted to do better. We could add filtering on
> the subscriber side, like you list schemas or specific relations that
> you are or are not willing to replicate into. Then you could, for
> example, connect your subscription to a certain remote publication,
> but with the restriction that you're only willing to replicate into
> the "headquarters" schema. Then we'll replicate whatever tables they
> send us, but if the dorks at headquarters mess up the publications on
> their end (intentionally or otherwise) and add some tables from the
> "locally_controlled_stuff" schema, we'll refuse to replicate that into
> our eponymous schema.

That example is good, though I don't see how "filters" are better than roles+privileges.  Care to elaborate?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On Mon, Jan 30, 2023 at 2:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 30, 2023 at 4:12 PM Jacob Champion <jchampion@timescale.com> wrote:
> > For our case, assuming that connections have side effects, one
> > solution could be for the client to signal to the server that the
> > connection should use in-band authentication only; i.e. fail the
> > connection if the credentials provided aren't good enough by
> > themselves to authenticate the client. (This has some overlap with
> > SASL negotiation, maybe.)
>
> I'm not an expert on this stuff, but to me that feels like a weak and
> fuzzy concept. If the client is going to tell the server something,
> I'd much rather have it say something like "i'm proxying a request
> from my local user rhaas, who authenticated using such and such a
> method and connected from such and such an IP yadda yadda". That feels
> to me like really clear communication that the server can then be
> configured to do something about via pg_hba.conf or similar. Saying "use
> in-band authentication only", to me, feels much murkier. As the
> recipient of that message, I don't know exactly what to do about it,
> and it feels like whatever heuristic I adopt might end up being wrong
> and something bad happens anyway.

Is it maybe just a matter of terminology? If a proxy tells the server,
"This user is logging in. Here's the password I have for them. DO NOT
authenticate using anything else," and the HBA says to use ident auth
for that user, then the server fails the connection. That's what I
mean by in-band -- the proxy says, "here are the credentials for this
connection." That's it.

Alternatively, if you really don't like making this server-side: any
future "connection side effects" we add, such as logon triggers, could
either be opted into by the client or explicitly invoked by the client
after it's happy with the authentication exchange. Or it could be
disabled at the server side for forms of ambient authn. (This is
getting closer to HTTP's method safety concept.)

> > I agree. (But for the record, I think that an outbound proxy filter is
> > also insufficient. Someone, somewhere, is going to want to safely
> > proxy through localhost _and_ have peer authentication set up.)
>
> Well then they're indeed going to need some way to distinguish a
> proxied connection from a non-proxied one. You can't send identical
> connection requests in different scenarios and get different
> results....

Yeah. Most of these solutions require explicitly labelling things that
were implicit before.

> I think what we really need here is an example or three of a proposed
> configuration file syntax. I think it would be good if we could pick a
> syntax that doesn't require a super-complicated parser

Agreed. The danger from my end is, I'm trained on configuration
formats that have infinite bells and whistles. I don't really want to
go too crazy with it.

> and that maybe
> has something in common with our existing configuration file syntaxes.
> But if we have to invent something new, then we can do that.

Okay. Personally I'd like
- the ability to set options globally (so filters are optional)
- the ability to maintain many options for a specific scope (host? IP
range?) without making my config lines grow without bound
- the ability to audit a configuration without trusting its comments

But getting all of my wishlist into a sane configuration format that
handles all the use cases is the tricky part. I'll think about it.

--Jacob



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-02-01 09:43:39 -0500, Robert Haas wrote:
> On Tue, Jan 31, 2023 at 7:01 PM Andres Freund <andres@anarazel.de> wrote:
> > I don't really understand that - the run-as approach seems like a
> > necessary piece of improving the security model.
> >
> > I think it's perfectly reasonable to want to replicate from one system
> > in another, but to not want to allow logical replication to insert into
> > pg_class or whatnot. So not using superuser to execute the replication
> > makes sense.
> >
> > This is particularly the case if you're just replicating a small part of
> > the tables from one system to another. E.g. in a sharded setup, you may
> > want to replicate metadata to servers.
> 
> I don't think that a system catalog should be considered a valid
> replication target, no matter who owns the subscription, so ISTM that
> writing to pg_class should be blocked regardless.

The general point, IMO, is that in many setups you should use a
user with fewer privileges than a superuser.  It doesn't really matter
whether we have an ad-hoc restriction for system catalogs. More often
than not being able to modify other tables will give you a lot of
privileges too.


> The thing I'm struggling to understand is: if you only want to
> replicate into tables that Alice can write, why not just make Alice
> own the subscription?

Because it implies that the replication happens as a user that's
privileged enough to change the configuration of replication.


> Mark was postulating a scenario where the publisher and subscriber
> don't trust each other.

FWIW, I don't think this is mainly about "trust", but instead about
layering security / the principle of least privilege. The "run-as" user
(i.e. currently owner) is constantly performing work on behalf of a
remote node, including executing code (default clauses etc). To make it
harder to use such a cross-system connection to move from one system to
the next, it's a good idea to execute it in the least privileged context
possible. And I don't see why it'd need the permission to modify the
definition of the subscription and similar "admin" tasks.

It's not that such an extra layer would necessarily completely stop an
attacker. But it might delay them and make their attack more noisy.


Similarly, if I were to operate an important production environment
again, I'd not have relations owned by the [pseudo]superuser, but by a
user controlled by the [pseudo]superuser. That way somebody tricking the
superuser into a REINDEX or such only gets the ability to execute code
in a less privileged context.




> I was thinking a little bit more about that. I
> still maintain that the current system is poorly set up to make that
> work, but suppose we wanted to do better. We could add filtering on
> the subscriber side, like you list schemas or specific relations that
> you are or are not willing to replicate into.

Isn't that largely a duplication of the ACLs on relations etc?


> > I think we'll need two things to improve upon the current situation:
> >
> > 1) run-as user, to reduce the scope of potential danger
> >
> > 2) Option to run the database inserts as the owner of the table, with a
> >    check that the run-as is actually allowed to perform work as the
> >    owning role. That prevents escalation from table owner (who could add
> >    default expressions etc) from getting the privs of the
> >    run-as/replication owner.
> 
> I'm not quite sure what we do here now, but I agree that trigger
> firing seems like a problem. It might be that we need to worry about
> the user on the origin server, too. If Alice inserts a row that causes
> a replicated table owned by Bob to fire a trigger or evaluate a
> > default expression or whatever due to the presence of a subscription
> owned by Charlie, there is a risk that Alice might try to attack
> either Bob or Charlie, or that Bob might try to attack Charlie.

The attack on Bob exists without logical replication too - a REINDEX or
such is executed as the owner of the relation and re-evaluates index
expressions, constraints etc.  Given our security model I don't think we
can protect the relation owner if they trust somebody to insert rows, so
I don't really know what we can do to protect Charlie against Bob.



> > > And if we suppose that that already works and is safe, well then
> > > what's the case where I do need a run-as user?
> >
> > It's not at all safe today, IMO. You need to trust that nothing bad will
> > be replicated, otherwise the owner of the subscription has to be
> > considered compromised.
> 
> What kinds of things are bad to replicate?

I think that's unfortunately going to be specific to a setup.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-01-30 15:32:34 -0500, Robert Haas wrote:
> I had a long think about what to do with ALTER SUBSCRIPTION ... OWNER
> TO in terms of permissions checks. The previous version required that
> the new owner have permissions of pg_create_subscription, but there
> seems to be no particular reason for that rule except that it happens
> to be what I made the code do. So I changed it to say that the current
> owner must have CREATE privilege on the database, and must be able to
> SET ROLE to the new owner. This matches the rule for CREATE SCHEMA.
> Possibly we should *additionally* require that the person performing
> the rename still have pg_create_subscription, but that shouldn't be
> the only requirement.

As long as owner and run-as are the same, I think it's strongly
preferable to *not* require pg_create_subscription.


> There seems to be a good deal of inconsistency here. If you want to
> give someone a schema, YOU need CREATE on the database. But if you
> want to give someone a table, THEY need CREATE on the containing
> schema. It makes sense that we check permissions on the containing
> object, which could be a database or a schema depending on what you're
> renaming, but it's unclear to me why we sometimes check on the person
> performing the ALTER command and at other times on the recipient. It's
> also somewhat unclear to me why we are checking CREATE in the first
> place, especially on the donor. It might make sense to have a rule
> that you can't own an object in a place where you couldn't have
> created it, but there is no such rule, because you can give someone
> CREATE on a schema, they can create an object, and then you can take
> CREATE away and they still own an object there. So it kind of looks
> to me like we made it up as we went along and that the result isn't
> very consistent, but I'm inclined to follow CREATE SCHEMA here unless
> there's some reason to do otherwise.

Yuck. No idea what the best policy around this is.


> Another question around ALTER SUBSCRIPTION ... OWNER TO and also ALTER
> SUBSCRIPTION .. RENAME is whether they ought to fail if you're not a
> superuser and password_required false is set.

I don't really see a benefit in allowing it, so I'm inclined to go for
the more restrictive option. But this is a really weakly held opinion.



> > > If there is, I think we could fix it by moving the LockSharedObject call up
> > > higher, above object_ownercheck. The only problem with that is it lets you
> > > lock an object on which you have no permissions: see
> > > 2ad36c4e44c8b513f6155656e1b7a8d26715bb94. To really fix that, we'd need an
> > > analogue of RangeVarGetRelidExtended.
> >
> > Yea, we really should have something like RangeVarGetRelidExtended() for other
> > kinds of objects. It'd take a fair bit of work / time to use it widely, but
> > it'll take even longer if we start in 5 years ;)
>
> We actually have something sort of like that in the form of
> get_object_address(). It doesn't allow for a callback, but it does
> have a retry loop.

Hm, sure looks like that code doesn't do any privilege checking...


> @@ -1269,13 +1270,19 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
>                                      slotname,
>                                      NAMEDATALEN);
>
> +    /* Is the use of a password mandatory? */
> +    must_use_password = MySubscription->passwordrequired &&
> +        !superuser_arg(MySubscription->owner);

There's a few repetitions of this - perhaps worth putting into a helper?
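
A minimal sketch of what such a helper might look like, assuming the same condition as the quoted hunk. The name subscription_must_use_password is hypothetical, and the Oid, Subscription and superuser_arg definitions below are simplified stand-ins for the real PostgreSQL ones so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for PostgreSQL's Oid and Subscription types. */
typedef unsigned int Oid;

typedef struct Subscription
{
    Oid     owner;              /* role that owns the subscription */
    bool    passwordrequired;   /* value of the password_required option */
} Subscription;

/* Stub for superuser_arg(): pretend role 10 is the only superuser. */
static bool
superuser_arg(Oid roleid)
{
    return roleid == 10;
}

/*
 * Hypothetical helper: a password is mandatory only when the subscription
 * requests one and its owner is not a superuser.
 */
static bool
subscription_must_use_password(const Subscription *sub)
{
    assert(sub != NULL);
    return sub->passwordrequired && !superuser_arg(sub->owner);
}
```

Each call site could then pass MySubscription to the helper instead of repeating the expression.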


> @@ -180,6 +181,13 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
>       if (PQstatus(conn->streamConn) != CONNECTION_OK)
>               goto bad_connection_errmsg;
>
> +     if (must_use_password && !PQconnectionUsedPassword(conn->streamConn))
> +             ereport(ERROR,
> +                             (errcode(ERRCODE_S_R_E_PROHIBITED_SQL_STATEMENT_ATTEMPTED),
> +                              errmsg("password is required"),
> +                              errdetail("Non-superuser cannot connect if the server does not request a password."),
> +                              errhint("Target server's authentication method must be changed. or set password_required=false in the subscription attributes.")));
> +
>       if (logical)
>       {
>               PGresult   *res;

This still leaks the connection on error, no?
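
The concern is that ereport(ERROR) does not return to the caller, so a connection opened before the check is never closed. A minimal sketch of the usual remedy, releasing the connection before reporting the failure, is shown below. It uses stand-in types rather than libpq: FakeConn, fake_connect, fake_finish and connect_checked are all hypothetical, and in the real code the release step would presumably be a PQfinish() on the stream connection ahead of the ereport().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a libpq connection; only tracks whether a password was used. */
typedef struct FakeConn
{
    bool    used_password;
} FakeConn;

/* Number of connections currently open, so that leaks are observable. */
static int  open_conns = 0;

static FakeConn *
fake_connect(bool used_password)
{
    FakeConn   *conn = malloc(sizeof(FakeConn));

    if (conn == NULL)
        return NULL;
    conn->used_password = used_password;
    open_conns++;
    return conn;
}

static void
fake_finish(FakeConn *conn)
{
    free(conn);
    open_conns--;
}

/*
 * Connect and enforce the password rule.  On failure the connection is
 * closed *before* the error is reported, so nothing is left open.
 */
static FakeConn *
connect_checked(bool server_asked_for_password, bool must_use_password,
                const char **errmsg)
{
    FakeConn   *conn;

    assert(errmsg != NULL);

    conn = fake_connect(server_asked_for_password);
    if (conn == NULL)
    {
        *errmsg = "out of memory";
        return NULL;
    }

    if (must_use_password && !conn->used_password)
    {
        fake_finish(conn);      /* release first, then report */
        *errmsg = "password is required";
        return NULL;
    }

    *errmsg = NULL;
    return conn;
}
```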

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Feb 1, 2023 at 1:09 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> > On Feb 1, 2023, at 6:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> > The thing I'm
> > struggling to understand is: if you only want to replicate into tables
> > that Alice can write, why not just make Alice own the subscription?
> > For a run-as user to make sense, you need a scenario where we want the
> > replication to target only tables that Alice can touch, but we also
> > don't want Alice herself to be able to touch the subscription, so you
> > make Alice the run-as user and yourself the owner, or something like
> > that. But I'm not sure what that scenario is exactly.
>
> This "run-as" idea came about because we didn't want arbitrary roles to be able to change the subscription's
> connection string.  A competing idea was to have a server object rather than a string, with roles like Alice being able
> to use the server object if they have been granted usage privilege, and not otherwise.  So the "run-as" and "server"
> ideas were somewhat competing.

As far as not changing the connection string goes, a few more ideas
have entered the fray: the current patch uses a password_required
property that is modelled on postgres_fdw, and I've elsewhere proposed
a reverse pg_hba.conf.

IMHO, for the use cases that I can imagine, the reverse pg_hba.conf
idea feels better than all competitors, because it's the only idea
that lets you define a class of acceptable connection strings. Jeff's
idea of a separate connection object is fine if you have a specific,
short list of connection strings and you want to allow those and
disallow everything else, and there may be cases where people want
that, and that's fine, but my guess is that it's overly restrictive in
a lot of environments. The password_required property has the virtue
of being compatible with what we do in other places right now, and of
preventing wraparound-to-superuser attacks effectively, but it's
totally unconfigurable and that sucks. The runas user idea gives you
some control over who is allowed to set the connection string, but it
doesn't help you delegate that to a non-superuser, because the idea
there is that you want the non-superuser to be able to set connection
strings that are OK with the actual superuser but not others.

I think part of my confusion here is that I thought that the point of
the runas user was to defend against logical replication itself
changing the connection string, and I don't see how it would do that.
It's just moving rows around. If the point is that somebody who can
log in as the runas user might change the connection string to
something we don't like, that makes somewhat more sense. I think I had
in my head that you wouldn't use someone's actual login role to run
logical replication, but rather some role specifically set up for that
purpose. In that scenario, nobody's running SQL commands as the runas
user, so even if they also own the subscription, there's no way for it
to get modified.

> > Mark was postulating a scenario where the publisher and subscriber
> > don't trust each other. I was thinking a little bit more about that. I
> > still maintain that the current system is poorly set up to make that
> > work, but suppose we wanted to do better. We could add filtering on
> > the subscriber side, like you list schemas or specific relations that
> > you are or are not willing to replicate into. Then you could, for
> > example, connect your subscription to a certain remote publication,
> > but with the restriction that you're only willing to replicate into
> > the "headquarters" schema. Then we'll replicate whatever tables they
> > send us, but if the dorks at headquarters mess up the publications on
> > their end (intentionally or otherwise) and add some tables from the
> > "locally_controlled_stuff" schema, we'll refuse to replicate that into
> > our eponymous schema.
>
> That example is good, though I don't see how "filters" are better than roles+privileges.  Care to elaborate?

I'm not sure that they are. Are we assuming that the user who is
creating subscriptions is also powerful enough to create roles and
give them just the required amount of privilege? If so, it seems like
they might as well just do it that way. And maybe we should assume
that, because in most cases, a dedicated replication role makes more
sense to me than running replication under some role that you're also
using for other things. On the other hand, I bet a lot of people today
are just running replication as a superuser, in which case maybe this
could be useful? This whole idea was mostly just me spitballing to see
what other people thought. I'm not wild about the complexity involved
for what you get out of it, so if we don't need it, that's more than
fine with me.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Feb 1, 2023 at 3:37 PM Andres Freund <andres@anarazel.de> wrote:
> The general point, IMO, is that in many setups you should use a
> user with fewer privileges than a superuser.  It doesn't really matter
> whether we have an ad-hoc restriction for system catalogs. More often
> than not being able to modify other tables will give you a lot of
> privileges too.

I don't know what you mean by this. DML doesn't confer privileges. If
code gets executed and runs with the replication user's credentials,
that could lead to privilege escalation, but just moving rows around
doesn't, at least not in the database sense. It might confer
unanticipated real-world benefits, like if you can update your own
salary or something, but in the context of replication you have to
have had the ability to do that on some other node anyway. If that
change wasn't supposed to get replicated to the local node, then why
are we using replication? Or why is that table in the publication? I'm
confused.

> > The thing I'm struggling to understand is: if you only want to
> > replicate into tables that Alice can write, why not just make Alice
> > own the subscription?
>
> Because it implies that the replication happens as a user that's
> privileged enough to change the configuration of replication.

But again, replication is just about inserting, updating, and deleting
rows. To change the replication configuration, you have to be able to
parlay that into the ability to execute code. That's why I think
trigger security is really important. But I'm wondering if there's
some way we can handle that that doesn't require us to make a decision
about a run-as user. For instance, if firing triggers as the table
owner is an acceptable solution, then the only thing that the run-as
user is actually controlling is which tables we're willing to
replicate into in the first place (unless there's some other way that
logical replication can run arbitrary code). The name almost becomes a
misnomer in that case. It's not a run-as user, it's
use-this-user's-permissions-to-see-if-I-should-fail-replication user.

> > I was thinking a little bit more about that. I
> > still maintain that the current system is poorly set up to make that
> > work, but suppose we wanted to do better. We could add filtering on
> > the subscriber side, like you list schemas or specific relations that
> > you are or are not willing to replicate into.
>
> Isn't that largely a duplication of the ACLs on relations etc?

Yeah, maybe.

> > I'm not quite sure what we do here now, but I agree that trigger
> > firing seems like a problem. It might be that we need to worry about
> > the user on the origin server, too. If Alice inserts a row that causes
> > a replicated table owned by Bob to fire a trigger or evaluate a
> > default expression or whatever due to the presence of a subscription
> > owned by Charlie, there is a risk that Alice might try to attack
> > either Bob or Charlie, or that Bob might try to attack Charlie.
>
> The attack on Bob exists without logical replication too - a REINDEX or
> such is executed as the owner of the relation and re-evaluates index
> expressions, constraints etc.  Given our security model I don't think we
> can protect the relation owner if they trust somebody to insert rows, so
> I don't really know what we can do to protect Charlie against Bob.

Yikes.

> > > > And if we suppose that that already works and is safe, well then
> > > > what's the case where I do need a run-as user?
> > >
> > > It's not at all safe today, IMO. You need to trust that nothing bad will
> > > be replicated, otherwise the owner of the subscription has to be
> > > considered compromised.
> >
> > What kinds of things are bad to replicate?
>
> I think that's unfortunately going to be specific to a setup.

Can you give an example?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-02-02 09:28:03 -0500, Robert Haas wrote:
> I don't know what you mean by this. DML doesn't confer privileges. If
> code gets executed and runs with the replication user's credentials,
> that could lead to privilege escalation, but just moving rows around
> doesn't, at least not in the database sense.

Executing DML ends up executing code. Think predicated/expression
indexes, triggers, default expressions etc. If a badly written trigger
etc can be tricked to do arbitrary code exec, an attacker will be able to
run with the privs of the run-as user.  How bad that is is influenced to
some degree by the amount of privileges that user has.

Greetings,

Andres Freund



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Wed, Feb 1, 2023 at 3:37 PM Jacob Champion <jchampion@timescale.com> wrote:
> > I'm not an expert on this stuff, but to me that feels like a weak and
> > fuzzy concept. If the client is going to tell the server something,
> > I'd much rather have it say something like "i'm proxying a request
> > from my local user rhaas, who authenticated using such and such a
> > method and connected from such and such an IP yadda yadda". That feels
> > to me like really clear communication that the server can then be
> > configured to do something about via pg_hba.conf or similar. Saying "use
> > in-band authentication only", to me, feels much murkier. As the
> > recipient of that message, I don't know exactly what to do about it,
> > and it feels like whatever heuristic I adopt might end up being wrong
> > and something bad happens anyway.
>
> Is it maybe just a matter of terminology? If a proxy tells the server,
> "This user is logging in. Here's the password I have for them. DO NOT
> authenticate using anything else," and the HBA says to use ident auth
> for that user, then the server fails the connection. That's what I
> mean by in-band -- the proxy says, "here are the credentials for this
> connection." That's it.

I don't think that's quite the right concept. It seems to me that the
client is responsible for informing the server of what the situation
is, and the server is responsible for deciding whether to allow the
connection. In your scenario, the client is not only communicating
information ("here's the password I have got") but also making demands
on the server ("DO NOT authenticate using anything else"). I like the
first part fine, but not the second part.

Consider the scenario where somebody wants to allow a connection that
is proxied and does not require a password. For example, maybe I have
a group of three machines that all mutually trust each other and the
network is locked down so that we need not worry about IP spoofing or
whatever. Just to be doubly sure, they all have SSL certificates so that
they can verify that an incoming connection is from one of the other
trusted machines. I, as the administrator, want to configure things so
that each machine will proxy connections to the others as long as
local user = remote user. When the remote machine receives the
connection, it can trust that the request is legitimate provided that
the SSL certificate is successfully verified.

The way I think this should work is, first, on each machine, in the
proxy configuration, there should be a rule that says "only proxy
connections where local user = remote user" (and any other rules I
want to enforce). Second, in the HBA configuration, there should be a
rule that says "if somebody is trying to proxy a connection, it has to
be for one of these IPs and they have to authenticate using an SSL
certificate". In this kind of scenario, the client has no business
demanding that the server authenticate using the password rather than
anything else. The server, not the client, is in charge of deciding
which connections to accept; the client's job is only to decide which
connections to proxy. And the human being is responsible for making
sure that the combination of those two things implements the intended
security policy.
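
A minimal sketch of that split of responsibilities, assuming a Python
model in which every name (ProxyRequest, TRUSTED_PROXIES, the address
range) is hypothetical and nothing corresponds to an existing PostgreSQL
setting. The proxy-side rule and the HBA-side rule are two independent
checks, and the connection goes through only if both pass:

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class ProxyRequest:
    local_user: str       # user on the proxying machine
    remote_user: str      # user requested on the target server
    proxy_ip: str         # address the proxied connection comes from
    cert_verified: bool   # did the proxy's SSL certificate verify?


# Hypothetical range holding the mutually trusting machines.
TRUSTED_PROXIES = [ip_network("10.0.0.0/29")]


def proxy_allows(req: ProxyRequest) -> bool:
    """Proxy-side rule: only proxy when local user = remote user."""
    return req.local_user == req.remote_user


def server_allows(req: ProxyRequest) -> bool:
    """HBA-side rule: proxied connections must come from a trusted IP
    and must authenticate with a verified SSL certificate."""
    from_trusted = any(ip_address(req.proxy_ip) in net for net in TRUSTED_PROXIES)
    return from_trusted and req.cert_verified


def connection_succeeds(req: ProxyRequest) -> bool:
    """Each side decides independently; both must agree."""
    return proxy_allows(req) and server_allows(req)
```

Neither function knows about the other's rule, which is the point of
the scenario: the combination is what the administrator has to get
right.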

> Agreed. The danger from my end is, I'm trained on configuration
> formats that have infinite bells and whistles. I don't really want to
> go too crazy with it.

Yeah. If I remember my math well enough, the time required to
implement infinite bells and whistles will also be infinite, and as a
wise man once said, real artists ship.

It does seem like a good idea, if we can, to make the configuration
file format flexible enough that we can easily extend it with more
bells and whistles later if we so choose. But realistically most
people are going to have very simple configurations.

> > and that maybe
> > has something in common with our existing configuration file syntaxes.
> > But if we have to invent something new, then we can do that.
>
> Okay. Personally I'd like
> - the ability to set options globally (so filters are optional)
> - the ability to maintain many options for a specific scope (host? IP
> range?) without making my config lines grow without bound
> - the ability to audit a configuration without trusting its comments
>
> But getting all of my wishlist into a sane configuration format that
> handles all the use cases is the tricky part. I'll think about it.

Nobody seemed too keen on my proposal of a bunch of tab-separated
fields; maybe we're all traumatized from pg_hba.conf and should look
for something more complex with a real parser. I thought that
tab-separated fields might be good enough and simple to implement, but
it doesn't matter how simple it is to implement if nobody likes it. We
could do something that looks more like a series of if-then rules,
e.g.

target-host 127.0.0.0/8 => reject
authentication-method scram => accept
reject

But it's only a hop, skip and a jump from there to something that
looks an awful lot like a full-blown programming language, and maybe
that's even the right idea, but, oh, the bike-shedding!
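
For illustration only, here is one way the three example rules could be
evaluated. This sketch assumes first-match-wins semantics and a
catch-all final rule, neither of which has been settled in this thread,
and the function and field names are made up:

```python
from ipaddress import ip_address, ip_network

# The three example rules, in order. A field of None means "always matches".
RULES = [
    ("target-host", "127.0.0.0/8", "reject"),
    ("authentication-method", "scram", "accept"),
    (None, None, "reject"),
]


def matches(field, pattern, conn):
    """Return True if the connection attributes satisfy one rule."""
    if field is None:
        return True
    if field == "target-host":
        return ip_address(conn["target-host"]) in ip_network(pattern)
    return conn.get(field) == pattern


def decide(conn, rules=RULES):
    """Assumed semantics: the first matching rule determines the outcome."""
    for field, pattern, action in rules:
        if matches(field, pattern, conn):
            return action
    return "reject"  # fail closed if no rule matched
```

Under these assumptions a SCRAM connection to a loopback address is
still rejected, because the target-host rule is checked first.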

Cue someone to suggest that it's about time we embed a Lua interpreter.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Feb 3, 2023 at 3:47 AM Andres Freund <andres@anarazel.de> wrote:
> On 2023-02-02 09:28:03 -0500, Robert Haas wrote:
> > I don't know what you mean by this. DML doesn't confer privileges. If
> > code gets executed and runs with the replication user's credentials,
> > that could lead to privilege escalation, but just moving rows around
> > doesn't, at least not in the database sense.
>
> Executing DML ends up executing code. Think predicated/expression
> indexes, triggers, default expressions etc. If a badly written trigger
> etc can be tricked to do arbitrary code exec, an attacker will be able to
> run with the privs of the run-as user.  How bad that is is influenced to
> some degree by the amount of privileges that user has.

I spent some time studying this today. I think you're right. What I'm
confused about is: why do we consider this situation even vaguely
acceptable? Isn't this basically an admission that our logical
replication security model is completely and totally broken and we
need to fix it somehow and file for a CVE number? Like, in released
branches, you can't even have a subscription owned by a non-superuser.
But any non-superuser can set a default expression or create an enable
always trigger and sure enough, if that table is replicated, the
system will run that trigger as the subscription owner, who is a
superuser. Which AFAICS means that if a non-superuser owns a table
that is part of a subscription, they can instantly hack superuser.
Which seems, uh, extremely bad. Am I missing something?

Based on other remarks you made upthread, it seems like we ought to be
doing the actual replication as the table owner, since the table owner
has to be prepared for executable code attached to the table to be
re-run on rows in the table at any time when somebody does a REINDEX.
And then, in master, where there's some provision for non-superuser
subscription owners, we maybe need to re-think the privileges required
to replicate into a table in the first place. I don't think that
having I/U/D permissions on a table is really sufficient to justify
performing those operations *as the table owner*; perhaps the check
ought to be whether you have the privileges of the table owner.
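
To make the difference between the two checks concrete, here is a small
model. It is only a sketch: the ACL and membership dictionaries, the
role names, and the function names are all hypothetical, and it ignores
superusers and everything else the real permission code would have to
consider.

```python
DML = {"INSERT", "UPDATE", "DELETE"}


def has_dml(role, table_acl):
    """The weaker check: role holds I/U/D on the table."""
    return DML <= table_acl.get(role, set())


def has_privileges_of(role, target, memberships):
    """The stronger check: role is target, or is (transitively) a member
    of target. memberships maps a role to the roles it is a member of."""
    seen, stack = {role}, [role]
    while stack:
        for granted in memberships.get(stack.pop(), ()):
            if granted not in seen:
                seen.add(granted)
                stack.append(granted)
    return target in seen


def may_replicate_into(subscription_owner, table_owner, memberships):
    """Proposed rule: apply as the table owner only if the subscription
    owner has the privileges of the table owner."""
    return has_privileges_of(subscription_owner, table_owner, memberships)
```

A role that was merely granted INSERT, UPDATE and DELETE passes
has_dml() but not may_replicate_into().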

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-02-06 14:07:39 -0500, Robert Haas wrote:
> On Fri, Feb 3, 2023 at 3:47 AM Andres Freund <andres@anarazel.de> wrote:
> > On 2023-02-02 09:28:03 -0500, Robert Haas wrote:
> > > I don't know what you mean by this. DML doesn't confer privileges. If
> > > code gets executed and runs with the replication user's credentials,
> > > that could lead to privilege escalation, but just moving rows around
> > > doesn't, at least not in the database sense.
> >
> > Executing DML ends up executing code. Think predicated/expression
> > indexes, triggers, default expressions etc. If a badly written trigger
> > etc can be tricked to do arbitrary code exec, an attacker will be able to
> > run with the privs of the run-as user.  How bad that is is influenced to
> > some degree by the amount of privileges that user has.
> 
> I spent some time studying this today. I think you're right. What I'm
> confused about is: why do we consider this situation even vaguely
> acceptable? Isn't this basically an admission that our logical
> replication security model is completely and totally broken and we
> need to fix it somehow and file for a CVE number? Like, in released
> branches, you can't even have a subscription owned by a non-superuser.
> But any non-superuser can set a default expression or create an enable
> always trigger and sure enough, if that table is replicated, the
> system will run that trigger as the subscription owner, who is a
> superuser. Which AFAICS means that if a non-superuser owns a table
> that is part of a subscription, they can instantly hack superuser.
> Which seems, uh, extremely bad. Am I missing something?

It's decidedly not great, yes. I don't know if it's quite a CVE type issue,
after all, the same is true for any other type of query the superuser
executes. But at the very least the documentation needs to be better, with a
big red box making sure the admin is aware of the problem.

I think we need some more fundamental ways to deal with this issue, including
but not restricted to the replication context. Some potentially relevant
discussion is in this thread:
https://postgr.es/m/75b0dbb55e9febea54c441efff8012a6d2cb5bd7.camel%40j-davis.com

I don't agree with Jeff's proposal, but I think there's some worthwhile ideas
in the idea + followups.


> And then, in master, where there's some provision for non-superuser
> subscription owners, we maybe need to re-think the privileges required
> to replicate into a table in the first place. I don't think that
> having I/U/D permissions on a table is really sufficient to justify
> performing those operations *as the table owner*; perhaps the check
> ought to be whether you have the privileges of the table owner.

Yes, I think we ought to check role membership, including non-inherited
memberships.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Feb 6, 2023 at 2:18 PM Andres Freund <andres@anarazel.de> wrote:
> It's decidedly not great, yes. I don't know if it's quite a CVE type issue,
> after all, the same is true for any other type of query the superuser
> executes. But at the very least the documentation needs to be better, with a
> big red box making sure the admin is aware of the problem.

I don't think that's the same thing at all. A superuser executing a
query interactively can indeed cause all sorts of bad things to
happen, but you don't have to log in as superuser and run DML queries
on tables owned by unprivileged users, and you shouldn't.

But what we're talking about here is -- the superuser comes along and
sets up logical replication in the configuration in what seems to be
exactly the way it was intended to be used, and now any user who can
log into the subscriber node can become superuser for free whenever
they want, without the superuser doing anything at all, even logging
in. Saying it's "not ideal" seems like you're putting it in the same
category as "the cheese got moldy in the fridge" but to me it sounds
more like "the fridge exploded and the house is on fire."

If we were to document this, I assume that the warning we would add to
the documentation would look like this:

<-- begin documentation text -->
Pretty much don't ever use logical replication. In any normal
configuration, it lets every user on your system escalate to superuser
whenever they want. It is possible to make it safe, if you make sure
all the tables on the replica are owned by the superuser and none of
them have any triggers, defaults, expression indexes, or anything else
associated with them that might execute any code while replicating.
But notice that this makes logical replication pretty much useless for
one of its intended purposes, which is high availability, because if
you actually fail over, you're going to then have to change the owners
of all of those tables and apply any missing triggers, defaults,
expression indexes, or anything like that which you may want to have.
And then to fail back you're going to have to remove all of that stuff
again and once again make the tables superuser-owned. That's obviously
pretty impractical, so you probably shouldn't use logical replication
at all until we get around to fixing this. You might wonder why we
implemented a feature that can't be used in any kind of normal way
without completely and totally breaking your system security -- but
don't ask us, we don't know, either!
<-- end documentation text -->

Honestly, this makes the CREATEROLE exploit that I fixed recently in
master look like a walk in the park. Sure, it's a pain for service
providers, who might otherwise use it, but most normal users don't and
wouldn't no matter how it worked, and really are not going to care.
But people do use logical replication, and it seems to me that the
issue you're describing here means that approximately 100% of those
installations have a vulnerability allowing any local user who owns a
table or can create one to escalate to superuser. Far from being not
quite a CVE issue, that seems substantially more serious than most
things we get CVEs for.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Feb 1, 2023 at 4:02 PM Andres Freund <andres@anarazel.de> wrote:
> On 2023-01-30 15:32:34 -0500, Robert Haas wrote:
> > I had a long think about what to do with ALTER SUBSCRIPTION ... OWNER
> > TO in terms of permissions checks.
>
> As long as owner and run-as are the same, I think it's strongly
> preferable to *not* require pg_create_subscription.

OK.

> > Another question around ALTER SUBSCRIPTION ... OWNER TO and also ALTER
> > SUBSCRIPTION .. RENAME is whether they ought to fail if you're not a
> > superuser and password_required false is set.
>
> I don't really see a benefit in allowing it, so I'm inclined to go for
> the more restrictive option. But this is a really weakly held opinion.

I went back and forth on this and ended up with what you propose here.
It's simpler to explain this way.

> > +     /* Is the use of a password mandatory? */
> > +     must_use_password = MySubscription->passwordrequired &&
> > +             !superuser_arg(MySubscription->owner);
>
> There's a few repetitions of this - perhaps worth putting into a helper?

I don't think so. It's slightly different each time, because it's
pulling data out of different data structures.

> This still leaks the connection on error, no?

I've attempted to fix this in v4, attached.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On 2/6/23 08:22, Robert Haas wrote:
> I don't think that's quite the right concept. It seems to me that the
> client is responsible for informing the server of what the situation
> is, and the server is responsible for deciding whether to allow the
> connection. In your scenario, the client is not only communicating
> information ("here's the password I have got") but also making demands
> on the server ("DO NOT authenticate using anything else"). I like the
> first part fine, but not the second part.

For what it's worth, making a negative demand during authentication is
pretty standard: if you visit example.com and it tells you "I need your
OS login password and Social Security Number to authenticate you," you
have the option of saying "no thanks" and closing the tab.

It's not really about protecting the server at that point; the server
can protect itself. It's about protecting *you*. Allowing the proxy to
pin a specific set of authentication details to the connection is just a
way for it to close the tab on a server that would otherwise pull some
other piece of ambient authority out of it.

In a hypothetical world where the server presented the client with a
list of authentication options before allowing any access, this would
maybe be a little less convoluted to solve. For example, a proxy seeing
a SASL list of

- ANONYMOUS
- EXTERNAL

could understand that both methods allow the client to assume the
authority of the proxy itself. So if its client isn't allowed to do
that, the proxy realizes something is wrong (either it, or its target
server, has been misconfigured or is under attack), and it can close the
connection *before* the server runs login triggers.
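
A sketch of that proxy-side decision, purely hypothetical since the
server does not present such a list today. The mechanism set and the
function names are assumptions, and so is the choice to keep going
when at least one non-ambient mechanism is on offer:

```python
# Mechanisms that would let the client borrow the proxy's own authority.
AMBIENT_MECHANISMS = {"ANONYMOUS", "EXTERNAL"}


def usable_mechanisms(offered, client_may_use_proxy_authority):
    """Filter the server's list down to what the proxy may use on
    behalf of this particular client."""
    if client_may_use_proxy_authority:
        return list(offered)
    return [m for m in offered if m not in AMBIENT_MECHANISMS]


def should_close(offered, client_may_use_proxy_authority):
    """Close the connection before authenticating if nothing safe remains."""
    return not usable_mechanisms(offered, client_may_use_proxy_authority)
```

With the list above and a client that is not allowed to act with the
proxy's authority, should_close() returns True.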

> In this kind of scenario, the client has no business
> demanding that the server authenticate using the password rather than
> anything else. The server, not the client, is in charge of deciding
> which connections to accept; the client's job is only to decide which
> connections to proxy.

This sounds like a reasonable separation of responsibilities on the
surface, but I think it's subtly off. The entire confused-deputy problem
space revolves around the proxy being unable to correctly decide which
connections to allow unless it also knows why the connections are being
authorized.

You've constructed an example where that's not a concern: everything's
symmetrical, all proxies operate with the same authority, and internal
users are identical to external users. But the CVE that led to the
password requirement, as far as I can tell, dealt with asymmetry. The
proxy had the authority to connect locally to a user, and the clients
had the authority to connect to other machines' users, but those users
weren't the same and were not mutually trusting.

> And the human being is responsible for making
> sure that the combination of those two things implements the intended
> security policy.

Sure, but upthread it was illustrated how difficult it is for even the
people implementing the protocol to reason through what's safe and
what's not.

The primitives we're providing in the protocol are, IMO, difficult to
wield safely for more complex use cases. We can provide mitigations, and
demand that the DBA reason through every combination, and tell them
"don't do that" when they screw up or come across a situation that our
mitigations can't paper over. But I think we can solve the root problem
instead.

> We
> could do something that looks more like a series of if-then rules,
> e.g.
> 
> target-host 127.0.0.0/8 => reject
> authentication-method scram => accept
> reject

Yeah, I think something based on allow/deny is going to be the most
intuitive.

> But it's only a hop, skip and a jump from there to something that
> looks an awful lot like a full-blown programming language, and maybe
> that's even the right idea, but, oh, the bike-shedding!

Eh. Someone will demand Turing-completeness eventually, but you don't
have to listen. :D

--Jacob



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2023-02-06 at 14:40 -0500, Robert Haas wrote:
> On Mon, Feb 6, 2023 at 2:18 PM Andres Freund <andres@anarazel.de>
> wrote:
> > It's decidedly not great, yes. I don't know if it's quite a CVE
> > type issue,
> > after all, the same is true for any other type of query the
> > superuser
> > executes. But at the very least the documentation needs to be
> > better, with a
> > big red box making sure the admin is aware of the problem.
>
> I don't think that's the same thing at all. A superuser executing a
> query interactively can indeed cause all sorts of bad things to
> happen, but you don't have to log in as superuser and run DML queries
> on tables owned by unprivileged users, and you shouldn't.

There are two questions:

1. Is the security situation with logical replication bad? Yes. You
nicely summarized just how bad.

2. Is it the same situation as accessing a table owned by a user you
don't absolutely trust?

Regardless of how the second question is answered, it won't diminish
your point that logical replication is in a bad state. If another
situation is also bad, we should fix that too.

And I think the DML situation is really bad, too. Anyone reading our
documentation would find extensive explanations about GRANT/REVOKE, and
puzzle over the fine details of exactly how much they trust user foo.
Do I trust foo enough for WITH GRANT OPTION? Does foo really need to
see all of the columns of this table, or just a subset?

But there's no obvious mention that user foo must trust you absolutely
in order to exercise the GRANT at all, because you (as table owner) can
trivially cause foo to execute arbitrary code. There's no warning or
hint or suggestion at runtime to know that you are about to execute
someone else's code with your privileges or that it might be dangerous.

It gets worse. Let's say that user foo figures that out, and they're
extra cautious to SET SESSION AUTHORIZATION or SET ROLE to drop their
privileges before accessing a table. No good: the table owner can just
craft their arbitrary code with a "RESET SESSION AUTHORIZATION" or a
"RESET ROLE" at the top, and the code will still execute with the
privileges of user foo.

So I don't think "shouldn't" is quite good enough. In the first place,
the user needs to know that the risk exists. Second, what if they
actually do want to access a table owned by someone else for whatever
reason -- how do they do that safely?

I can't resist mentioning that these are all SECURITY INVOKER problems.
SECURITY INVOKER is insecure unless the invoker absolutely trusts the
definer, and that only really makes sense if the definer is a superuser
(or something very close). That's why we keep adding exceptions with
SECURITY_RESTRICTED_OPERATION, which is really just a way to silently
ignore the SECURITY INVOKER label and use SECURITY DEFINER instead.

At some point we need to ask: "when is SECURITY INVOKER both safe and
useful?" and contain it to those cases, rather than silently ignoring
it in an expanding list of cases.

I know that the response here is that SECURITY DEFINER is somehow
worse. Maybe for superuser-defined functions, it is. But basically, the
problems with SECURITY DEFINER all amount to "the author of the code
needs to be careful", which is a lot more intuitive than the problems
with SECURITY INVOKER.

Another option is having some kind of SECURITY NONE that would run the
code as a very limited-privilege user that can basically only access
the catalog. That would be useful for running default expressions and
the like without the definer or invoker needing to be careful.


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Feb 22, 2023, at 9:18 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> Another option is having some kind of SECURITY NONE that would run the
> code as a very limited-privilege user that can basically only access
> the catalog. That would be useful for running default expressions and
> the like without the definer or invoker needing to be careful.

Another option is to execute under the intersection of their privileges,
where both the definer and the invoker need the privileges in order for
the action to succeed.  That would be more permissive than the proposed
SECURITY NONE, while still preventing either party from hijacking
privileges of the other.
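
A minimal model of the intersection idea, assuming privileges can be
represented as (privilege, object) pairs; the representation and the
function names are hypothetical:

```python
def effective_privileges(definer_privs, invoker_privs):
    """Code runs with only the privileges that both parties hold."""
    return frozenset(definer_privs) & frozenset(invoker_privs)


def can_perform(action, definer_privs, invoker_privs):
    """An action succeeds only if definer and invoker could each do it."""
    return action in effective_privileges(definer_privs, invoker_privs)
```

Anything that only one of the two roles could do on its own is
refused, so neither side gains from the other's privileges.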

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2023-02-22 at 09:27 -0800, Mark Dilger wrote:
> Another option is to execute under the intersection of their
> privileges, where both the definer and the invoker need the
> privileges in order for the action to succeed.  That would be more
> permissive than the proposed SECURITY NONE, while still preventing
> either party from hijacking privileges of the other.

Interesting idea, I haven't heard of something like that being done
before. Is there some precedent for that or a use case where it's
helpful?

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Mark Dilger
Date:

> On Feb 22, 2023, at 10:49 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Wed, 2023-02-22 at 09:27 -0800, Mark Dilger wrote:
>> Another option is to execute under the intersection of their
>> privileges, where both the definer and the invoker need the
>> privileges in order for the action to succeed.  That would be more
>> permissive than the proposed SECURITY NONE, while still preventing
>> either party from hijacking privileges of the other.
>
> Interesting idea, I haven't heard of something like that being done
> before. Is there some precedent for that or a use case where it's
> helpful?

No current use case comes to mind, but I proposed it for event triggers
one or two development cycles ago, to allow for non-superuser event
trigger owners.  The problems associated with allowing non-superusers to
create and own event triggers were pretty similar to the problems being
discussed in this thread.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company






Re: Non-superuser subscription owners

From
Joe Conway
Date:
On 2/22/23 14:12, Mark Dilger wrote:
>> On Feb 22, 2023, at 10:49 AM, Jeff Davis <pgsql@j-davis.com> wrote:
>> On Wed, 2023-02-22 at 09:27 -0800, Mark Dilger wrote:
>>> Another option is to execute under the intersection of their
>>> privileges, where both the definer and the invoker need the
>>> privileges in order for the action to succeed.  That would be more
>>> permissive than the proposed SECURITY NONE, while still preventing
>>> either party from hijacking privileges of the other.
>> 
>> Interesting idea, I haven't heard of something like that being done
>> before. Is there some precedent for that or a use case where it's
>> helpful?
>
> No current use case comes to mind, but I proposed it for event
> triggers one or two development cycles ago, to allow for
> non-superuser event trigger owners.  The problems associated with
> allowing non-superusers to create and own event triggers were pretty
> similar to the problems being discussed in this thread.


The intersection of privileges is used, for example, in multi-level 
security contexts where the intersection of the network-allowed levels 
and the subject allowed levels is used to bracket what can be accessed 
and how.

Other examples I found with a quick search:


https://docs.oracle.com/javase/8/docs/api/java/security/AccessController.html#doPrivileged-java.security.PrivilegedAction-java.security.AccessControlContext-


https://learn.microsoft.com/en-us/dotnet/api/system.security.permissions.dataprotectionpermission.intersect?view=dotnet-plat-ext-7.0


-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Feb 22, 2023 at 12:18 PM Jeff Davis <pgsql@j-davis.com> wrote:
> There are two questions:
>
> 1. Is the security situation with logical replication bad? Yes. You
> nicely summarized just how bad.
>
> 2. Is it the same situation as accessing a table owned by a user you
> don't absolutely trust?
>
> Regardless of how the second question is answered, it won't diminish
> your point that logical replication is in a bad state. If another
> situation is also bad, we should fix that too.

Well said.

> So I don't think "shouldn't" is quite good enough. In the first place,
> the user needs to know that the risk exists. Second, what if they
> actually do want to access a table owned by someone else for whatever
> reason -- how do they do that safely?

Good question. I don't think we currently have a good answer.

> I can't resist mentioning that these are all SECURITY INVOKER problems.
> SECURITY INVOKER is insecure unless the invoker absolutely trusts the
> definer, and that only really makes sense if the definer is a superuser
> (or something very close). That's why we keep adding exceptions with
> SECURITY_RESTRICTED_OPERATION, which is really just a way to silently
> ignore the SECURITY INVOKER label and use SECURITY DEFINER instead.

That's an interesting way to look at it. I think there are perhaps two
different possible perspectives here. One possibility is to take the
view that you've adopted here, and blame it on SECURITY INVOKER. The
other possibility, at least as I see it, is to blame it on the fact
that we have so many places to attach executable code to tables and
very few ways for people using those tables to limit their exposure to
such code. Suppose Alice owns a table and attaches a trigger to it. If
Bob inserts into that table, I think we have to run the trigger,
because Alice is entitled to assume that, for example, any BEFORE
triggers she might have defined that block certain kinds of inserts
are actually going to block those inserts; any constraints that she
has applied to the table are going to be enforced against all new
rows; and any default expressions she supplies are actually going to
work. I think Bob has to be OK with those things too; otherwise, he
just shouldn't insert anything into the table.

But Bob doesn't have to be OK with Alice's code changing the session
state, or executing DML or DDL with his permissions. I wonder if
that's where we should be trying to insert restrictions here. Right
now, we think of SECURITY_RESTRICTED_OPERATION as a way to prevent a
function or procedure that runs under a different user ID than the
session user from poisoning the session state. But I'm thinking that
maybe the problem isn't really with code running under a different
user ID. It's with running code *provided by* a different user ID.
Maybe we should stop thinking about the security context as something
that you set when you switch to running as a different user ID, and
start thinking about it as something that needs to be set based on the
relationship between the user that provided the code and the session
user. If they're not the same, some restrictions are probably
appropriate, except I think in the case where the user who provided
the code can become the session user anyway.
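
Stated as a predicate, and only as a sketch of the idea rather than a
proposal for actual code: can_become is a hypothetical callback
standing in for whatever role-membership test would really be used.

```python
from typing import Callable


def needs_restrictions(
    code_provider: str,
    session_user: str,
    can_become: Callable[[str, str], bool],
) -> bool:
    """Decide from who supplied the code, not from which user ID it
    runs under, whether a restricted security context is warranted."""
    if code_provider == session_user:
        return False  # running your own code: nothing to protect against
    # The provider could act as the session user anyway, so restricting
    # the code would not take away anything they do not already have.
    return not can_become(code_provider, session_user)
```

In the Alice and Bob example, Alice's trigger would run restricted
when Bob inserts, unless Alice is able to become Bob.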

> Another option is having some kind of SECURITY NONE that would run the
> code as a very limited-privilege user that can basically only access
> the catalog. That would be useful for running default expressions and
> the like without the definer or invoker needing to be careful.

This might be possible, but I have some doubts about how difficult it
would be to get all the details right. We'd need to make sure that
this limited-privilege user couldn't ever create a table, or own one,
or be granted any privileges to do anything other than the minimal set
of things it's supposed to be able to do, or poison the session state,
etc. And it would have weird results like current_user returning the
name of the limited-privilege user rather than any of the users
involved in the operation. Maybe that's all OK, but I find it more
appealing to try to think about what kinds of operations can be
performed in what contexts than to invent entirely new users.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2023-02-27 at 10:45 -0500, Robert Haas wrote:

>  Suppose Alice owns a table and attaches a trigger to it. If
> Bob inserts into that table, I think we have to run the trigger,
> because Alice is entitled to assume that, for example, any BEFORE
> triggers she might have defined that block certain kinds of inserts
> are actually going to block those inserts; any constraints that she
> has applied to the table are going to be enforced against all new
> rows; and any default expressions she supplies are actually going to
> work.

True, but I still find myself suspending my disbelief. Which of these
use cases make sense for SECURITY INVOKER?

> I think Bob has to be OK with those things too; otherwise, he
> just shouldn't insert anything into the table.

Right, but why should Bob's privileges be needed to do any of those
things? Any difference in privileges, for those use cases, could only
either get in the way of achieving Alice's goals, or cause a security
problem for Bob.

> But Bob doesn't have to be OK with Alice's code changing the session
> state, or executing DML or DDL with his permissions.

What's left? Should Bob be OK with Alice's code using his permissions
for anything?

>  I wonder if
> that's where we should be trying to insert restrictions here. Right
> now, we think of SECURITY_RESTRICTED_OPERATION as a way to prevent a
> function or procedure that runs under a different user ID than the
> session user from poisoning the session state. But I'm thinking that
> maybe the problem isn't really with code running under a different
> user ID. It's with running code *provided by* a different user ID.
> Maybe we should stop thinking about the security context as something
> that you set when you switch to running as a different user ID, and
> start thinking about it as something that needs to be set based on
> the
> relationship between the user that provided the code and the session
> user. If they're not the same, some restrictions are probably
> appropriate, except I think in the case where the user who provided
> the code can become the session user anyway.

I think you are saying that we should still run Alice's code with the
privileges of Bob, but somehow make that safe(r) for Bob. Is that
right?

That sounds hard, and I'm still stuck at the "why" question. Why do we
want to run Alice's code with Bob's permissions?

The answers I have so far are abstract. For instance, maybe it's a
clever SRF that takes table names as inputs and you want people to only
be able to use the clever SRF with tables they have privileges on. But
that's not what most functions do, and it's certainly not what most
default expressions do.

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Stephen Frost
Date:
Greetings,

* Jeff Davis (pgsql@j-davis.com) wrote:
> On Mon, 2023-02-27 at 10:45 -0500, Robert Haas wrote:
> >  Suppose Alice owns a table and attaches a trigger to it. If
> > Bob inserts into that table, I think we have to run the trigger,
> > because Alice is entitled to assume that, for example, any BEFORE
> > triggers she might have defined that block certain kinds of inserts
> > are actually going to block those inserts; any constraints that she
> > has applied to the table are going to be enforced against all new
> > rows; and any default expressions she supplies are actually going to
> > work.
>
> True, but I still find myself suspending my disbelief. Which of these
> use cases make sense for SECURITY INVOKER?

I do think there are some use-cases for it, but agree that it'd be
better to encourage more use of SECURITY DEFINER and one approach to
that might be to have a way for users to explicitly say "don't run code
that isn't mine or a superuser's with my privileges."  Of course, we
need to make sure it's possible to write safe SECURITY DEFINER functions
and to be clear about how to do that to avoid the risk in the other
direction.  We also need to provide some additional functions along the
lines of "calling_role()" or similar (so that the function can know who
the actual role is that's running the trigger) for the common case of
auditing or needing to know the calling role for RLS or similar.
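
To illustrate the gap that something like "calling_role()" would fill,
here is a minimal sketch of an audit trigger written as SECURITY DEFINER.
The table orders is assumed to exist already; audit_log, log_insert and
orders_audit are hypothetical names, and everything is assumed to be
owned by alice. Note that "calling_role()" itself is only a proposal in
this thread, so the sketch uses the functions that exist today.

```sql
-- Hypothetical audit table, assumed to live in schema public.
CREATE TABLE audit_log (
    logged_at    timestamptz NOT NULL DEFAULT now(),
    table_name   name NOT NULL,
    current_usr  name NOT NULL,
    session_usr  name NOT NULL
);

CREATE FUNCTION log_insert() RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = pg_catalog, pg_temp
AS $$
BEGIN
    -- Inside a SECURITY DEFINER function, current_user is the function
    -- owner (alice); session_user is still the role that logged in.
    INSERT INTO public.audit_log (table_name, current_usr, session_usr)
    VALUES (TG_TABLE_NAME, current_user, session_user);
    RETURN NEW;
END;
$$;

-- Assumes a table named orders already exists.
CREATE TRIGGER orders_audit
    AFTER INSERT ON orders
    FOR EACH ROW EXECUTE FUNCTION log_insert();
```

If bob logs in, runs SET ROLE to some other role, and then inserts, the
audit row shows alice and bob; the role that was actually in effect for
the INSERT is not recorded, which is the information a calling-role
function would have to supply.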

I don't think we'd be able to get away with just getting rid of SECURITY
INVOKER entirely or even in changing the current way triggers (or
functions in views, etc) are run by default.

> > I think Bob has to be OK with those things too; otherwise, he
> > just shouldn't insert anything into the table.
>
> Right, but why should Bob's privileges be needed to do any of those
> things? Any difference in privileges, for those use cases, could only
> either get in the way of achieving Alice's goals, or cause a security
> problem for Bob.
>
> > But Bob doesn't have to be OK with Alice's code changing the session
> > state, or executing DML or DDL with his permissions.
>
> What's left? Should Bob be OK with Alice's code using his permissions
> for anything?

I don't know about trying to define that X things are ok and Y things
are not; that seems like it would be more confusing and difficult to
work with.  Regular SELECT queries that pull data that Bob has access to
but Alice doesn't are a security issue too, were Alice to install a
function that Bob calls which writes that data into a place where Alice
could then access it.  Perhaps if we could allow Bob to say "these
things are ok for Alice's code to access" then it could work ... but if
that's what is going on then the code could run with Alice's permissions
and Bob could use our nice and granular GRANT/RLS system to say what
Alice is allowed to access.

> >  I wonder if
> > that's where we should be trying to insert restrictions here. Right
> > now, we think of SECURITY_RESTRICTED_OPERATION as a way to prevent a
> > function or procedure that runs under a different user ID than the
> > session user from poisoning the session state. But I'm thinking that
> > maybe the problem isn't really with code running under a different
> > user ID. It's with running code *provided by* a different user ID.
> > Maybe we should stop thinking about the security context as something
> > that you set when you switch to running as a different user ID, and
> > start thinking about it as something that needs to be set based on
> > the relationship between the user that provided the code and the session
> > user. If they're not the same, some restrictions are probably
> > appropriate, except I think in the case where the user who provided
> > the code can become the session user anyway.
>
> I think you are saying that we should still run Alice's code with the
> privileges of Bob, but somehow make that safe(r) for Bob. Is that
> right?
>
> That sounds hard, and I'm still stuck at the "why" question. Why do we
> want to run Alice's code with Bob's permissions?
>
> The answers I have so far are abstract. For instance, maybe it's a
> clever SRF that takes table names as inputs and you want people to only
> be able to use the clever SRF with tables they have privileges on. But
> that's not what most functions do, and it's certainly not what most
> default expressions do.

current_role / current_user are certainly common as a default
expression.  I agree that that's more of an edge case that would be nice
to solve in a different way though.  I do think there's some other use
cases for SECURITY INVOKER but not enough folks understand the security
risk associated with it and it'd be good for us to improve on that
situation.

Thanks,

Stephen


Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Feb 27, 2023 at 1:25 PM Jeff Davis <pgsql@j-davis.com> wrote:
> I think you are saying that we should still run Alice's code with the
> privileges of Bob, but somehow make that safe(r) for Bob. Is that
> right?

Yeah. That's the idea I was floating, at least.

> That sounds hard, and I'm still stuck at the "why" question. Why do we
> want to run Alice's code with Bob's permissions?
>
> The answers I have so far are abstract. For instance, maybe it's a
> clever SRF that takes table names as inputs and you want people to only
> be able to use the clever SRF with tables they have privileges on. But
> that's not what most functions do, and it's certainly not what most
> default expressions do.

I guess I have a pretty hard time imagining that we can just
obliterate SECURITY INVOKER entirely. It seems fundamentally
reasonable to me that Alice might want to make some code available to
be executed in the form of a function or procedure but without
offering to execute it with her own privileges. But I think maybe
you're asking a different question, which is whether when the code is
attached to a table we ought to categorically switch to the table
owner before executing it. I'm less sure about the answer to that
question. We already take the position that VACUUM always runs as the
table owner, and while VACUUM runs index expressions but not, for
example, triggers, why not just be consistent and run all code that is
tied to the table as the table owner, all the time?

Maybe that's the right thing to do, but I think it would inevitably
break some things for some users. Alice might choose to write her
triggers or default expressions in ways that rely on them running with
Bob's permissions in any number of ways. For instance, maybe those
functions issue a SELECT query against an RLS-enabled table, such that
the answer depends on whose privileges are used to run the query. More
simply, she might refer to CURRENT_ROLE, say to record who inserted
any particular row into her table, which seems like a totally
reasonable thing to want to do. If she was feeling really clever, she
might even have designed queries that she's using inside those
triggers or default expressions to fail if Bob doesn't have enough
permissions to do some particular modification that he's attempting,
and thus block certain kinds of access to her own tables. That would
be pretty weird and perhaps too clever by half, but the point is that
the current behavior is probably known to many, many users and we
really can't know what they've done that depends on that. If we change
any behavior here, some people are going to notice those changes, and
they may not like them.
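
As a concrete instance of the CURRENT_ROLE case, a minimal sketch
follows. The table orders, its columns, and the roles alice (owner) and
bob are hypothetical names chosen for illustration.

```sql
-- Hypothetical table owned by alice.
CREATE TABLE orders (
    id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    item        text NOT NULL,
    -- Today the default is evaluated as whoever runs the INSERT, so
    -- this records bob when bob inserts.
    inserted_by name NOT NULL DEFAULT current_role
);

-- bob may only supply the item; the other columns take their defaults.
GRANT INSERT (item) ON orders TO bob;

-- Run by bob:
INSERT INTO orders (item) VALUES ('widget');
```

If default expressions were instead evaluated as the table owner,
inserted_by would presumably record alice for every row, which is the
kind of silent behavior change being discussed here.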

To put that another way, we're not talking about a black-and-white
security vulnerability here, like a buffer overrun that allows for
arbitrary code execution. We're talking about a set of semantics that
seem to be somewhat fragile and vulnerable to spawning security
problems. Nobody wants those security problems, for sure. But it
doesn't follow that nobody is relying on the semantics.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2023-02-27 at 14:10 -0500, Stephen Frost wrote:
> I do think there are some use-cases for it, but agree that it'd be
> better to encourage more use of SECURITY DEFINER and one approach to
> that might be to have a way for users to explicitly say "don't run
> code that isn't mine or a superuser's with my privileges."

I tried that:

https://www.postgresql.org/message-id/75b0dbb55e9febea54c441efff8012a6d2cb5bd7.camel@j-davis.com

but Andres pointed out some problems with my implementation. They
didn't seem easily fixable, but perhaps with more effort it could work
(run all the expressions as security definer, as well?).

> Of course, we need to make sure it's possible to write safe SECURITY
> DEFINER functions and to be clear about how to do that to avoid the
> risk in the other direction.

Agreed. Perhaps we can force search_path to be set for SECURITY
DEFINER, and require that the temp schema be explicitly included rather
than the current "must be at the end". We could also provide a way to
turn public access off in the same statement, so that you don't need to
use a transaction block to keep the function private.
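
For reference, the pattern that works today looks roughly like the
sketch below: pin search_path with pg_temp last, and wrap the CREATE and
REVOKE in one transaction. The schema admin, the table admin.accounts
(assumed to exist with a name column), the role app_users, and the
function name are all made up for illustration.

```sql
BEGIN;

CREATE FUNCTION admin.account_exists(p_name text)
RETURNS boolean
LANGUAGE sql
SECURITY DEFINER
-- Pin the lookup path; pg_temp goes last so that temporary objects
-- cannot shadow anything in the trusted schema.
SET search_path = admin, pg_temp
AS $$
    SELECT EXISTS (SELECT 1 FROM accounts WHERE name = p_name);
$$;

-- EXECUTE is granted to PUBLIC by default, so revoke it before the
-- function becomes visible to other sessions.
REVOKE ALL ON FUNCTION admin.account_exists(text) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION admin.account_exists(text) TO app_users;

COMMIT;
```

The suggestions above would amount to making the SET search_path clause
mandatory and folding the REVOKE into the CREATE FUNCTION statement, so
that the transaction block is no longer needed.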

> I don't think we'd be able to get away with just getting rid of
> SECURITY INVOKER entirely or even in changing the current way triggers
> (or functions in views, etc) are run by default.

I didn't propose anything radical. I'm just trying to get some
agreement that SECURITY INVOKER is central to a lot of our security
woes, and that we should be treating it with skepticism on a
fundamental level.

Individual proposals for how to get away from SECURITY INVOKER should
be evaluated on their merits (i.e. don't break a bunch of stuff).

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2023-02-27 at 16:13 -0500, Robert Haas wrote:
> On Mon, Feb 27, 2023 at 1:25 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > I think you are saying that we should still run Alice's code with
> > the privileges of Bob, but somehow make that safe(r) for Bob. Is
> > that right?
>
> Yeah. That's the idea I was floating, at least.

Isn't that a hard problem; maybe impossible?

>
> I guess I have a pretty hard time imagining that we can just
> obliterate SECURITY INVOKER entirely.

Of course not.

>  It seems fundamentally
> reasonable to me that Alice might want to make some code available to
> be executed in the form of a function or procedure but without
> offering to execute it with her own privileges.

It also seems fundamentally reasonable that if someone grants you
privileges on one of their tables, it might be safe to access it.

I'm sure there are a few use cases for SECURITY INVOKER, but they are
quite narrow.

Perhaps most frustratingly, even if none of the users on a system has
any use case for SECURITY INVOKER, they still must all live in fear of
accessing each others' tables, because at any time a SECURITY INVOKER
function could be attached to one of the tables.

I feel like we are giving up mainstream utility and safety in exchange
for contrived or exceptional cases. That's not a good trade.

> We already take the position that VACUUM always runs as the
> table owner, and while VACUUM runs index expressions but not for
> example triggers, why not just be consistent and run all code that is
> tied to the table as the table owner, all the time?

I'd also extend this to default expressions and other code that can be
executed implicitly.

> Maybe that's the right thing to do

If it's the right place to go, then I think we should consider
reasonable steps to take in that direction that don't cause unnecessary
breakage.

> , but I think it would inevitably
> break some things for some users.

Not all steps would be breaking changes, and a lot of those steps are
things we should do anyway. We could make it easier to write safe
SECURITY DEFINER functions, provide more tools for users to opt-out of
executing SECURITY INVOKER code, provide a way for superusers to safely
drop privileges, document the problems with security invoker and what
to do about them, etc.

> Alice might choose to write her
> triggers or default expressions in ways that rely on them running with
> Bob's permissions in any number of ways.

Sure, breakage is possible, and we should mitigate it.

But we also shouldn't exaggerate it -- for instance, others have
proposed that we run code as the table owner for logical subscriptions,
and that's going to break things in the same way. Arguably, if we are
going to break something, it's better to break it consistently rather
than one subsystem at a time.

Back to the $SUBJECT, if we allow non-superusers to run subscriptions,
and the subscription runs the code as the table owner, that might also
lead to some weird behavior for triggers that rely on SECURITY INVOKER
semantics.

Regards,
    Jeff Davis





Re: Non-superuser subscription owners

From
Stephen Frost
Date:
Greetings,

* Jeff Davis (pgsql@j-davis.com) wrote:
> On Mon, 2023-02-27 at 14:10 -0500, Stephen Frost wrote:
> > I do think there are some use-cases for it, but agree that it'd be
> > better to encourage more use of SECURITY DEFINER and one approach to
> > that might be to have a way for users to explicitly say "don't run
> > code that isn't mine or a superuser's with my privileges."
>
> I tried that:
>
> https://www.postgresql.org/message-id/75b0dbb55e9febea54c441efff8012a6d2cb5bd7.camel@j-davis.com
>
> but Andres pointed out some problems with my implementation. They
> didn't seem easily fixable, but perhaps with more effort it could work
> (run all the expressions as security definer, as well?).

Presumably.  Ultimately, I tend to agree it won't be easy.  That doesn't
mean it's not a worthwhile effort.

> > Of course, we need to make sure it's possible to write safe
> > SECURITY DEFINER functions and to be clear about how to do that to
> > avoid the risk in the other direction.
>
> Agreed. Perhaps we can force search_path to be set for SECURITY
> DEFINER, and require that the temp schema be explicitly included rather
> than the current "must be at the end". We could also provide a way to
> turn public access off in the same statement, so that you don't need to
> use a transaction block to keep the function private.

We do pretty strongly encourage a search_path setting for SECURITY
DEFINER today..  That said, I'm not against pushing on that harder.  The
issue about temporary schemas is a more difficult issue... but frankly,
I'd like an option to say "no temporary schemas should be allowed in my
search path" when it comes to a security definer function.

> > I don't think we'd be able to get away with just getting rid of
> > SECURITY INVOKER entirely or even in changing the current way
> > triggers (or functions in views, etc) are run by default.
>
> I didn't propose anything radical. I'm just trying to get some
> agreement that SECURITY INVOKER is central to a lot of our security
> woes, and that we should be treating it with skepticism on a
> fundamental level.

Sure, but if we want to make progress then we have to provide a
direction for folks to go in that's both secure and convenient.

> Individual proposals for how to get away from SECURITY INVOKER should
> be evaluated on their merits (i.e. don't break a bunch of stuff).

Of course.  That said ... we don't want to spend a lot of time
going in a direction that won't bear fruit; I'm hopeful that this
direction will though.

Thanks,

Stephen


Re: Non-superuser subscription owners

From
Stephen Frost
Date:
Greetings,

* Jeff Davis (pgsql@j-davis.com) wrote:
> Not all steps would be breaking changes, and a lot of those steps are
> things we should do anyway. We could make it easier to write safe
> SECURITY DEFINER functions, provide more tools for users to opt-out of
> executing SECURITY INVOKER code, provide a way for superusers to safely
> drop privileges, document the problems with security invoker and what
> to do about them, etc.

Agreed.

> But we also shouldn't exaggerate it -- for instance, others have
> proposed that we run code as the table owner for logical subscriptions,
> and that's going to break things in the same way. Arguably, if we are
> going to break something, it's better to break it consistently rather
> than one subsystem at a time.

I tend to agree with this.

> Back to the $SUBJECT, if we allow non-superusers to run subscriptions,
> and the subscription runs the code as the table owner, that might also
> lead to some weird behavior for triggers that rely on SECURITY INVOKER
> semantics.

Indeed.

Thanks,

Stephen


Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Mon, Feb 27, 2023 at 7:37 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > Yeah. That's the idea I was floating, at least.
>
> Isn't that a hard problem; maybe impossible?

It doesn't seem that hard to me; maybe I'm missing something.

The existing SECURITY_RESTRICTED_OPERATION flag basically prevents you
from tinkering with the session state. If we also had similar flags
like DATABASE_READS_PROHIBITED and DATABASE_WRITES_PROHIBITED (or just
a combined DATABASE_ACCESS_PROHIBITED flag) I think that would be
pretty close to what we need. The idea would be that, when a user
executes a function or procedure owned by a user that they don't trust
completely, we'd set
SECURITY_RESTRICTED_OPERATION|DATABASE_READS_PROHIBITED|DATABASE_WRITES_PROHIBITED.
And we could provide a user with a way to express the degree of trust
they have in some other user or perhaps even some specific function,
e.g.

SET trusted_roles='alice:read';

...could mean that I trust alice to read from the database with my
permissions, should I happen to run code provided by her in SECURITY
INVOKER mode.

I'm sure there's some details to sort out here, e.g. around security
related to the trusted_roles GUC itself. But I don't really see a
fundamental problem. We can invent arbitrary flags that prohibit
classes of operations that are of concern, set them by default in
cases where concern is justified, and then give users who want the
current behavior some kind of escape hatch that causes those flags to
not get set after all. Not only does such a solution not seem
impossible, I can possibly even imagine back-patching it, depending on
exactly what the shape of the final solution is, how important we
think it is to get a fix out there, and how brave I'm feeling that
day.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-02-22 09:18:34 -0800, Jeff Davis wrote:
> I can't resist mentioning that these are all SECURITY INVOKER problems.
> SECURITY INVOKER is insecure unless the invoker absolutely trusts the
> definer, and that only really makes sense if the definer is a superuser
> (or something very close). That's why we keep adding exceptions with
> SECURITY_RESTRICTED_OPERATION, which is really just a way to silently
> ignore the SECURITY INVOKER label and use SECURITY DEFINER instead.
> 
> At some point we need to ask: "when is SECURITY INVOKER both safe and
> useful?" and contain it to those cases, rather than silently ignoring
> it in an expanding list of cases.

I can only repeat myself in stating that SECURITY DEFINER solves none of the
relevant issues. I included several examples of why it doesn't in the recent
thread about "blocking SECURITY INVOKER". E.g. that default arguments of
SECDEF functions are evaluated with the current user's privileges, not the
function owner's privs:

https://postgr.es/m/20230113032943.iyxdu7bnxe4cmbld%40awork3.anarazel.de
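
A small sketch of the default-argument case, assuming a hypothetical
function who_am_i owned by alice and called by bob (EXECUTE is granted
to PUBLIC by default):

```sql
-- Hypothetical function owned by alice.
CREATE FUNCTION who_am_i(caller text DEFAULT current_user::text)
RETURNS text
LANGUAGE sql
SECURITY DEFINER
SET search_path = pg_catalog, pg_temp
AS $$
    -- The body runs as the function owner.
    SELECT 'default saw ' || caller
           || ', body runs as ' || current_user::text;
$$;

-- Run by bob. The default expression is part of the call, not of the
-- function body, so it is evaluated before the switch to alice.
SELECT who_am_i();
```

Per the behavior described above, the default should see bob while the
body sees alice. Replace current_user in the default with a call to an
arbitrary function and the owner of a SECURITY DEFINER function can
still get code executed with the caller's privileges.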

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Tue, 2023-02-28 at 11:28 -0800, Andres Freund wrote:
> I can only repeat myself in stating that SECURITY DEFINER solves none
> of the relevant issues. I included several examples of why it doesn't
> in the recent thread about "blocking SECURITY INVOKER". E.g. that
> default arguments of SECDEF functions are evaluated with the current
> user's privileges, not the function owner's privs:
>
> https://postgr.es/m/20230113032943.iyxdu7bnxe4cmbld%40awork3.anarazel.de

I was speaking a bit loosely, using "SECURITY DEFINER" to mean the
semantics of executing code as the one who wrote it. I didn't
specifically mean the function marker, because as you pointed out in
the other thread, that's not enough.

From your email it looks like there is still a path forward:

"The proposal to not trust any expressions controlled by untrusted
users at least allows to prevent execution of code, even if it doesn't
provide a way to execute the code in a safe manner.  Given that we
don't have the former, it seems foolish to shoot for the latter."

And later:

"I think the combination of
a) a setting that restricts evaluation of any non-trusted expressions,
   independent of the origin
b) an easy way to execute arbitrary statements within
   SECURITY_RESTRICTED_OPERATION"

My takeaway from that thread was that we need a mechanism to deal with
non-function code (e.g. default expressions) first; but once we have
that, it opens up the design space to better solutions or at least
mitigations. Is that right?

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Tue, 2023-02-28 at 08:37 -0500, Robert Haas wrote:
> The existing SECURITY_RESTRICTED_OPERATION flag basically prevents you
> from tinkering with the session state.

Currently, every time we set that flag we also run all the code as the
table owner.

You're suggesting using the SECURITY_RESTRICTED_OPERATION flag, along
with the new security flags, but not switch to the table owner, right?

>  If we also had similar flags
> like DATABASE_READS_PROHIBITED and DATABASE_WRITES_PROHIBITED (or just
> a combined DATABASE_ACCESS_PROHIBITED flag) I think that would be
> pretty close to what we need. The idea would be that, when a user
> executes a function or procedure 

Or default expressions, I presume. If we at least agree on this point,
then I think we should try to find a way to treat these other hunks of
code in a secure way (which I think is what Andres was suggesting).

> owned by a user that they don't trust
> completely, we'd set
> SECURITY_RESTRICTED_OPERATION|DATABASE_READS_PROHIBITED|DATABASE_WRITES_PROHIBITED.

It seems like you're saying to basically just keep the user ID the
same, and maybe keep USAGE privileges, but not be able to do anything
else? Might be useful. Kind of like running it as a nobody user but
without the problems you mentioned. Some details to think about, I'm
sure.

> And we could provide a user with a way to express the degree of trust
> they have in some other user or perhaps even some specific function,
> e.g.
>
> SET trusted_roles='alice:read';
>
> ...could mean that I trust alice to read from the database with my
> permissions, should I happen to run code provided by her in SECURITY
> INVOKER mode.

I'm not very excited about inventing a new privilege language inside a
GUC, but perhaps a simpler form could be a reasonable mitigation (or at
least a starting place).

> I'm sure there's some details to sort out here, e.g. around security
> related to the trusted_roles GUC itself. But I don't really see a
> fundamental problem. We can invent arbitrary flags that prohibit
> classes of operations that are of concern, set them by default in
> cases where concern is justified, and then give users who want the
> current behavior some kind of escape hatch that causes those flags to
> not get set after all. Not only does such a solution not seem
> impossible, I can possibly even imagine back-patching it, depending on
> exactly what the shape of the final solution is, how important we
> think it is to get a fix out there, and how brave I'm feeling that
> day.

Unless the trusted roles defaults to '*', then I think it will still
break some things.


One of my key tests for user-facing proposals is whether the
documentation will be reasonable or not. Most of these proposals to
make SECURITY INVOKER less bad fail that test.

Each of these ideas and sub-ideas affects the semantics, and should be
documented. But how do we document that some code runs as you, some as
the person who wrote it, sometimes we obey SECURITY INVOKER and
sometimes we ignore it and use DEFINER semantics, some code is outside
a function and always executes as the invoker, some code has some
security flags, and some code has more security flags, code can change
between the time you look at it and the time it runs, and it's all
filtered through GUCs with their own privilege sub-language?

OK, let's assume that we have all of that documented, then how do we
guide users on what reasonable best practices are for the GUC settings,
etc.? Or do we just say "this is mechanically how all these parts work,
good luck assembling it into a secure system!". [ Note: I feel like
this is the state we are in now. Even if technically we don't have live
security bugs that I'm aware of, we are setting users up for security
problems. ]

On the other hand, if we focus on executing code as the user who wrote
it in most places, then the documentation will be something like: "you
defined the table, you wrote the code, it runs as you, here are some
best practices for writing secure code". And we have some different
documentation for writing a cool SECURITY INVOKER function and how to
get other users to trust you enough to run it. That sounds a LOT more
understandable for users.

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Stephen Frost
Date:
Greetings,

* Jeff Davis (pgsql@j-davis.com) wrote:
> On Tue, 2023-02-28 at 08:37 -0500, Robert Haas wrote:
> > The existing SECURITY_RESTRICTED_OPERATION flag basically prevents
> > you from tinkering with the session state.
>
> Currently, every time we set that flag we also run all the code as the
> table owner.
>
> You're suggesting using the SECURITY_RESTRICTED_OPERATION flag, along
> with the new security flags, but not switch to the table owner, right?

I'm having trouble following this too, I have to admit.  If we aren't
changing who we're running the code under.. but making it so that the
code isn't actually able to do anything then that doesn't strike me as
likely to actually be useful?  Surely things like triggers that update
another table, or insert into another table what happened on the table
with the trigger, need to be allowed to modify the database. How do we
make that possible while the code runs as the invoker and not the table
owner, when the table owner is the one who gets to write the code?

> >  If we also had similar flags
> > like DATABASE_READS_PROHIBITED and DATABASE_WRITES_PROHIBITED (or
> > just a combined DATABASE_ACCESS_PROHIBITED flag) I think that would be
> > pretty close to what we need. The idea would be that, when a user
> > executes a function or procedure 
>
> Or default expressions, I presume. If we at least agree on this point,
> then I think we should try to find a way to treat these other hunks of
> code in a secure way (which I think is what Andres was suggesting).

Would need to apply to functions in views and functions in RLS too,
along with default expressions and everything else that could be defined
by one person and run by another.

> > owned by a user that they don't trust
> > completely, we'd set
> > SECURITY_RESTRICTED_OPERATION|DATABASE_READS_PROHIBITED|DATABASE_WRITES_PROHIBITED.
>
> It seems like you're saying to basically just keep the user ID the
> same, and maybe keep USAGE privileges, but not be able to do anything
> else? Might be useful. Kind of like running it as a nobody user but
> without the problems you mentioned. Some details to think about, I'm
> sure.

While there's certainly some use-cases where a completely unprivileged
user would work, there's certainly an awful lot where it wouldn't.
Having that as an option might be interesting for those much more
limited use-cases and maybe you could even say "only run functions which
are owned by a superuser or X roles" but it's certainly not a general
solution to the problem.

> > And we could provide a user with a way to express the degree of trust
> > they have in some other user or perhaps even some specific function,
> > e.g.
> >
> > SET trusted_roles='alice:read';
> >
> > ...could mean that I trust alice to read from the database with my
> > permissions, should I happen to run code provided by her in SECURITY
> > INVOKER mode.
>
> I'm not very excited about inventing a new privilege language inside a
> GUC, but perhaps a simpler form could be a reasonable mitigation (or at
> least a starting place).

I'm pretty far down the path of "wow that looks really difficult to work
with", to put it nicely.

> > I'm sure there's some details to sort out here, e.g. around security
> > related to the trusted_roles GUC itself. But I don't really see a
> > fundamental problem. We can invent arbitrary flags that prohibit
> > classes of operations that are of concern, set them by default in
> > cases where concern is justified, and then give users who want the
> > current behavior some kind of escape hatch that causes those flags to
> > not get set after all. Not only does such a solution not seem
> > impossible, I can possibly even imagine back-patching it, depending
> > on exactly what the shape of the final solution is, how important we
> > think it is to get a fix out there, and how brave I'm feeling that
> > day.
>
> Unless the trusted roles defaults to '*', then I think it will still
> break some things.

Defaulting to an option that is "don't break anything" while giving
users flexibility to test out other, more secure, options seems like it
would be a pretty reasonable way forward, generally.  That said.. I
don't really think this particular approach ends up being a good
direction to go in...

> One of my key tests for user-facing proposals is whether the
> documentation will be reasonable or not. Most of these proposals to
> make SECURITY INVOKER less bad fail that test.

and this is certainly a very good point as to why.

Thanks,

Stephen


Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Tue, Feb 28, 2023 at 4:01 PM Jeff Davis <pgsql@j-davis.com> wrote:
> You're suggesting using the SECURITY_RESTRICTED_OPERATION flag, along
> with the new security flags, but not switch to the table owner, right?

Correct.

> Or default expressions, I presume. If we at least agree on this point,
> then I think we should try to find a way to treat these other hunks of
> code in a secure way (which I think is what Andres was suggesting).

Yeah, or any other expressions. Basically impose restrictions when the
user running the code is not the same as the user who provided the
code.

> It seems like you're saying to basically just keep the user ID the
> same, and maybe keep USAGE privileges, but not be able to do anything
> else? Might be useful. Kind of like running it as a nobody user but
> without the problems you mentioned. Some details to think about, I'm
> sure.

Yep.

> I'm not very excited about inventing a new privilege language inside a
> GUC, but perhaps a simpler form could be a reasonable mitigation (or at
> least a starting place).

I'm not very sure about this part, either. I think we need some way of
shutting off whatever new controls we impose, but the shape of it is
unclear to me and I think there are a bunch of problems.

> Unless the trusted roles defaults to '*', then I think it will still
> break some things.

Definitely. IMHO, it's OK to break some things, certainly in a major
release and maybe even in a minor release. But we don't want to break
more things than we really need to break. And as you say, we want the
restrictions to be comprehensible.

> Each of these ideas and sub-ideas affect the semantics, and should be
> documented. But how do we document that some code runs as you, some as
> the person who wrote it, sometimes we obey SECURITY INVOKER and
> sometimes we ignore it and use DEFINER semantics, some code is outside
> a function and always executes as the invoker, some code has some
> security flags, and some code has more security flags, code can change
> between the time you look at it and the time it runs, and it's all
> filtered through GUCs with their own privilege sub-language?
>
> OK, let's assume that we have all of that documented, then how do we
> guide users on what reasonable best practices are for the GUC settings,
> etc.? Or do we just say "this is mechanically how all these parts work,
> good luck assembling it into a secure system!". [ Note: I feel like
> this is the state we are in now. Even if technically we don't have live
> security bugs that I'm aware of, we are setting users up for security
> problems. ]
>
> On the other hand, if we focus on executing code as the user who wrote
> it in most places, then the documentation will be something like: "you
> defined the table, you wrote the code, it runs as you, here are some
> best practices for writing secure code". And we have some different
> documentation for writing a cool SECURITY INVOKER function and how to
> get other users to trust you enough to run it. That sounds a LOT more
> understandable for users.

What I was imagining is that we would document something like: A table
can have executable code associated with it in a variety of ways. For
example, it can have triggers, default expressions, check constraints,
or row-level security filters. In most cases, these expressions are
executed with the privileges of the user performing the operation on
the table, except when SECURITY DEFINER functions are used. Because
these expressions are set by the table owner and executed by the users
accessing the table, there is a risk that the table owner could
include malicious code that usurps the privileges of the user
accessing the table. For this reason, these expressions are, by
default, restricted from doing <things>. If you want to allow those
operations, you can <something>.

I agree that running code as the table owner is helpful in a bunch of
scenarios, but I also don't think it fixes everything. You earlier
mentioned that switching to the table owner seems to be just a way of
turning SECURITY INVOKER into SECURITY DEFINER in random places, or
maybe that's not exactly what you said but that's what I took from it.
And I think that's right. If we just slather user context switches
everywhere, I'm not actually very sure that's going to be
comprehensible behavior: if my trigger function is SECURITY INVOKER,
why is it getting executed as me, not the inserting user? I also think
there are plenty of cases where that could just replace the current
set of security problems with a new set of security problems. If the
trigger function is SECURITY INVOKER, then the user who wrote it
doesn't have to worry about securing it against attacks by users
accessing the table; it's just running with the permissions of the
user performing the DML. Maybe there are correctness issues if you
don't lock down search_path etc., but there's no security compromise
because there's no user ID switching. As soon as you magically turn
that into a SECURITY DEFINER function, you've provided a way for the
users performing DML to attack the table owner.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Stephen Frost
Date:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Tue, Feb 28, 2023 at 4:01 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > Or default expressions, I presume. If we at least agree on this point,
> > then I think we should try to find a way to treat these other hunks of
> > code in a secure way (which I think is what Andres was suggesting).
>
> Yeah, or any other expressions. Basically impose restrictions when the
> user running the code is not the same as the user who provided the
> code.

Would this have carve-outs for things like "except if the user providing
the code is trusted/superuser"?  Seems like that would be necessary for
the function to be able to do more-or-less anything, but then I worry
that there's superuser-owned code which could leak information or be
used by a malicious owner as that code would still be running as the
invoking user..  Perhaps we could say that the function also has to be
leakproof, but that isn't quite the same issue and therefore it seems
like we'd need to decorate all of the functions with another flag that's
allowed to be run in this manner.

Random thought- what if the function has a NOTIFY in it with a payload
of some kind of sensitive information?

> > Unless the trusted roles defaults to '*', then I think it will still
> > break some things.
>
> Definitely. IMHO, it's OK to break some things, certainly in a major
> release and maybe even in a minor release. But we don't want to break
> more things than we really need to break. And as you say, we want the
> restrictions to be comprehensible.

Really hard to say if whatever this is is OK for back-patching and
breaking minor releases without knowing exactly what is getting broken
... but it'd have to be a very clear edge case of what gets broken for
it to be sensible for breaking in a minor release without a very clear
vulnerability or such that's being fixed with a simple work-around.
Just making all auditing triggers break in a minor release certainly
wouldn't be acceptable, as an example that I imagine we all agree with.

> > Each of these ideas and sub-ideas affect the semantics, and should be
> > documented. But how do we document that some code runs as you, some as
> > the person who wrote it, sometimes we obey SECURITY INVOKER and
> > sometimes we ignore it and use DEFINER semantics, some code is outside
> > a function and always executes as the invoker, some code has some
> > security flags, and some code has more security flags, code can change
> > between the time you look at it and the time it runs, and it's all
> > filtered through GUCs with their own privilege sub-language?
> >
> > OK, let's assume that we have all of that documented, then how do we
> > guide users on what reasonable best practices are for the GUC settings,
> > etc.? Or do we just say "this is mechanically how all these parts work,
> > good luck assembling it into a secure system!". [ Note: I feel like
> > this is the state we are in now. Even if technically we don't have live
> > security bugs that I'm aware of, we are setting users up for security
> > problems. ]
> >
> > On the other hand, if we focus on executing code as the user who wrote
> > it in most places, then the documentation will be something like: "you
> > defined the table, you wrote the code, it runs as you, here are some
> > best practices for writing secure code". And we have some different
> > documentation for writing a cool SECURITY INVOKER function and how to
> > get other users to trust you enough to run it. That sounds a LOT more
> > understandable for users.
>
> What I was imagining is that we would document something like: A table
> can have executable code associated with it in a variety of ways. For
> example, it can have triggers, default expressions, check constraints,
> or row-level security filters. In most cases, these expressions are
> executed with the privileges of the user performing the operation on
> the table, except when SECURITY DEFINER functions are used. Because
> these expressions are set by the table owner and executed by the users
> accessing the table, there is a risk that the table owner could
> include malicious code that usurps the privileges of the user
> accessing the table. For this reason, these expressions are, by
> default, restricted from doing <things>. If you want to allow those
> operations, you can <something>.

Well, one possible answer to 'something' might be 'use SECURITY DEFINER
functions which are owned by a role allowed to do <things>'.  Note that
that doesn't have to be the table owner though, it could be a much more
constrained role.  That approach would allow us to leverage the existing
GRANT/RLS/et al system for what's allowed and avoid having to create new
things like a complex permission system inside of a GUC for users to
have to understand.

> I agree that running code as the table owner is helpful in a bunch of
> scenarios, but I also don't think it fixes everything. You earlier
> mentioned that switching to the table owner seems to be just a way of
> turning SECURITY INVOKER into SECURITY DEFINER in random places, or
> maybe that's not exactly what you said but that's what I took from it.
> And I think that's right. If we just slather user context switches
> everywhere, I'm not actually very sure that's going to be
> comprehensible behavior: if my trigger function is SECURITY INVOKER,
> why is it getting executed as me, not the inserting user? I also think
> there are plenty of cases where that could just replace the current
> set of security problems with a new set of security problems. If the
> trigger function is SECURITY INVOKER, then the user who wrote it
> doesn't have to worry about securing it against attacks by users
> accessing the table; it's just running with the permissions of the
> user performing the DML. Maybe there are correctness issues if you
> don't lock down search_path etc., but there's no security compromise
> because there's no user ID switching. As soon as you magically turn
> that into a SECURITY DEFINER function, you've provided a way for the
> users performing DML to attack the table owner.

I agree that we don't want to just turn "SECURITY INVOKER function when
run as a trigger" into SECURITY DEFINER, and that SECURITY DEFINER
functions need to be able to be written in a manner that limits the risk
of them being able to be abused to gain control of the role which owns
the function (the latter being something we've worked on but should
certainly continue to improve on, independently of any of this..).

Along the same general vein of "don't break things", perhaps an approach
would be a GUC that users can enable that says "don't allow code that
does something dangerous (again, need to figure out how to do that..)
when it's written by someone else to run with my privileges (and
therefore isn't a SECURITY DEFINER function)".  The idea here being that
we want to encourage users to enable that, maybe we eventually enable it
by default in a new major version, and push people in the direction of
writing secure SECURITY DEFINER functions for the cases where they
actually need the trigger, or such, to do something beyond whatever we
define as being 'safe'.  This keeps the GUC as a simple on/off or enum
like row_security, fails the action when something not-safe is being
attempted, and gives the flexibility of our existing GRANT/RLS system
for the case where a SECURITY DEFINER function is created to perform the
operation.  This does still need some supporting functions like 'calling
role' or such because there could be many many roles doing an INSERT
into a table which runs a trigger and that trigger runs as some other
role, but the other role could be one that has more privileges than the
INSERT'ing role and therefore it needs to implement additional checks on
the operation to limit what the INSERT'ing role is allowed to do.

I do worry about asking function authors to effectively rewrite these
kinds of permission checks and wonder if there's a way we could make it
easier for them- perhaps a kind of function that's SECURITY DEFINER in
that it runs as the owner of the function, but it sets a flag saying
"only allow things that the function owner is allowed to do AND the
calling user is allowed to do", similar to the 'intersection of
privileges' idea mentioned elsewhere on this thread.
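
To make the "intersection of privileges" idea concrete, here is a minimal
sketch in C. It assumes privileges can be modelled as a simple bitmask; the
names PrivMask, IntersectContext, and intersect_check are hypothetical and
are not existing PostgreSQL APIs or AclMode values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical privilege bits; these are NOT PostgreSQL's AclMode values. */
typedef uint32_t PrivMask;
#define PRIV_SELECT (1u << 0)
#define PRIV_INSERT (1u << 1)
#define PRIV_UPDATE (1u << 2)

/*
 * Hypothetical context for a function that runs as its owner but is also
 * limited by what the calling role is allowed to do.
 */
typedef struct IntersectContext
{
    PrivMask owner_privs;   /* what the function owner may do on the object */
    PrivMask caller_privs;  /* what the calling role may do on the object */
} IntersectContext;

/* Allow an operation only if BOTH roles hold every required privilege. */
bool
intersect_check(const IntersectContext *ctx, PrivMask required)
{
    PrivMask effective;

    assert(ctx != NULL);
    effective = ctx->owner_privs & ctx->caller_privs;
    return (effective & required) == required;
}
```

With an owner holding SELECT and INSERT and a caller holding SELECT and
UPDATE, only SELECT would be permitted under this model.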

Thanks,

Stephen

Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Mar 1, 2023 at 10:48 AM Stephen Frost <sfrost@snowman.net> wrote:
> > Yeah, or any other expressions. Basically impose restrictions when the
> > user running the code is not the same as the user who provided the
> > code.
>
> Would this have carve-outs for things like "except if the user providing
> the code is trusted/superuser"?  Seems like that would be necessary for
> the function to be able to do more-or-less anything, but then I worry
> that there's superuser-owned code which could leak information or be
> used by a malicious owner as that code would still be running as the
> invoking user..  Perhaps we could say that the function also has to be
> leakproof, but that isn't quite the same issue and therefore it seems
> like we'd need to decorate all of the functions with another flag that's
> allowed to be run in this manner.

Yes, I think there can be carve-outs based on the relationship of the
users involved -- if the user who provided the code is the superuser
or some other user who can anyway run whatever they want as the user
performing the operation, then there's no point in imposing any
restrictions -- and I think there can also be some way of setting
policy. I proposed a GUC in an earlier email, and you proposed one
with somewhat different semantics in this email, and I'm not sure that
either of those things in particular is right or that we ought to be
using a GUC for this at all. However, there should almost certainly be
SOME way for the superuser to turn any new restrictions off, and there
should probably also be some way for an unprivileged user to say "you
know, I am totally OK with running any code that alice provides --
just go with it."

I don't think we're at a point where we can conclude on what those
mechanisms should look like just yet, but I think that everyone who
has spoken up agrees that they ought to exist, assuming we go in this
direction at all.
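
As a rough illustration only, the carve-outs described above could be
expressed as a single decision function. Everything in this C sketch is
hypothetical (RoleInfo, must_restrict, the trust list, and the global
on/off switch); it is a toy model of the policy, not a proposed
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int RoleId;    /* stand-in for PostgreSQL's Oid */

/* Hypothetical, simplified view of a role for this sketch. */
typedef struct RoleInfo
{
    RoleId        id;
    bool          is_superuser;
    const RoleId *can_act_as;   /* roles this role could already run code as */
    size_t        n_can_act_as;
    const RoleId *trusted;      /* roles whose code this role opted to trust */
    size_t        n_trusted;
} RoleInfo;

/* Linear search; safe when list is NULL as long as n is 0. */
bool
role_list_contains(const RoleId *list, size_t n, RoleId id)
{
    for (size_t i = 0; i < n; i++)
    {
        if (list[i] == id)
            return true;
    }
    return false;
}

/*
 * Should code supplied by "provider" be restricted when "runner" executes it?
 * restrictions_enabled models some way for the superuser to turn the new
 * restrictions off entirely.
 */
bool
must_restrict(const RoleInfo *provider, const RoleInfo *runner,
              bool restrictions_enabled)
{
    assert(provider != NULL && runner != NULL);

    if (!restrictions_enabled)
        return false;           /* superuser disabled the restrictions */
    if (provider->id == runner->id)
        return false;           /* running your own code */
    if (provider->is_superuser)
        return false;           /* superuser can do anything anyway */
    if (role_list_contains(provider->can_act_as, provider->n_can_act_as,
                           runner->id))
        return false;           /* provider could already act as runner */
    if (role_list_contains(runner->trusted, runner->n_trusted, provider->id))
        return false;           /* runner said "I am OK with alice's code" */
    return true;
}
```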

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2023-03-01 at 10:05 -0500, Robert Haas wrote:
> For this reason, these expressions are, by
> default, restricted from doing <things>.

The hard part is defining <things> without resorting to a bunch of
special cases, and also in a way that doesn't break a bunch of stuff.

> You earlier
> mentioned that switching to the table owner seems to be just a way of
> turning SECURITY INVOKER into SECURITY DEFINER in random places, or
> maybe that's not exactly what you said but that's what I took from
> it.

Yeah, though I glossed over some details, see below.

>  If we just slather user context switches
> everywhere, I'm not actually very sure that's going to be
> comprehensible behavior: if my trigger function is SECURITY INVOKER,
> why is it getting executed as me, not the inserting user?

Let's consider other expressions first. The proposal is that all
expressions attached to a table should run as the table owner (set
aside compatibility concerns for a moment). If those expressions call a
SECURITY INVOKER function, the invoker would need to be the table owner
as well. Users could get confused by that, but I think it's
documentable and understandable; and it's really the only thing that
makes sense -- otherwise changing the user is completely useless.

We should consider triggers as just another expression being executed,
and the invoker is the table owner. The problem is that it's a little
annoying to users because they probably defined the function for the
sole purpose of being a trigger function for a single table, and it
might look as though the SECURITY INVOKER label was ignored.

But there is a difference, which I glossed over before: SECURITY
INVOKER on a trigger function would still have significance, because
the function owner (definer) and table owner (invoker) could still be
different in the case of a trigger, just like in an expression.
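
To spell out the difference, here is a small C sketch comparing which role
a trigger function would run as under the behavior described earlier in
this thread and under the proposal. The types and function names
(TriggerCall, effective_role_current, effective_role_proposed) are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int RoleId;    /* stand-in for PostgreSQL's Oid */

typedef enum FuncSecurity
{
    FUNC_SECURITY_INVOKER,
    FUNC_SECURITY_DEFINER
} FuncSecurity;

/* Hypothetical description of one trigger invocation. */
typedef struct TriggerCall
{
    RoleId       table_owner;
    RoleId       function_owner;
    RoleId       dml_user;      /* role performing the INSERT/UPDATE/DELETE */
    FuncSecurity security;
} TriggerCall;

/* Today: the invoker of a trigger function is the user performing the DML. */
RoleId
effective_role_current(const TriggerCall *call)
{
    assert(call != NULL);
    return call->security == FUNC_SECURITY_DEFINER
        ? call->function_owner
        : call->dml_user;
}

/* Proposal: code attached to a table is invoked by the table owner. */
RoleId
effective_role_proposed(const TriggerCall *call)
{
    assert(call != NULL);
    return call->security == FUNC_SECURITY_DEFINER
        ? call->function_owner
        : call->table_owner;
}
```

In this model SECURITY DEFINER behaves the same either way; only the
SECURITY INVOKER case changes, from the DML user to the table owner, and
the function owner can still differ from both.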

This goes back to my point that SECURITY INVOKER is more complex for us
to document and for users to understand. The user *must* understand who
the invoker is in various contexts. That's the situation today and
there's no escaping it. We aren't making things any worse, at least as
long as we can sort out compatibility in a reasonable way.

(Aside: I'm having some serious concerns about how the invoker of a
function called in a view is not the view definer. That's another thing
we'll need to fix, because it's another way of putting SECURITY INVOKER
code in front of someone without them knowing.)

(Aside: We should take a second look at the security invoker views
before we release them. I know that I learned some things during this
discussion and a fresh look might be useful.)

> As soon as you magically turn
> that into a SECURITY DEFINER function, you've provided a way for the
> users performing DML to attack the table owner.

I don't think it's magic, as I said above. But I assume that your more
general point is that if we take some responsibility away from the
invoker and place it on the definer, then it creates room for new kinds
of problems. And I agree.

The point of moving responsibility to the definer is that the definer
can actually do something to protect themselves (nail down search_path,
restrict USAGE privs, and avoid dynamic SQL); whereas the invoker is
nearly helpless.

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Mar 1, 2023 at 1:13 PM Jeff Davis <pgsql@j-davis.com> wrote:
> I don't think it's magic, as I said above. But I assume that your more
> general point is that if we take some responsibility away from the
> invoker and place it on the definer, then it creates room for new kinds
> of problems. And I agree.
>
> The point of moving responsibility to the definer is that the definer
> can actually do something to protect themselves (nail down search_path,
> restrict USAGE privs, and avoid dynamic SQL); whereas the invoker is
> nearly helpless.

I think there's some truth to that allegation, but I think it largely
comes from the fact that we've given very little thought or attention
to this problem. We have a section in the documentation on writing
SECURITY DEFINER functions safely because we've known for a long time
that it's dangerous and we've provided some (imperfect) tools for
dealing with it, like allowing a SET search_path clause to be attached
to a function definition. We have no comparable documentation section
about SECURITY INVOKER because we haven't historically taken that
seriously as a security hazard and we have no tools to make it safe.
But we could, as with what I'm proposing here, or the user/function
trust mechanism previously proposed by Noah, or various other ideas
that we might have.

I don't like the idea of saying that we're not going to try to invent
anything new and just push people into using the stuff we already
have. The stuff we have for securing SECURITY DEFINER functions is not
very good. True, it's better than what we have for protecting against
the risks inherent in SECURITY INVOKER, but that's not saying much:
anything at all is better than nothing. But it's easy to forget a SET
search_path clause on one of your functions, or to include something
in that search path that's not actually safe, or to have a problem
that isn't blocked by just setting search_path. Also, not that it's
the most important consideration here, but putting a SET clause on
your functions is really kind of expensive if what the function does
is trivial, which if you're using it in an index expression or a
default expression, will often be the case. I don't want to pretend
like I have all the answers here, but I find it really hard to believe
that pushing people to do the same kind of nonsense that's currently
required when writing a SECURITY DEFINER function for a lot of their
other functions as well is going to be a win. I think it will probably
suck.

To be fair, it's possible that there's no solution to this class of
problems that *doesn't* suck, but I think we should look a lot harder
before coming to that conclusion. I've come to agree with your
contention that we're not taking the hazards of SECURITY INVOKER
nearly seriously enough, but I think you're underestimating the
hazards that SECURITY DEFINER poses, and overestimating how easy it is
to avoid them.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2023-03-01 at 16:06 -0500, Robert Haas wrote:

> To be fair, it's possible that there's no solution to this class of
> problems that *doesn't* suck, but I think we should look a lot harder
> before coming to that conclusion.

Fair enough. The situation is bad enough that I'm willing to consider a
pretty wide range of solutions and mitigations that might otherwise be
unappealing.

I think there might be something promising in your idea to highly
restrict the privileges of code attached to a table. A lot of
expressions are really simple and don't need much to be both useful and
safe. We may not need the exact same solution for both default
expressions and triggers. Some details to work through, though.

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-02-28 12:36:38 -0800, Jeff Davis wrote:
> On Tue, 2023-02-28 at 11:28 -0800, Andres Freund wrote:
> > I can only repeat myself in stating that SECURITY DEFINER solves none
> > of the
> > relevant issues. I included several examples of why it doesn't in the
> > recent
> > thread about "blocking SECURITY INVOKER". E.g. that default arguments
> > of
> > SECDEF functions are evaluated with the current user's privileges,
> > not the
> > function owner's privs:
> > 
> > https://postgr.es/m/20230113032943.iyxdu7bnxe4cmbld%40awork3.anarazel.de
> 
> I was speaking a bit loosely, using "SECURITY DEFINER" to mean the
> semantics of executing code as the one who wrote it. I didn't
> specifically mean the function marker, because as you pointed out in
> the other thread, that's not enough.

Oh, ok.


> From your email it looks like there is still a path forward:
> 
> "The proposal to not trust any expressions controlled by untrusted
> users at least allows to prevent execution of code, even if it doesn't
> provide a way to execute the code in a safe manner.  Given that we
> don't have the former, it seems foolish to shoot for the latter."
> 
> And later:
> 
> "I think the combination of
> a) a setting that restricts evaluation of any non-trusted expressions,
>    independent of the origin
> b) an easy way to execute arbitrary statements within
>    SECURITY_RESTRICTED_OPERATION"
> 
> My takeaway from that thread was that we need a mechanism to deal with
> non-function code (e.g. default expressions) first; but once we have
> that, it opens up the design space to better solutions or at least
> mitigations. Is that right?

I doubt it's realistic to change the user for all kinds of expressions
individually. A query can involve expressions controlled by many users,
changing the current user in a super granular way seems undesirable from a
performance and complexity pov.

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-02-28 08:37:02 -0500, Robert Haas wrote:
> On Mon, Feb 27, 2023 at 7:37 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > > Yeah. That's the idea I was floating, at least.
> >
> > Isn't that a hard problem; maybe impossible?
>
> It doesn't seem that hard to me; maybe I'm missing something.
>
> The existing SECURITY_RESTRICTED_OPERATION flag basically prevents you
> from tinkering with the session state. If we also had similar flags
> like DATABASE_READS_PROHIBITED and DATABASE_WRITES_PROHIBITED (or just
> a combined DATABASE_ACCESS_PROHIBITED flag) I think that would be
> pretty close to what we need. The idea would be that, when a user
> executes a function or procedure owned by a user that they don't trust
> completely, we'd set
> SECURITY_RESTRICTED_OPERATION|DATABASE_READS_PROHIBITED|DATABASE_WRITES_PROHIBITED.
> And we could provide a user with a way to express the degree of trust
> they have in some other user or perhaps even some specific function,
> e.g.

ISTM that this would require annotating most functions in the system. There
are many problems besides accessing database contents. Just a few examples:

- dblink functions to access another system / looping back
- pg_read_file()/pg_file_write() allows almost arbitrary mischief
- pg_stat_reset[_shared]()
- creating/dropping logical replication slots
- use untrusted PL functions
- many more

A single wrongly annotated function would be sufficient to escape. This
includes user defined functions.


This basically proposes that we can implement a safe sandbox for executing
arbitrary code in a privileged context. IMO history suggests that that's a
hard thing to do.

Am I missing something?

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Mar 1, 2023 at 7:34 PM Andres Freund <andres@anarazel.de> wrote:
> ISTM that this would require annotating most functions in the system. There
> are many problems besides accessing database contents. Just a few examples:
>
> - dblink functions to access another system / looping back
> - pg_read_file()/pg_file_write() allows almost arbitrary mischief
> - pg_stat_reset[_shared]()
> - creating/dropping logical replication slots
> - use untrusted PL functions
> - many more
>
> A single wrongly annotated function would be sufficient to escape. This
> includes user defined functions.
>
> This basically proposes that we can implement a safe sandbox for executing
> arbitrary code in a privileged context. IMO history suggests that that's a
> hard thing to do.

Yeah, that's true, but I don't think switching users all the time is
going to be great either. And it's not like other people haven't gone
this way: that's what plperl (not plperlu) is all about, and
JavaScript running in your browser, and so on. Those things aren't
problem-free, of course, but we're all using them.

When I was initially thinking about this, I thought that maybe we
could just block access to tables and utility statements. That's got
problems in both directions. On the one hand, there are functions like
the ones you propose here that have side effects which we might not
want to allow, and on the other hand, somebody might have an index
expression that does a lookup in a table that they "never change". The
latter case is problematic for non-security reasons, because there's
an invisible dump-ordering constraint that must be obeyed for
dump/restore to work at all, but there's no security issue. Still, I'm
not sure this idea is completely dead in the water. It doesn't seem
unreasonable to me that if you have that kind of case, you have to
somehow opt into the behavior: yeah, I know that index functions I'm
executing are going to read from tables, and I consent to that. And
similarly, if your index expression calls pg_stat_reset_shared(), that
probably ought to be blocked by default too, and if you want to allow
it, you have to say so. Yes, that does require labelling functions, or
maybe putting run-time checks in them:

RequireAvailableCapability(CAP_MODIFY_DATABASE_STATE);

If that capability isn't available in the present context, the call
errors out. That way, it's possible for the required capabilities to
depend on the arguments to the function, and we can change markings in
minor releases without needing catalog changes.
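
A self-contained toy version of that run-time check might look like the
following. Only the names RequireAvailableCapability and
CAP_MODIFY_DATABASE_STATE come from the line above; the other capability
bit, EvaluateRestricted, and reset_shared_stats are made up for this
sketch, and a real backend function would presumably raise an error rather
than return false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability bits for this sketch. */
typedef unsigned int CapabilitySet;
#define CAP_READ_TABLES            (1u << 0)
#define CAP_MODIFY_DATABASE_STATE  (1u << 1)
#define CAP_ALL  (CAP_READ_TABLES | CAP_MODIFY_DATABASE_STATE)

/* Capabilities available in the current execution context. */
static CapabilitySet available_caps = CAP_ALL;

/* Returns false where a real implementation would error out. */
bool
RequireAvailableCapability(CapabilitySet needed)
{
    return (available_caps & needed) == needed;
}

typedef bool (*ExprCallback) (void);

/*
 * Evaluate a callback with a narrowed capability set, then restore the
 * previous set. Capabilities can only be removed here, never added.
 */
bool
EvaluateRestricted(CapabilitySet allowed, ExprCallback expr)
{
    CapabilitySet saved = available_caps;
    bool          ok;

    assert(expr != NULL);
    available_caps &= allowed;
    ok = expr();
    available_caps = saved;
    return ok;
}

/* Stand-in for a function with side effects, e.g. a statistics reset. */
bool
reset_shared_stats(void)
{
    if (!RequireAvailableCapability(CAP_MODIFY_DATABASE_STATE))
        return false;
    /* ... the state-modifying work would happen here ... */
    return true;
}
```

Calling reset_shared_stats() directly succeeds in this model, while
EvaluateRestricted(CAP_READ_TABLES, reset_shared_stats) fails because the
index-expression-like context has dropped CAP_MODIFY_DATABASE_STATE.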

There's another way of thinking about this problem, which involves
supposing that the invoker should only be induced to do things that
the definer could also have done. Basically do every privilege check
twice, and require that both pass. The problem I have with that is
that there are various operations which depend on your identity, not
just your privileges. For instance, a GRANT statement records a
grantor, and a REVOKE statement infers a grantor whose grant is to be
revoked. The operation isn't just allowed or disallowed based on who
performed it -- it actually does something different depending on who
performs it. I believe we have a number of cases like that, and I
think that they suggest that that whole model is pretty flawed. Even
if that were no issue, this also seems extremely complex to implement,
because we have an absolute crap-ton of places that perform privilege
checks and getting all of those places to check privileges as a second
user seems nightmarish. I also think that it might lead to
confusing error messages: alice tried to do X but we're not allowing
it because bob isn't allowed to do X. Eh, what? As opposed to the
sandboxing approach, where I think you get something more like:

ERROR: database state cannot be modified now
DETAIL: The database system is evaluating an index expression.
HINT: Do $SOMETHING to allow this.

I don't want to press too hard on my idea here. I'm sure it has a
bunch of problems apart from those already mentioned, and those
already mentioned are not trivial. However, I do think there might be
ways to make it work, and I'm not at all convinced that trying to
switch users all over the place is going to be better, either for
security or usability. Is there some other whole kind of approach we
can take here that we haven't discussed yet?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Thu, Feb 9, 2023 at 4:46 PM Jacob Champion <jchampion@timescale.com> wrote:
> On 2/6/23 08:22, Robert Haas wrote:
> > I don't think that's quite the right concept. It seems to me that the
> > client is responsible for informing the server of what the situation
> > is, and the server is responsible for deciding whether to allow the
> > connection. In your scenario, the client is not only communicating
> > information ("here's the password I have got") but also making demands
> > on the server ("DO NOT authenticate using anything else"). I like the
> > first part fine, but not the second part.
>
> For what it's worth, making a negative demand during authentication is
> pretty standard: if you visit example.com and it tells you "I need your
> OS login password and Social Security Number to authenticate you," you
> have the option of saying "no thanks" and closing the tab.

No, that's the opposite, and exactly the point I'm trying to make. In
that case, the SERVER says what it's willing to accept, and the CLIENT
decides whether or not to provide that. In your proposal, the client
tells the server which authentication methods to accept.

> In a hypothetical world where the server presented the client with a
> list of authentication options before allowing any access, this would
> maybe be a little less convoluted to solve. For example, a proxy seeing
> a SASL list of
>
> - ANONYMOUS
> - EXTERNAL
>
> could understand that both methods allow the client to assume the
> authority of the proxy itself. So if its client isn't allowed to do
> that, the proxy realizes something is wrong (either it, or its target
> server, has been misconfigured or is under attack), and it can close the
> connection *before* the server runs login triggers.

Yep, that totally makes sense to me, but I don't think it's what you proposed.
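
For illustration, the proxy-side check in that hypothetical world could be
sketched in C as below. The function names are invented, only ANONYMOUS
and EXTERNAL are taken from the quoted list, and the choice to fall back
to any other offered mechanism (rather than refusing outright) is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * True if the mechanism lets the connection succeed on the proxy's own
 * authority rather than on a credential supplied by the proxy's client.
 */
bool
mechanism_uses_proxy_authority(const char *mech)
{
    assert(mech != NULL);
    return strcmp(mech, "ANONYMOUS") == 0 ||
           strcmp(mech, "EXTERNAL") == 0;
}

/*
 * Pick the first offered mechanism the proxy may use on behalf of its
 * client. Returns NULL if none is acceptable, in which case the proxy
 * should close the connection before authenticating.
 */
const char *
pick_mechanism(const char *const *offered, size_t n,
               bool client_may_use_proxy_authority)
{
    for (size_t i = 0; i < n; i++)
    {
        if (client_may_use_proxy_authority ||
            !mechanism_uses_proxy_authority(offered[i]))
            return offered[i];
    }
    return NULL;
}
```

Given the list ANONYMOUS, EXTERNAL and a client that is not allowed to
assume the proxy's authority, pick_mechanism returns NULL and the proxy
would drop the connection.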

> This sounds like a reasonable separation of responsibilities on the
> surface, but I think it's subtly off. The entire confused-deputy problem
> space revolves around the proxy being unable to correctly decide which
> connections to allow unless it also knows why the connections are being
> authorized.

I agree.

> You've constructed an example where that's not a concern: everything's
> symmetrical, all proxies operate with the same authority, and internal
> users are identical to external users. But the CVE that led to the
> password requirement, as far as I can tell, dealt with asymmetry. The
> proxy had the authority to connect locally to a user, and the clients
> had the authority to connect to other machines' users, but those users
> weren't the same and were not mutually trusting.

Yeah, agreed. So, I think the point here is that the proxy
configuration (and pg_hba.conf) need to be sufficiently powerful that
each user can permit the things that make sense in their environment
and block the things that don't.

I don't think we're really very far apart here, but for some reason
the terminology seems to be giving us some trouble. Of course, there's
also the small problem of actually finding the time to do some
meaningful work on this stuff, rather than just talking....

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On Tue, Mar 7, 2023 at 11:04 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Feb 9, 2023 at 4:46 PM Jacob Champion <jchampion@timescale.com> wrote:
> > On 2/6/23 08:22, Robert Haas wrote:
> > > I don't think that's quite the right concept. It seems to me that the
> > > client is responsible for informing the server of what the situation
> > > is, and the server is responsible for deciding whether to allow the
> > > connection. In your scenario, the client is not only communicating
> > > information ("here's the password I have got") but also making demands
> > > on the server ("DO NOT authenticate using anything else"). I like the
> > > first part fine, but not the second part.
> >
> > For what it's worth, making a negative demand during authentication is
> > pretty standard: if you visit example.com and it tells you "I need your
> > OS login password and Social Security Number to authenticate you," you
> > have the option of saying "no thanks" and closing the tab.
>
> No, that's the opposite, and exactly the point I'm trying to make. In
> that case, the SERVER says what it's willing to accept, and the CLIENT
> decides whether or not to provide that. In your proposal, the client
> tells the server which authentication methods to accept.

Ah, that's a (the?) sticking point. In my example, the client doesn't
tell the server which methods to accept. The client tells the server
which method the *client* has the ability to use. (Or, implicitly,
which methods it refuses to use.)

That shouldn't lose any power, security-wise, because the server is
looking for an intersection of the two sets. And the client already
has the power to do that for almost every form of authentication,
except the ambient methods.
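
A minimal sketch of that intersection, in Python. The names are hypothetical and the method strings are only labels; this is not libpq or backend code, and treating "ident" as ambient is an assumption of the sketch:

```python
# Hypothetical model of "the server looks for an intersection of the two sets".
# Methods that need no secret from the client (assumed set for illustration).
AMBIENT = {"trust", "peer", "ident"}


def negotiate(server_accepts, client_offers):
    """Return the methods both sides allow, in the server's preference order."""
    return [m for m in server_accepts if m in client_offers]


# An HBA-like rule set that would let a local connection in via peer auth.
server_accepts = ["peer", "scram-sha-256"]

# A proxy that only holds a user-supplied password offers password methods.
proxy_offers = {"scram-sha-256", "md5", "password"}

usable = negotiate(server_accepts, proxy_offers)

# Nothing ambient survives, because the proxy never offered it.
ambient_selected = [m for m in usable if m in AMBIENT]
```

If the server only accepted "peer" here, the intersection would be empty and the connection would fail instead of silently using the proxy's own identity.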

I don't think I necessarily like that option better than SASL-style,
but hopefully that clarifies it somewhat?

> I don't think we're really very far apart here, but for some reason
> the terminology seems to be giving us some trouble.

Agreed.

> Of course, there's
> also the small problem of actually finding the time to do some
> meaningful work on this stuff, rather than just talking....

Agreed :)

--Jacob



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Wed, Mar 8, 2023 at 2:30 PM Jacob Champion <jchampion@timescale.com> wrote:
> > No, that's the opposite, and exactly the point I'm trying to make. In
> > that case, the SERVER says what it's willing to accept, and the CLIENT
> > decides whether or not to provide that. In your proposal, the client
> > tells the server which authentication methods to accept.
>
> Ah, that's a (the?) sticking point. In my example, the client doesn't
> tell the server which methods to accept. The client tells the server
> which method the *client* has the ability to use. (Or, implicitly,
> which methods it refuses to use.)
>
> That shouldn't lose any power, security-wise, because the server is
> looking for an intersection of the two sets. And the client already
> has the power to do that for almost every form of authentication,
> except the ambient methods.
>
> I don't think I necessarily like that option better than SASL-style,
> but hopefully that clarifies it somewhat?

Hmm, yeah, I guess that's OK. I still don't love it, though. It feels
more solid to me if the proxy can actually block the connections
before they even happen, without having to rely on a server
interaction to figure out what is permissible.

I don't know what you mean by SASL-style, exactly.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-02-07 16:56:55 -0500, Robert Haas wrote:
> On Wed, Feb 1, 2023 at 4:02 PM Andres Freund <andres@anarazel.de> wrote:
> > > +     /* Is the use of a password mandatory? */
> > > +     must_use_password = MySubscription->passwordrequired &&
> > > +             !superuser_arg(MySubscription->owner);
> >
> > There's a few repetitions of this - perhaps worth putting into a helper?
> 
> I don't think so. It's slightly different each time, because it's
> pulling data out of different data structures.
> 
> > This still leaks the connection on error, no?
> 
> I've attempted to fix this in v4, attached.

Hm - it still feels wrong that we error out in case of failure, despite the
comment to the function saying:
 * Returns NULL on error and fills the err with palloc'ed error message.

Other than this, the change looks ready to me.

Greetings,

Andres Freund



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On Wed, Mar 8, 2023 at 11:40 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Mar 8, 2023 at 2:30 PM Jacob Champion <jchampion@timescale.com> wrote:
> > I don't think I necessarily like that option better than SASL-style,
> > but hopefully that clarifies it somewhat?
>
> Hmm, yeah, I guess that's OK.

Okay, cool.

> I still don't love it, though. It feels
> more solid to me if the proxy can actually block the connections
> before they even happen, without having to rely on a server
> interaction to figure out what is permissible.

Sure. I don't see a way for the proxy to figure that out by itself,
though, going back to my asymmetry argument from before. Only the
server truly knows, at time of HBA processing, whether the proxy
itself has authority. If the proxy knew, it wouldn't be confused.

> I don't know what you mean by SASL-style, exactly.

That's the one where the server explicitly names all forms of
authentication, including the ambient ones (ANONYMOUS, EXTERNAL,
etc.), and requires the client to choose one before running any
actions on their behalf. That lets the require_auth machinery work for
this case, too.
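
Roughly, and only as a sketch: ANONYMOUS and EXTERNAL are real SASL mechanism names, but the function, its parameters, and the idea that the proxy knows whether its client may borrow its authority are assumptions made for illustration.

```python
# Hypothetical proxy-side choice under a SASL-style exchange, where the
# server must name every mechanism (ambient ones included) up front.
AMBIENT_MECHANISMS = {"ANONYMOUS", "EXTERNAL"}


def choose_mechanism(advertised, client_may_use_proxy_authority, has_password):
    """Pick the first advertised mechanism the proxy is willing to use.

    Returns None when nothing is acceptable, in which case the proxy would
    close the connection before the server runs anything on its behalf.
    """
    for mech in advertised:
        if mech in AMBIENT_MECHANISMS:
            # Ambient mechanisms lend the proxy's own authority to the client.
            if client_may_use_proxy_authority:
                return mech
        elif mech == "SCRAM-SHA-256" and has_password:
            return mech
    return None
```

With a list of only ANONYMOUS and EXTERNAL and a client that may not borrow the proxy's authority, the function returns None, which corresponds to the "something is wrong, close the connection" case.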

--Jacob



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Wed, Mar 8, 2023 at 5:44 PM Jacob Champion <jchampion@timescale.com> wrote:
> Sure. I don't see a way for the proxy to figure that out by itself,
> though, going back to my asymmetry argument from before. Only the
> server truly knows, at time of HBA processing, whether the proxy
> itself has authority. If the proxy knew, it wouldn't be confused.

That seems like a circular argument. If you call the problem the
confused deputy problem then the issue must indeed be that the deputy
is confused, and needs to talk to someone else to get un-confused. But
why is the deputy necessarily confused in the first place? Our deputy
is confused because our code to decide whether to proxy a connection
or not is super-dumb, but if there's an intrinsic reason it can't be
smarter, I don't understand what it is.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On Thu, Mar 9, 2023 at 6:17 AM Robert Haas <robertmhaas@gmail.com> wrote:
> That seems like a circular argument. If you call the problem the
> confused deputy problem then the issue must indeed be that the deputy
> is confused, and needs to talk to someone else to get un-confused. But
> why is the deputy necessarily confused in the first place? Our deputy
> is confused because our code to decide whether to proxy a connection
> or not is super-dumb,

No, I think our proxy is confused because it doesn't know what power
it has, and it can't tell the server what power it wants to use. That
problem is independent of the decision to proxy. You're suggesting
strengthening the code that makes that decision -- adding an oracle
(in the form of a DBA) that knows about the confusion and actively
mitigates it. That's guaranteed to work if the oracle is perfect,
because "perfect" is somewhat tautologically defined as "whatever
ensures secure operation". But the oracle doesn't reduce the
confusion, and DBAs aren't perfect.

If you want to add a Sheriff Andy to hold Barney Fife's hand [1], that
will absolutely make Barney less of a problem, and I'd like to have
Andy around regardless. But Barney still doesn't know what's going on,
and when Andy makes a mistake, there will still be trouble. I'd like
to teach Barney some useful stuff.

> but if there's an intrinsic reason it can't be
> smarter, I don't understand what it is.

Well... I'm not well-versed enough in this to prove non-existence of a
solution. Can you find a solution, using the current protocol, that
doesn't make use of perfect out-of-band knowledge? We have a client
that will authenticate using any method the server asks it to, even if
its user intended to use something else. And we have a server that can
eagerly skip client authentication, and then eagerly run code on its
behalf, without first asking the client what it's even trying to do.
That would be an inherently hostile environment for *any* proxy, not
just ours.

Thanks,
--Jacob

[1] https://en.wikipedia.org/wiki/The_Andy_Griffith_Show#Premise_and_characters



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Fri, Mar 10, 2023 at 7:00 PM Jacob Champion <jchampion@timescale.com> wrote:
> On Thu, Mar 9, 2023 at 6:17 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > That seems like a circular argument. If you call the problem the
> > confused deputy problem then the issue must indeed be that the deputy
> > is confused, and needs to talk to someone else to get un-confused. But
> > why is the deputy necessarily confused in the first place? Our deputy
> > is confused because our code to decide whether to proxy a connection
> > or not is super-dumb,
>
> No, I think our proxy is confused because it doesn't know what power
> it has, and it can't tell the server what power it wants to use. That
> problem is independent of the decision to proxy. You're suggesting
> strengthening the code that makes that decision -- adding an oracle
> (in the form of a DBA) that knows about the confusion and actively
> mitigates it. That's guaranteed to work if the oracle is perfect,
> because "perfect" is somewhat tautologically defined as "whatever
> ensures secure operation". But the oracle doesn't reduce the
> confusion, and DBAs aren't perfect.

I think this is the root of our disagreement. My understanding of the
previous discussion is that people think that the major problem here
is the wraparound-to-superuser attack. That is, in general, when we
connect to a database over the network, we expect it to
do some kind of active authentication, like asking us for a password,
or asking us for an SSL certificate that isn't just lying around for
anyone to use. However, in the specific case of a local connection, we
have a reliable way of knowing who the remote user is without any kind
of active authentication, namely 'peer' authentication or perhaps even
'trust' if we trust all the local users, and so we don't judge it
unreasonable to allow local connections without any form of active
authentication. There can be some scenarios where even over a network
we can know the identity of the person connecting with complete
certainty, e.g. if endpoints are locked down such that the source IP
address is a reliable indicator of who is initiating the connection,
but in general when there's a network involved you don't know who the
person making the connection is and need to do something extra to
figure it out.

If you accept this characterization of the problem, then I don't think
the oracle is that hard to design. We simply set it up not to allow
wraparound connections, or maybe even more narrowly to not allow
wraparound connections to superuser. If the DBA has some weird network
topology where that's not the correct rule, either because they want
to allow wraparound connections or they want to disallow other things,
then yeah they have to tell us what to allow, but I don't really see
why that's an unreasonable expectation. I'd expect the correct
configuration of the proxy facility to fall naturally out of what's
allowed in pg_hba.conf. If machine A is configured to accept
connections from machines B and C based on environmental factors, then
machines B and C should be configured not to proxy connections to A.
If machines B and C aren't under our control such that we can
configure them that way, then the configuration is fundamentally
insecure in a way that we can't really fix.
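
As a rough illustration of deriving the proxy rule from the HBA side, here is a Python sketch. The (target, source, method) triples are a simplified stand-in, not the real pg_hba.conf format, and the set of "environmental" methods is an assumption:

```python
# Hypothetical: methods that admit a connection based on where it comes
# from rather than on something the connecting user has to present.
ENVIRONMENTAL = {"trust", "peer", "ident"}

# Simplified (target, source, method) rules standing in for each
# machine's pg_hba.conf.
hba = [
    ("A", "B", "trust"),
    ("A", "C", "ident"),
    ("A", "D", "scram-sha-256"),
    ("B", "B", "peer"),  # local "wraparound" connection on B itself
]


def targets_to_block(proxy_host, rules):
    """Targets that proxy_host should refuse to proxy to, because they
    would admit it on environmental factors alone."""
    return sorted(
        {target for (target, source, method) in rules
         if source == proxy_host and method in ENVIRONMENTAL}
    )
```

Under these rules B should refuse to proxy to A and to itself, C should refuse to proxy to A, and D needs no restriction because A asks it for a password.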

I think that what you're proposing is that B and C can just be allowed
to proxy to A and A can say "hey, by the way, I'm just gonna let you
in without asking for anything else" and B and C can, when proxying,
react to that by disconnecting before the connection actually goes
through. That's simpler, in a sense. It doesn't require us to set up
the proxy configuration on B and C in a way that matches what
pg_hba.conf allows on A. Instead, B and C can automatically deduce
what connections they should refuse to proxy. I guess that's nice, but
it feels pretty magical to me. It encourages the DBA not to think
about what B and C should actually be allowed to proxy, and instead
just trust that the automatics are going to prevent any security
disasters. I'm not sure that they always will, and I fear cultivating
too much reliance on them. I think that if you're setting up a network
topology where the correct rule is something more complex than "don't
allow wraparound connections to superuser," maybe you ought to be
forced to spell that rule out instead of letting the system deduce one
that you hope will be right.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Mar 8, 2023 at 2:47 PM Andres Freund <andres@anarazel.de> wrote:
> Hm - it still feels wrong that we error out in case of failure, despite the
> comment to the function saying:
>  * Returns NULL on error and fills the err with palloc'ed error message.

I've amended the comment so that it explains why it's done that way.

> Other than this, the change looks ready to me.

Well, it still needed documentation changes and pg_dump changes. I've
added those in the attached version.

If nobody's too unhappy with the idea, I plan to commit this soon,
both because I think that the feature is useful, and also because I
think it's an important security improvement. Since replication is
currently run as the subscription owner, any table owner can
compromise the subscription owner's account, which is really bad, but
if the subscription owner can be a non-superuser, it's a little bit
less bad. From a security point of view, I think the right thing to do
and what would improve security a lot more is to run replication as
the table owner rather than the subscription owner. I've posted a
patch for that at
http://postgr.es/m/CA+TgmoaSCkg9ww9oppPqqs+9RVqCexYCE6Aq=UsYPfnOoDeFkw@mail.gmail.com
and AFAICT everyone agrees with the idea, even if the patch itself
hasn't yet attracted any code reviews. But although the two patches
are fairly closely related, this seems to be a good idea whether that
moves forward or not, and that seems to be a good idea whether this
moves forward or not. As this one has had more review and discussion,
my current thought is to try to get this one committed first.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2023-03-22 at 12:16 -0400, Robert Haas wrote:
> If nobody's too unhappy with the idea, I plan to commit this soon,
> both because I think that the feature is useful, and also because I
> think it's an important security improvement.

Is there any chance I can convince you to separate the privileges of
using a connection string and creating a subscription, as I
suggested[1] earlier?

It would be useful for dblink, and I also plan to propose CREATE
SUBSCRIPTION ... SERVER for v17 (it was too late for 16), for which it
would also be useful to make the distinction.

You seemed to generally think it was a reasonable idea, but wanted to
wait for the other patch. I think it's the right breakdown of
privileges even now, and I don't see a reason to give ourselves a
headache trying to split up the privileges later.

Regards,
    Jeff Davis

[1]
https://www.postgresql.org/message-id/fa1190c117c2455f2dd968a1a09f796ccef27b29.camel@j-davis.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Wed, Mar 22, 2023 at 3:53 PM Jeff Davis <pgsql@j-davis.com> wrote:
> Is there any chance I can convince you to separate the privileges of
> using a connection string and creating a subscription, as I
> suggested[1] earlier?

What would this amount to concretely? Also adding a
pg_connection_string predefined role and requiring both that and
pg_create_subscription in all cases until your proposed changes get
made?
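
To spell out the two variants being compared, a small Python sketch. pg_create_subscription is the role discussed in this thread; pg_connection_string is only the proposed name, and the function ignores any other checks the patch performs:

```python
def may_create_subscription(roles, require_connection_role):
    """Hypothetical membership check for CREATE SUBSCRIPTION.

    roles: names of the predefined roles the user is a member of.
    require_connection_role: True models the proposal to also require
    the (proposed, not existing) pg_connection_string role.
    """
    needed = {"pg_create_subscription"}
    if require_connection_role:
        needed.add("pg_connection_string")
    return needed <= set(roles)
```

With only pg_create_subscription granted, the first variant allows the command and the second refuses it until the extra role is granted too.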

If so, I don't think that's a good idea. Maybe for some reason your
proposed changes won't end up happening, and then we've just got a
useless extra thing that makes things confusing. I think that adding a
pg_connection_string privilege properly belongs to whatever patch
makes it possible to separate the connection string from the
subscription, and that we probably shouldn't add those even in
separate commits, let alone in separate major releases.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Thu, 2023-03-23 at 11:52 -0400, Robert Haas wrote:
> What would this amount to concretely? Also adding a
> pg_connection_string predefined role and requiring both that and
> pg_create_subscription [to CREATE SUBSCRIPTION]

Yes.

> If so, I don't think that's a good idea. Maybe for some reason your
> proposed changes won't end up happening, and then we've just got a
> useless extra thing that makes things confusing.

Even if my changes don't happen, I would find it less confusing, and
users would be more likely to understand what they're doing.

To most users, the consequences of allowing users to write connection
strings on the server are far from obvious. Even we, as developers,
needed to spend a lot of time discussing the nuances.

Someone merely granting the ability to CREATE SUBSCRIPTION would read
that page in the docs, which is dominated by the mechanics of a
subscription and says little about the connection string, let alone the
security nuances of using it on a server.

But if there is also a separate connection string privilege required,
we can document it better and they are more likely to find it and
understand.

Beyond that, the connection string and the mechanics of the
subscription are really different concepts.

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Mar 23, 2023 at 1:41 PM Jeff Davis <pgsql@j-davis.com> wrote:
> Even if my changes don't happen, I would find it less confusing, and
> users would be more likely to understand what they're doing.

I respectfully but firmly disagree. I think having two predefined
roles that are both required to create a subscription and neither of
which allows you to do anything other than create a subscription is
intrinsically confusing. I'm not willing to commit a patch that works
like that, and I will object if someone else wants to do so.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Wed, 2023-03-22 at 12:16 -0400, Robert Haas wrote:
> I've posted a
> patch for that at
> http://postgr.es/m/CA+TgmoaSCkg9ww9oppPqqs+9RVqCexYCE6Aq=UsYPfnOoDeFkw@mail.gmail.com
> and AFAICT everyone agrees with the idea, even if the patch itself
> hasn't yet attracted any code reviews. But although the two patches
> are fairly closely related, this seems to be a good idea whether that
> moves forward or not, and that seems to be a good idea whether this
> moves forward or not. As this one has had more review and discussion,
> my current thought is to try to get this one committed first.

The current patch (non-superuser-subscriptions) is the most user-facing
aspect, and it seems wrong to commit it before we have the security
model in a reasonable place. As you pointed out[1], it's not in a
reasonable place now, so encouraging more use seems like a bad idea.

The other patch you posted seems like it makes a lot of progress in
that direction, and I think that should go in first. That was one of
the items I suggested previously[2], so thank you for working on that.

Regards,
    Jeff Davis


[1]
https://www.postgresql.org/message-id/CA%2BTgmoavSQVcvEW3ZgZ7a1Q-TJ-fp0%2BNt7W3D7FCawArtTCBCQ%40mail.gmail.com

[2]
https://www.postgresql.org/message-id/27c557b12a590067c5e00588009447bb5bb2dd42.camel@j-davis.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Mar 24, 2023 at 3:17 AM Jeff Davis <pgsql@j-davis.com> wrote:
> The current patch (non-superuser-subscriptions) is the most user-facing
> aspect, and it seems wrong to commit it before we have the security
> model in a reasonable place. As you pointed out[1], it's not in a
> reasonable place now, so encouraging more use seems like a bad idea.

I certainly agree that the security model isn't in a reasonable place
right now. However, I feel that:

(1) adding an extra predefined role doesn't really help, because it
doesn't actually do anything in and of itself, it only prepares for
future work, and

(2) even adding the connection string security stuff that you're
proposing doesn't really help, because (2a) connection string security
is almost completely separate from the internal security
considerations addressed in the message to which you linked, and (2b)
in my opinion, there will be a lot of people who won't use that
connection string security stuff even if we had it, possibly even a
large majority of people won't use it, because it responds to a
specific use case which I think a lot of people don't have, and

(3) I don't agree either that this patch would encourage more use of
logical replication or that it would be bad if it did. I mean, there
could be someone who knows about this patch and will hesitate to
deploy logical replication if it doesn't get committed, or maybe
slightly more likely, won't be able to do so if this patch doesn't get
committed because they're running in a cloud environment. But probably
not. Cloud providers are already hacking around this problem,
Microsoft included. As a community, we're better off having a standard
solution in core than having every vendor hack it their own way. And
outside of a cloud environment, there's not really any reason for the
lack of this patch to make a potential user hesitate. Also, features
getting used is a thing that I think we should all want. If logical
replication is in such a bad state that we think people should be
using it, we should rip it out until the issues are fixed. I don't
think anyone would seriously propose that such a course of action is
advisable. So the alternative is to make it better.

To reiterate what I think the most important point here is, both Azure
and AWS already let you do this. EDB's own cloud offering is also
going to let you do this, whether this change goes in or not. But if
this patch gets committed, then eventually all of those vendors and
whatever others are out there will let you do this in the same way,
i.e. pg_create_subscription, instead of every vendor having their own
patch to the code that does what this patch does through some method
that is specific to that cloud vendor. That sort of fragmentation of
the ecosystem is not good for anyone, AFAICS.

> The other patch you posted seems like it makes a lot of progress in
> that direction, and I think that should go in first. That was one of
> the items I suggested previously[2], so thank you for working on that.

Perhaps you could review that work?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Mar 24, 2023 at 9:24 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > The other patch you posted seems like it makes a lot of progress in
> > that direction, and I think that should go in first. That was one of
> > the items I suggested previously[2], so thank you for working on that.
>
> Perhaps you could review that work?

Ah, you already did. Thanks.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On 3/20/23 09:32, Robert Haas wrote:
> I think this is the root of our disagreement.

Agreed.

> My understanding of the
> previous discussion is that people think that the major problem here
> is the wraparound-to-superuser attack. That is, in general, when we
> connect to a database over the network, we expect it to
> do some kind of active authentication, like asking us for a password,
> or asking us for an SSL certificate that isn't just lying around for
> anyone to use. However, in the specific case of a local connection, we
> have a reliable way of knowing who the remote user is without any kind
> of active authentication, namely 'peer' authentication or perhaps even
> 'trust' if we trust all the local users, and so we don't judge it
> unreasonable to allow local connections without any form of active
> authentication. There can be some scenarios where even over a network
> we can know the identity of the person connecting with complete
> certainty, e.g. if endpoints are locked down such that the source IP
> address is a reliable indicator of who is initiating the connection,
> but in general when there's a network involved you don't know who the
> person making the connection is and need to do something extra to
> figure it out.

Okay, but this is walking back from the network example you just
described upthread. Do you still consider that in scope, or...?

> If you accept this characterization of the problem,

I'm not going to say yes or no just yet, because I don't understand your
rationale for where to draw the lines.

If you just want the bare minimum thing that will solve the localhost
case, require_auth landed this week. Login triggers are not yet a thing,
so `require_auth=password,md5,scram-sha-256` ensures active
authentication. You don't even have to disallow localhost connections,
as far as I can tell; they'll work as intended.
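
A simplified Python model of that check, for illustration only. It is not libpq's implementation: the function name is made up, "none" stands for the server asking for no authentication at all, and rejecting a mix of negated and plain entries is how this sketch chooses to behave:

```python
def auth_allowed(require_auth, method_used):
    """Would a require_auth-style list accept the method the server asked for?

    require_auth: comma-separated list such as "password,md5,scram-sha-256",
                  or a negated list such as "!none".
    method_used:  the method the server requested; "none" means it let the
                  client in without asking for anything.
    """
    items = [s.strip() for s in require_auth.split(",") if s.strip()]
    negated = [i[1:] for i in items if i.startswith("!")]
    positive = [i for i in items if not i.startswith("!")]
    if negated and positive:
        raise ValueError("cannot mix negated and non-negated methods")
    if positive:
        return method_used in positive
    # Only negated entries (or an empty list): anything not excluded passes.
    return method_used not in negated
```

With the list above, a server that skips authentication ("none") is refused, while one that runs SCRAM is accepted, which is the "active authentication" guarantee in question.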

If you think login triggers will get in for PG16, my bigger proposal
can't help in time. But if you're drawing the line at "environmental
HBAs are fundamentally unsafe and you shouldn't use them if you have a
proxy," why can't I instead draw the line at "login triggers are
fundamentally unsafe and you shouldn't use them if you have a proxy"?

And if you want to handle the across-the-network case, too, then I don't
accept the characterization of the problem.

> then I don't think
> the oracle is that hard to design. We simply set it up not to allow
> wraparound connections, or maybe even more narrowly to not allow
> wraparound connections to superuser. If the DBA has some weird network
> topology where that's not the correct rule, either because they want
> to allow wraparound connections or they want to disallow other things,
> then yeah they have to tell us what to allow, but I don't really see
> why that's an unreasonable expectation.

This seems like a security model that has been carefully gerrymandered
around the existing implementation. My argument is that the "weird
network topology" isn't weird at all, and it's only dangerous because of
decisions we made (and can unmake).

I feel pretty strongly that the design arrow needs to be pointed in the
opposite direction. The model needs to be chosen first, to prevent us
from saying, "We defend against whatever the implementation lets us
defend against today. Good luck, DBAs."

> If machines B and C aren't under our control such that we can
> configure them that way, then the configuration is fundamentally
> insecure in a way that we can't really fix.

Here's probably our biggest point of contention. You're unlikely to
convince me that this is the DBA's fault.

If machines B and C aren't under our control, then our *protocol* is
fundamentally insecure in a way that we have the ability to fix, in a
way that's already been characterized in security literature.

> I think that what you're proposing is that B and C can just be allowed
> to proxy to A and A can say "hey, by the way, I'm just gonna let you
> in without asking for anything else" and B and C can, when proxying,
> react to that by disconnecting before the connection actually goes
> through. That's simpler, in a sense. It doesn't require us to set up
> the proxy configuration on B and C in a way that matches what
> pg_hba.conf allows on A. Instead, B and C can automatically deduce
> what connections they should refuse to proxy.

Right. It's meant to take the "localhost/wraparound connection" out of a
class of special things we have to worry about, and make it completely
boring.

> I guess that's nice, but
> it feels pretty magical to me. It encourages the DBA not to think
> about what B and C should actually be allowed to proxy, and instead
> just trust that the automatics are going to prevent any security
> disasters.

I agree magical behavior is dangerous, if what you think it can do
doesn't match up with what it can actually do. Bugs are always possible,
and maybe I'm just not seeing a corner case yet, because I'm talking too
much and not coding it -- but is this really a case where I'm
overpromising? Or does it just feel magical because it's meant to fix
the root issue?

(Remember, I'm not arguing against your proxy filter; I just want both.
They complement each other.)

> I'm not sure that they always will, and I fear cultivating
> too much reliance on them.

I can't really argue against this... but I'm not really sure anyone could.

My strawman rephrasing of that is, "we have to make the feature crappy
enough that we can blame the DBA when things go wrong." And even that
strawman could be perfectly reasonable, in situations where the DBA
necessarily has more information than the machine. In this case, though,
it seems to me that the two machines have all the information necessary
to make a correct decision between them.

Thanks!
--Jacob



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Fri, 2023-03-24 at 09:24 -0400, Robert Haas wrote:
> I certainly agree that the security model isn't in a reasonable place
> right now. However, I feel that:
>
> (1) adding an extra predefined role

> (2) even adding the connection string security stuff

I don't see how these points are related to the question of whether you
should commit your non-superuser-subscription-owners patch or logical-
repl-as-table-owner patch first.


My perspective is that logical replication is an unfinished feature
with an incomplete design. As I said earlier, that's why I backed away
from trying to do non-superuser subscriptions as a documented feature:
it feels like we need to settle some of the underlying pieces first.

There are some big issues, like the security model for replaying
changes. And some smaller issues like feature gaps (RLS doesn't work,
if I remember correctly, and maybe something with partitioning). There
are potential clashes with other proposals, like the CREATE
SUBSCRIPTION ... SERVER, which I hope can be sorted out later. And I
don't feel like I have a good handle on the publisher security model
and threats, which hopefully is just a matter of documenting some best
practices.

Each time we dig into one of these issues I learn something, and I
think others do, too. If we skip past that process and start adding new
features on top of this unfinished design, then I think we are setting
ourselves up for trouble that is going to be harder to fix later.

I don't mean to say all of the above issues are blockers or that they
should all be resolved in my favor. But there are enough issues and
some of those issues are serious enough that I feel like it's premature
to just go ahead with the non-superuser subscriptions and the
predefined role.

There are already users, which complicates things. And you make a good
point that some important users may be already working around the
flaws. But there's already a patch and discussion going on for some
security model improvements (thanks to you), so let's try to get that
one in first. If we can't, it's probably because we learned something
important.

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On 2023-03-25 12:16:35 -0700, Jeff Davis wrote:
> On Fri, 2023-03-24 at 09:24 -0400, Robert Haas wrote:
> > I certainly agree that the security model isn't in a reasonable place
> > right now. However, I feel that:
> > 
> > (1) adding an extra predefined role
> 
> > (2) even adding the connection string security stuff
> 
> I don't see how these points are related to the question of whether you
> should commit your non-superuser-subscription-owners patch or logical-
> repl-as-table-owner patch first.
> 
> 
> My perspective is that logical replication is an unfinished feature
> with an incomplete design.

I agree with that much.


>  As I said earlier, that's why I backed away from trying to do non-superuser
> subscriptions as a documented feature: it feels like we need to settle some
> of the underlying pieces first.

I don't agree. The patch allows logical rep to be used in a far less dangerous
fashion than now. The alternative is to release 16 without a real way to use
logical rep less insanely, which I think is worse.


> There are some big issues, like the security model for replaying
> changes.

That seems largely unrelated.


> And some smaller issues like feature gaps (RLS doesn't work,
> if I remember correctly, and maybe something with partitioning).

Entirely unrelated?

Greetings,

Andres Freund



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Sat, Mar 25, 2023 at 3:16 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2023-03-24 at 09:24 -0400, Robert Haas wrote:
> > I certainly agree that the security model isn't in a reasonable place
> > right now. However, I feel that:
> >
> > (1) adding an extra predefined role
>
> > (2) even adding the connection string security stuff
>
> I don't see how these points are related to the question of whether you
> should commit your non-superuser-subscription-owners patch or logical-
> repl-as-table-owner patch first.

I thought you were asking for those changes to be made before this
patch got committed, so that's what I was responding to. If you're
asking for it not to be committed at all, that's a different
discussion.

> My perspective is that logical replication is an unfinished feature
> with an incomplete design. As I said earlier, that's why I backed away
> from trying to do non-superuser subscriptions as a documented feature:
> it feels like we need to settle some of the underlying pieces first.

I kind of agree with you about the feature itself. Even though the
basic feature works quite well and does something people really want,
there are a lot of loose ends to sort out, and not just about
security. But I also want to make some progress. If there are problems
with what I'm proposing that will make us regret committing things
right before feature freeze, then we shouldn't. But waiting a whole
additional year to see any kind of improvement is not free; these
issues are serious.

> I don't mean to say all of the above issues are blockers or that they
> should all be resolved in my favor. But there are enough issues and
> some of those issues are serious enough that I feel like it's premature
> to just go ahead with the non-superuser subscriptions and the
> predefined role.
>
> There are already users, which complicates things. And you make a good
> point that some important users may be already working around the
> flaws. But there's already a patch and discussion going on for some
> security model improvements (thanks to you), so let's try to get that
> one in first. If we can't, it's probably because we learned something
> important.

I think this patch is a lot better-baked and less speculative than
that one. I think that patch is more important, so if they were
equally mature, I'd favor getting that one committed first. But that's
not the case.

Also, I don't really understand how we could end up not wanting this
patch. I mean there's a lot of things I don't understand that are
still true anyway, so the mere fact that I don't understand how we
could not end up wanting this patch doesn't mean that it couldn't
happen. But like, the current state of play is that subscription
owners are always going to be superusers at the time the subscription
is created, and literally nobody thinks that's a good idea. Some
people (like me) think that we ought to assume that subscription
owners will be and need to be high-privilege users like superusers,
but to my knowledge every such person thinks that it's OK for the
subscription owner to be a non-superuser if they have adequate
privileges. I just think that's a high amount of privileges, not that
it has to be all the privileges i.e. superuser. Other people (like
you, AIUI) think that we ought to try to set things up so that
subscription owners can be low-privilege users, in which case we, once
again, don't want the user who owns the subscription to start out a
superuser. I actually can't imagine anyone defending the idea of
having the subscription owner always be a superuser at the time they
first own the subscription. That's a weird rule that can only serve to
reduce security. Nor can I imagine anyone saying that forcing
subscriptions to be created only by superusers improves security. I
don't think anyone thinks that.

If we're going to delay this patch, probably for a full year, because
of other ongoing discussions, it should be because there is some
outcome of those discussions that would involve deciding that this
patch isn't needed or should be significantly redesigned. If this
patch is going to end up being desirable no matter how those
discussions turn out, and if it's not going to change significantly no
matter how those discussions turn out, then those discussions aren't a
reason not to get it into this release.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2023-03-27 at 10:46 -0700, Andres Freund wrote:
> > There are some big issues, like the security model for replaying
> > changes.
>
> That seems largely unrelated.

They are self-evidently related in a fundamental way. The behavior of
the non-superuser-subscription patch depends on the presence of the
apply-as-table-owner patch.

I think I'd like to understand the apply-as-table-owner patch better to
understand the interaction.

Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Mon, 2023-03-27 at 14:06 -0400, Robert Haas wrote:
> I thought you were asking for those changes to be made before this
> patch got committed, so that's what I was responding to. If you're
> asking for it not to be committed at all, that's a different
> discussion.

I separately had a complaint (in a separate subthread) about the scope
of the predefined role you are introducing, which I think encompasses
two concepts that should be treated differently and I think that may
need to be revisited later. If you ignore this complaint it wouldn't be
the end of the world.

This subthread is about the order in which the patches get committed
(which is a topic you brought up), not whether they are ever to be
committed.

>
> I kind of agree with you about the feature itself. Even though the
> basic feature works quite well and does something people really want,
> there are a lot of loose ends to sort out, and not just about
> security. But I also want to make some progress. If there are
> problems
> with what I'm proposing that will make us regret committing things
> right before feature freeze, then we shouldn't. But waiting a whole
> additional year to see any kind of improvement is not free; these
> issues are serious.

The non-superuser-subscription-owner patch without the apply-as-table-
owner patch feels like a facade to me, at least right now. Perhaps I
can be convinced otherwise, but that's what it looks like to me.

>
> I think this patch is a lot better-baked and less speculative than
> that one. I think that patch is more important, so if they were
> equally mature, I'd favor getting that one committed first. But
> that's
> not the case.

You explicitly asked about the order of the patches, which made me
think it was more of an option?

If the apply-as-table-owner patch gets held up for whatever reason, we
might have to make a difficult decision. I'd prefer to focus on the apply-
as-table-owner patch briefly, and now that it's getting some review
attention, we might find out how ready it is quite soon.


Regards,
    Jeff Davis




Re: Non-superuser subscription owners

From
Jeff Davis
Date:
On Fri, 2023-03-24 at 00:17 -0700, Jeff Davis wrote:
> The other patch you posted seems like it makes a lot of progress in
> that direction, and I think that should go in first. That was one of
> the items I suggested previously[2], so thank you for working on
> that.

The above is not a hard objection.

I still hold the opinion that the non-superuser subscriptions work
feels premature without the apply-as-table-owner work. It would be
great if the other patch ends up ready quickly, which would moot the
commit-ordering question.

Regards,
    Jeff Davis




Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Robert Haas
Date:
On Fri, Mar 24, 2023 at 5:47 PM Jacob Champion <jchampion@timescale.com> wrote:
> Okay, but this is walking back from the network example you just
> described upthread. Do you still consider that in scope, or...?

Sorry, I don't know which example you mean.

> > If machines B and C aren't under our control such that we can
> > configure them that way, then the configuration is fundamentally
> > insecure in a way that we can't really fix.
>
> Here's probably our biggest point of contention. You're unlikely to
> convince me that this is the DBA's fault.
>
> If machines B and C aren't under our control, then our *protocol* is
> fundamentally insecure in a way that we have the ability to fix, in a
> way that's already been characterized in security literature.

I guess I wouldn't have a problem blaming the DBA here, but you seem
to be telling me that the security literature has settled on another
kind of approach, and I'm not in a position to dispute that. It still
feels weird to me, though.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Tue, Mar 28, 2023 at 1:52 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2023-03-24 at 00:17 -0700, Jeff Davis wrote:
> > The other patch you posted seems like it makes a lot of progress in
> > that direction, and I think that should go in first. That was one of
> > the items I suggested previously[2], so thank you for working on
> > that.
>
> The above is not a hard objection.

The other patch is starting to go in a direction that is going to have
some conflicts with this one, so I went ahead and committed this one
to avoid rebasing pain.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Stephen Frost
Date:
Greetings,

* Jacob Champion (jchampion@timescale.com) wrote:
> On 3/20/23 09:32, Robert Haas wrote:
> > I think this is the root of our disagreement.
>
> Agreed.

I've read all the way back to the $SUBJECT change to try and get an
understanding of the questions here and it's not been easy. In part, I
think, that's due to the verbiage, but also perhaps to the lack of
concrete examples, with references to other systems and protocols in
their place.

> > My understanding of the
> > previous discussion is that people think that the major problem here
> > is the wraparound-to-superuser attack. That is, in general, we expect
> > that when we connect to a database over the network, we expect it to
> > do some kind of active authentication, like asking us for a password,
> > or asking us for an SSL certificate that isn't just lying around for
> > anyone to use. However, in the specific case of a local connection, we
> > have a reliable way of knowing who the remote user is without any kind
> > of active authentication, namely 'peer' authentication or perhaps even
> > 'trust' if we trust all the local users, and so we don't judge it
> > unreasonable to allow local connections without any form of active
> > authentication. There can be some scenarios where even over a network
> > we can know the identity of the person connecting with complete
> > certainty, e.g. if endpoints are locked down such that the source IP
> > address is a reliable indicator of who is initiating the connection,
> > but in general when there's a network involved you don't know who the
> > person making the connection is and need to do something extra to
> > figure it out.
>
> Okay, but this is walking back from the network example you just
> described upthread. Do you still consider that in scope, or...?

The concern around the network certainly needs to be in-scope overall.

> > If you accept this characterization of the problem,
>
> I'm not going to say yes or no just yet, because I don't understand your
> rationale for where to draw the lines.
>
> If you just want the bare minimum thing that will solve the localhost
> case, require_auth landed this week. Login triggers are not yet a thing,
> so `require_auth=password,md5,scram-sha-256` ensures active
> authentication. You don't even have to disallow localhost connections,
> as far as I can tell; they'll work as intended.

I do think require_auth helps us move in a positive direction.  As I
mentioned elsewhere, I don't think we highlight it nearly enough in the
postgres_fdw documentation.  Let's look at that in a bit more depth with
concrete examples and perhaps everyone will be able to get a bit more
understanding of the issues.

Client is psql
Proxy is some PG server that's got postgres_fdw
Target is another PG server, that is being connected to from Proxy
Authentication is via GSS/Kerberos with proxied credentials

What do we want to require the user to configure to make this secure?

Proxy's pg_hba configured to require GSS auth from Client.
Target's pg_hba configured to require GSS auth from Proxy.

Who are we trusting with what?  In particular, I'd argue that the user
who is able to install the postgres_fdw extension and the user who is
able to issue the CREATE SERVER are largely trusted; at least in so far
as the user doing CREATE SERVER is allowed to create the server and
through that allowed to make outbound connections from the Proxy.

Therefore, the Proxy is configured with postgres_fdw and with a trusted
user performing the CREATE SERVER.

What doesn't this handle today?  Connection side-effects are one
problem- once the CREATE SERVER is done, any user with USAGE rights on
the server can create a USER MAPPING for themselves, either with a
password or without one (if they're able to proxy GSS credentials to the
system).  They aren't able to set password_required though, which
defaults to true.  However, without having require_auth set, they're
able to cause the Proxy to reach an authentication stage with the Target
that might not match what credentials they're supposed to be providing.

We attempt to address this by checking, post-auth to Target, that we
connected using the credentials we expected to use: if GSS credentials
were proxied, then we expect to have used those.  If a password was
provided, then we expect to have used a password to auth (only checked
after we see if GSS credentials were proxied and used).  The issue here
is the 'post-auth' bit; we'd prefer to fail the connection pre-auth if
it isn't what we're expecting.  Should we then explicitly set
require_auth=gss when GSS credentials are proxied?  Also, if a password
is provided, then explicitly set require_auth=scram-sha-256?  Or default
to these, at least, and allow the CREATE SERVER user to override our
choices?  Or should it be a USER MAPPING option that's restricted?  Or
not?

> > I think that what you're proposing is that B and C can just be allowed
> > to proxy to A and A can say "hey, by the way, I'm just gonna let you
> > in without asking for anything else" and B and C can, when proxying,
> > react to that by disconnecting before the connection actually goes
> > through. That's simpler, in a sense. It doesn't require us to set up
> > the proxy configuration on B and C in a way that matches what
> > pg_hba.conf allows on A. Instead, B and C can automatically deduce
> > what connections they should refuse to proxy.
>
> Right. It's meant to take the "localhost/wraparound connection" out of a
> class of special things we have to worry about, and make it completely
> boring.

Again, trying to get at a more concrete example- the concern here is a
user with CREATE SERVER ability could leverage that access to become a
superuser if the system is configured with 'peer' access, right?  A
non-superuser is already prevented from being able to set
"password_required=false", perhaps we shouldn't allow them to set
"require_auth=none" (or have that effect) either?  Perhaps the system
should simply forcibly set require_auth based on the credentials
provided in the USER MAPPING or on the connection and have require_auth
otherwise restricted to superuser (who could override it if they'd
really like to)?  Perhaps if password_required=false we implicitly
un-set require_auth, to avoid having to make superusers change their
existing configurations where they've clearly already accepted that
credential-less connections are allowed.

Automatically setting require_auth and restricting the ability of it to
be set on user mappings to superusers doesn't strike me as terribly
difficult to do and seems like it'd prevent this concern.
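
As a rough sketch of the "automatically set require_auth" idea discussed above, the decision could look something like the following. This is a toy model, not postgres_fdw code: ToyUserMapping, ToyAuthPolicy, and both functions are hypothetical names. The values "gss" and "scram-sha-256" are the ones floated in this message, and the GSS-before-password ordering mirrors the existing post-auth check described earlier. The "refuse" branch for a mapping that carries no credentials at all is purely an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of one user mapping; not postgres_fdw's real structs. */
typedef struct ToyUserMapping
{
    bool gss_delegated;      /* GSS credentials were proxied from the client */
    bool has_password;       /* a password is stored in the mapping */
    bool password_required;  /* superuser-only option, defaults to true */
} ToyUserMapping;

typedef enum ToyAuthPolicy
{
    TOY_POLICY_UNSET,   /* leave require_auth alone */
    TOY_POLICY_GSS,     /* force require_auth=gss */
    TOY_POLICY_SCRAM,   /* force require_auth=scram-sha-256 */
    TOY_POLICY_REFUSE   /* assumption: nothing to verify, so do not connect */
} ToyAuthPolicy;

static ToyAuthPolicy
toy_choose_policy(const ToyUserMapping *um)
{
    /* A superuser already accepted credential-less connections here. */
    if (!um->password_required)
        return TOY_POLICY_UNSET;
    /* Proxied GSS credentials take precedence over a stored password. */
    if (um->gss_delegated)
        return TOY_POLICY_GSS;
    if (um->has_password)
        return TOY_POLICY_SCRAM;
    return TOY_POLICY_REFUSE;
}

/* Map a policy to the require_auth value to force, or NULL for none. */
static const char *
toy_policy_to_require_auth(ToyAuthPolicy policy)
{
    switch (policy)
    {
        case TOY_POLICY_GSS:
            return "gss";
        case TOY_POLICY_SCRAM:
            return "scram-sha-256";
        case TOY_POLICY_UNSET:
        case TOY_POLICY_REFUSE:
            break;
    }
    return NULL;
}
```

Keeping the policy choice separate from the string it maps to leaves room for a superuser override to be layered on top, which is one of the open questions above.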

Just to make sure I'm following- Robert's up-thread suggestion of an
'outbound pg_hba' would be an additional restriction when it comes to
what a user who can use CREATE SERVER is allowed to do?  I'm not against
the idea of having a way to lock that down.. but it's another level of
complication certainly and I'm not sure that some external config file
or such is the best way to try and deal with that, though I do see how
it can have some appeal for certain environments.  It does overall
strike me as something we've not tried to address in any way thus far
and a pretty large effort that's not likely to make it into PG16, unlike
the possibility of auto-setting require_auth, now that it exists.

Thanks!

Stephen

Attachment

RE: Non-superuser subscription owners

From
"houzj.fnst@fujitsu.com"
Date:
On Friday, March 31, 2023 12:05 AM Robert Haas <robertmhaas@gmail.com> wrote:

Hi,

> 
> On Tue, Mar 28, 2023 at 1:52 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > On Fri, 2023-03-24 at 00:17 -0700, Jeff Davis wrote:
> > > The other patch you posted seems like it makes a lot of progress in
> > > that direction, and I think that should go in first. That was one of
> > > the items I suggested previously[2], so thank you for working on
> > > that.
> >
> > The above is not a hard objection.
> 
> The other patch is starting to go in a direction that is going to have some
> conflicts with this one, so I went ahead and committed this one to avoid
> rebasing pain.

I noticed that the BF [1] reported a core dump after this commit.

#0  0xfd581864 in _lwp_kill () from /usr/lib/libc.so.12
#1  0xfd5817dc in raise () from /usr/lib/libc.so.12
#2  0xfd581c88 in abort () from /usr/lib/libc.so.12
#3  0x01e6c8d4 in ExceptionalCondition (conditionName=conditionName@entry=0x2007758 "IsTransactionState()",
    fileName=fileName@entry=0x20565c4 "catcache.c", lineNumber=lineNumber@entry=1208) at assert.c:66
#4  0x01e4e404 in SearchCatCacheInternal (cache=0xfd21e500, nkeys=nkeys@entry=1, v1=v1@entry=28985, v2=v2@entry=0,
    v3=v3@entry=0, v4=v4@entry=0) at catcache.c:1208
#5  0x01e4eea0 in SearchCatCache1 (cache=<optimized out>, v1=v1@entry=28985) at catcache.c:1162
#6  0x01e66e34 in SearchSysCache1 (cacheId=cacheId@entry=11, key1=key1@entry=28985) at syscache.c:825
#7  0x01e98c40 in superuser_arg (roleid=28985) at superuser.c:70
#8  0x01c657bc in ApplyWorkerMain (main_arg=<optimized out>) at worker.c:4552
#9  0x01c1ceac in StartBackgroundWorker () at bgworker.c:861
#10 0x01c23be0 in do_start_bgworker (rw=<optimized out>) at postmaster.c:5762
#11 maybe_start_bgworkers () at postmaster.c:5986
#12 0x01c2459c in process_pm_pmsignal () at postmaster.c:5149
#13 ServerLoop () at postmaster.c:1770
#14 0x01c26cdc in PostmasterMain (argc=argc@entry=4, argv=argv@entry=0xffffe0e4) at postmaster.c:1463
#15 0x01ee2c8c in main (argc=4, argv=0xffffe0e4) at main.c:200

It looks like the super user check is out of a transaction, I haven't checked why
it only failed on one BF animal, but it seems we can put the check into the
transaction like the following:

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 6fd674b5d6..08f10fc331 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -4545,12 +4545,13 @@ ApplyWorkerMain(Datum main_arg)
         replorigin_session_setup(originid, 0);
         replorigin_session_origin = originid;
         origin_startpos = replorigin_session_get_progress(false);
-        CommitTransactionCommand();
 
         /* Is the use of a password mandatory? */
         must_use_password = MySubscription->passwordrequired &&
             !superuser_arg(MySubscription->owner);
 
+        CommitTransactionCommand();
+

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2023-03-30%2019%3A41%3A08

Best Regards,
Hou Zhijie

Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Mar 30, 2023 at 9:49 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
> It looks like the super user check is out of a transaction, I haven't checked why
> it only failed on one BF animal, but it seems we can put the check into the
> transaction like the following:

That looks like a reasonable fix but I can't reproduce the problem
locally. I thought the reason why that machine sees the problem might
be that it uses -DRELCACHE_FORCE_RELEASE, but I tried that option here
and the tests still pass. Does anyone have ideas on how to reproduce it?

--
Robert Haas
EDB: http://www.enterprisedb.com



RE: Non-superuser subscription owners

From
"houzj.fnst@fujitsu.com"
Date:
On Saturday, April 1, 2023 4:00 AM Robert Haas <robertmhaas@gmail.com> wrote:

Hi,

> 
> On Thu, Mar 30, 2023 at 9:49 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > It looks like the super user check is out of a transaction, I haven't
> > checked why it only failed on one BF animal, but it seems we can put
> > the check into the transaction like the following:
> 
> That looks like a reasonable fix but I can't reproduce the problem locally. I
> thought the reason why that machine sees the problem might be that it uses
> -DRELCACHE_FORCE_RELEASE, but I tried that option here and the tests still pass.
> Does anyone have ideas on how to reproduce it?

I think it's a timing problem. The superuser_arg() function caches the
roleid that was passed in last time, so it does not always search the syscache
and hit the Assert() check. In the regression test, the roleid cache happened
to be invalidated before the superuser_arg() call by some concurrent ROLE
change (maybe in subscription.sql or publication.sql).
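
The interaction described above can be pictured with a small toy model in C. This is not the PostgreSQL source: ToyBackend and the toy_* functions are hypothetical stand-ins, and the "catalog" is a hard-coded rule. The model only illustrates why a cached answer hides the problem and why an invalidation arriving at the wrong moment exposes it.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical per-backend state; the real server keeps this in static variables. */
typedef struct ToyBackend
{
    bool in_transaction;        /* stands in for IsTransactionState() */
    Oid  last_roleid;           /* one-entry cache of the last role checked */
    bool last_roleid_is_super;
    int  catalog_lookups;       /* how many times the "syscache" was searched */
    int  unsafe_lookups;        /* searches attempted outside a transaction */
} ToyBackend;

/* Stand-in for the catalog: in this sketch only role 10 is a superuser. */
static bool
toy_catalog_is_super(Oid roleid)
{
    return roleid == 10;
}

static bool
toy_superuser_arg(ToyBackend *backend, Oid roleid)
{
    /* Fast path: same role as last time, so no catalog access is needed. */
    if (backend->last_roleid != InvalidOid && backend->last_roleid == roleid)
        return backend->last_roleid_is_super;

    /* Slow path: in the real server this is where the Assert would fire. */
    backend->catalog_lookups++;
    if (!backend->in_transaction)
        backend->unsafe_lookups++;

    backend->last_roleid = roleid;
    backend->last_roleid_is_super = toy_catalog_is_super(roleid);
    return backend->last_roleid_is_super;
}

/* Stand-in for the invalidation triggered by a concurrent role change. */
static void
toy_invalidate_role_cache(ToyBackend *backend)
{
    backend->last_roleid = InvalidOid;
}
```

In this model, a check made after the transaction has ended is harmless as long as the cache still holds the role, which would match the failure showing up on only one buildfarm animal. Once the cache is invalidated in between, the same call has to go to the catalog outside a transaction.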

I can reproduce it by using gdb and starting another session to change the ROLE.

When the apply worker starts, use gdb to block the apply worker in the
transaction before the superuser check. Then start another session and run
ALTER ROLE to invalidate the roleid cache in superuser_arg(), which will cause
the apply worker to search the syscache and hit the Assert().

--
        origin_startpos = replorigin_session_get_progress(false);
B*        CommitTransactionCommand();

        /* Is the use of a password mandatory? */
        must_use_password = MySubscription->passwordrequired &&
            ! superuser_arg(MySubscription->owner);
--

Best Regards,
Hou zj

Re: Non-superuser subscription owners

From
Alexander Lakhin
Date:
Hello Robert,

31.03.2023 23:00, Robert Haas wrote:
> That looks like a reasonable fix but I can't reproduce the problem
> locally. I thought the reason why that machine sees the problem might
> be that it uses -DRELCACHE_FORCE_RELEASE, but I tried that option here
> and the tests still pass. Does anyone have ideas on how to reproduce it?

I've managed to reproduce it using the following script:
for ((i=1;i<=10;i++)); do
echo "iteration $i"
echo "
CREATE ROLE sub_user;
CREATE SUBSCRIPTION testsub CONNECTION 'dbname=db'
  PUBLICATION testpub WITH (connect = false);
ALTER SUBSCRIPTION testsub ENABLE;
DROP SUBSCRIPTION testsub;
SELECT pg_sleep(0.001);
DROP ROLE sub_user;
" | psql
psql -c "ALTER SUBSCRIPTION testsub DISABLE;"
psql -c "ALTER SUBSCRIPTION testsub SET (slot_name = NONE);"
psql -c "DROP SUBSCRIPTION testsub;"
grep 'TRAP' server.log && break
done

iteration 3
CREATE ROLE
...
ALTER SUBSCRIPTION
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
       This probably means the server terminated abnormally
       before or while processing the request.
connection to server was lost
TRAP: failed Assert("IsTransactionState()"), File: "catcache.c", Line: 1208, PID: 1001242

Best regards,
Alexander

Re: Non-superuser subscription owners

From
Andres Freund
Date:
Hi,

On April 1, 2023 9:00:00 AM PDT, Alexander Lakhin <exclusion@gmail.com> wrote:
>Hello Robert,
>
>31.03.2023 23:00, Robert Haas wrote:
>> That looks like a reasonable fix but I can't reproduce the problem
>> locally. I thought the reason why that machine sees the problem might
>> be that it uses -DRELCACHE_FORCE_RELEASE, but I tried that option here
>> and the tests still pass. Does anyone have ideas on how to reproduce it?
>
>I've managed to reproduce it using the following script:
>for ((i=1;i<=10;i++)); do
>echo "iteration $i"
>echo "
>CREATE ROLE sub_user;
>CREATE SUBSCRIPTION testsub CONNECTION 'dbname=db'
>  PUBLICATION testpub WITH (connect = false);
>ALTER SUBSCRIPTION testsub ENABLE;
>DROP SUBSCRIPTION testsub;
>SELECT pg_sleep(0.001);
>DROP ROLE sub_user;
>" | psql
>psql -c "ALTER SUBSCRIPTION testsub DISABLE;"
>psql -c "ALTER SUBSCRIPTION testsub SET (slot_name = NONE);"
>psql -c "DROP SUBSCRIPTION testsub;"
>grep 'TRAP' server.log && break
>done
>
>iteration 3
>CREATE ROLE
>...
>ALTER SUBSCRIPTION
>WARNING:  terminating connection because of crash of another server process
>DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>HINT:  In a moment you should be able to reconnect to the database and repeat your command.
>server closed the connection unexpectedly
>       This probably means the server terminated abnormally
>       before or while processing the request.
>connection to server was lost
>TRAP: failed Assert("IsTransactionState()"), File: "catcache.c", Line: 1208, PID: 1001242

Errors like that are often easier to reproduce with clobber caches (or whatever the name is these days) enabled.

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Sat, Apr 1, 2023 at 12:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> I've managed to reproduce it using the following script:
> for ((i=1;i<=10;i++)); do
> echo "iteration $i"
> echo "
> CREATE ROLE sub_user;
> CREATE SUBSCRIPTION testsub CONNECTION 'dbname=db'
>   PUBLICATION testpub WITH (connect = false);
> ALTER SUBSCRIPTION testsub ENABLE;
> DROP SUBSCRIPTION testsub;
> SELECT pg_sleep(0.001);
> DROP ROLE sub_user;
> " | psql
> psql -c "ALTER SUBSCRIPTION testsub DISABLE;"
> psql -c "ALTER SUBSCRIPTION testsub SET (slot_name = NONE);"
> psql -c "DROP SUBSCRIPTION testsub;"
> grep 'TRAP' server.log && break
> done

After a bit of experimentation this repro worked for me -- I needed
-DRELCACHE_FORCE_RELEASE as well, and a bigger iteration count. I
verified that the patch fixed it, and committed the patch with the
addition of a comment.

Thanks very much for this repro, and likewise many thanks to Hou
Zhijie for the report and patch.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Mar 30, 2023 at 9:35 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Mar 28, 2023 at 1:52 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > On Fri, 2023-03-24 at 00:17 -0700, Jeff Davis wrote:
> > > The other patch you posted seems like it makes a lot of progress in
> > > that direction, and I think that should go in first. That was one of
> > > the items I suggested previously[2], so thank you for working on
> > > that.
> >
> > The above is not a hard objection.
>
> The other patch is starting to go in a direction that is going to have
> some conflicts with this one, so I went ahead and committed this one
> to avoid rebasing pain.
>

Do we need to have a check for this new option "password_required" in
maybe_reread_subscription() where we "Exit if any parameter that
affects the remote connection was changed."? This new option is
related to the remote connection so I thought it is worth considering
whether we want to exit and restart the apply worker when this option
is changed.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Sat, Apr 8, 2023 at 1:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Do we need to have a check for this new option "password_required" in
> maybe_reread_subscription() where we "Exit if any parameter that
> affects the remote connection was changed."? This new option is
> related to the remote connection so I thought it is worth considering
> whether we want to exit and restart the apply worker when this option
> is changed.

Hmm, good question. I think that's probably a good idea. If the
current connection is already working, the only possible result of
getting rid of it and trying to create a new one is that it might now
fail instead, but someone might want that behavior. Otherwise, they'd
instead find the failure at a later, maybe less convenient, time.
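
A toy model of the check under discussion may make the question concrete. This is a sketch in Python, not the C code of the apply worker; the `Subscription` fields, the `CONNECTION_FIELDS` tuple and the `needs_restart` helper are illustrative assumptions. It simply treats password_required as one more setting that affects the remote connection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    # Hypothetical subset of subscription settings; not the real catalog struct.
    conninfo: str
    slot_name: str
    password_required: bool = True


# Settings assumed (for this sketch) to affect the remote connection.
CONNECTION_FIELDS = ("conninfo", "slot_name", "password_required")


def needs_restart(old: Subscription, new: Subscription) -> bool:
    """Return True if any connection-affecting setting differs."""
    return any(getattr(old, f) != getattr(new, f) for f in CONNECTION_FIELDS)


old = Subscription("dbname=db", "testsub")
new = Subscription("dbname=db", "testsub", password_required=False)
print(needs_restart(old, old))
print(needs_restart(old, new))
```

In this model, flipping password_required alone is enough to request a restart, so a connection that would no longer be permitted fails right away rather than at some later reconnect.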

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Mon, Apr 10, 2023 at 9:15 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Apr 8, 2023 at 1:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Do we need to have a check for this new option "password_required" in
> > maybe_reread_subscription() where we "Exit if any parameter that
> > affects the remote connection was changed."? This new option is
> > related to the remote connection so I thought it is worth considering
> > whether we want to exit and restart the apply worker when this option
> > is changed.
>
> Hmm, good question. I think that's probably a good idea. If the
> current connection is already working, the only possible result of
> getting rid of it and trying to create a new one is that it might now
> fail instead, but someone might want that behavior. Otherwise, they'd
> instead find the failure at a later, maybe less convenient, time.
>

I think additionally, we should check that the new owner of the
subscription is not a superuser, otherwise, anyway, this parameter is
ignored. Please find the attached to add this check.

--
With Regards,
Amit Kapila.

Attachment

Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Tue, Apr 11, 2023 at 5:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> I think additionally, we should check that the new owner of the
> subscription is not a superuser, otherwise, anyway, this parameter is
> ignored. Please find the attached to add this check.

I don't see why we should check that. It makes this different from all
the other cases and I don't see any benefit.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Tue, Apr 11, 2023 at 8:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Apr 11, 2023 at 5:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > I think additionally, we should check that the new owner of the
> > subscription is not a superuser, otherwise, anyway, this parameter is
> > ignored. Please find the attached to add this check.
>
> I don't see why we should check that. It makes this different from all
> the other cases and I don't see any benefit.
>

I thought it would be better if we don't restart the worker unless it
is required. In case, the subscription's owner is a superuser, the
'password_required' is ignored, so why restart the apply worker when
somebody changes it in such a case? I understand that there may not be
a need to change the 'password_required' option when the
subscription's owner is the superuser but one may first choose to
change the password_required flag and then the owner of a subscription
to a non-superuser. Anyway, I don't think as such there is any problem
with restarting the worker even when the subscription owner is a
superuser, so adjusted the check accordingly.

--
With Regards,
Amit Kapila.

Attachment

Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Tue, Apr 11, 2023 at 10:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Anyway, I don't think as such there is any problem
> with restarting the worker even when the subscription owner is a
> superuser, so adjusted the check accordingly.

LGTM. I realize we could do more sophisticated things here, but I
think it's better to keep the code simple.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On 3/30/23 05:58, Robert Haas wrote:
> On Fri, Mar 24, 2023 at 5:47 PM Jacob Champion <jchampion@timescale.com> wrote:
>> Okay, but this is walking back from the network example you just
>> described upthread. Do you still consider that in scope, or...?
> 
> Sorry, I don't know which example you mean.

The symmetrical proxy situation you described, where all the proxies are
mutually trusting. While it's easier to secure that setup than the
asymmetrical ones, it's also not a localhost-only situation anymore, and
the moment you open up to other machines is where I think your
characterization runs into trouble.

> I guess I wouldn't have a problem blaming the DBA here, but you seem
> to be telling me that the security literature has settled on another
> kind of approach, and I'm not in a position to dispute that. It still
> feels weird to me, though.

If it helps, [1] is a paper that helped me wrap my head around some of
it. It's focused on capability systems and an academic audience, but the
"Avoiding Confused Deputy Problems" section starting on page 11 is a
good place to jump to for the purposes of this discussion.

--Jacob

[1] https://srl.cs.jhu.edu/pubs/SRL2003-02.pdf



Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

From
Jacob Champion
Date:
On 3/30/23 11:13, Stephen Frost wrote:
>> Okay, but this is walking back from the network example you just
>> described upthread. Do you still consider that in scope, or...?
> 
> The concern around the network certainly needs to be in-scope overall.

Sounds good!

> Who are we trusting with what?  In particular, I'd argue that the user
> who is able to install the postgres_fdw extension and the user who is
> able to issue the CREATE SERVER are largely trusted; at least in so far
> as the user doing CREATE SERVER is allowed to create the server and
> through that allowed to make outbound connections from the Proxy.
> 
> Therefore, the Proxy is configured with postgres_fdw and with a trusted
> user performing the CREATE SERVER.
> 
> What doesn't this handle today?  Connection side-effects are one
> problem- once the CREATE SERVER is done, any user with USAGE rights on
> the server can create a USER MAPPING for themselves, either with a
> password or without one (if they're able to proxy GSS credentials to the
> system).  They aren't able to set password_required though, which
> defaults to true.  However, without having require_auth set, they're
> able to cause the Proxy to reach an authentication stage with the Target
> that might not match what credentials they're supposed to be providing.
> 
> We attempt to address this by checking post-auth to Target that we used
> the credentials to connect that we expected to- if GSS credentials were
> proxied, then we expect to use those.  If a password was provided then
> we expect to use a password to auth (only checked after we see if GSS
> credentials were proxied and used).  The issue here is 'post-auth' bit,
> we'd prefer to fail the connection pre-auth if it isn't what we're
> expecting.

Right. Keep in mind that require_auth is post-auth, though; it can't fix
that issue, so it doesn't fix any connection side-effect problems at all.

> Should we then explicit set require_auth=gss when GSS
> credentials are proxied?  Also, if a password is provided, then
> explicitly set require_auth=scram-sha-256?  Or default to these, at
> least, and allow the CREATE SERVER user to override our choices?  Or
> should it be a USER MAPPING option that's restricted?  Or not?

IMO, yes -- whatever credentials the proxy is forwarding from the user,
the proxy should be checking that the server has actually used them. The
person with the ability to create a USER MAPPING should probably not
have the ability to override that check.

>>> I think that what you're proposing is that B and C can just be allowed
>>> to proxy to A and A can say "hey, by the way, I'm just gonna let you
>>> in without asking for anything else" and B and C can, when proxying,
>>> react to that by disconnecting before the connection actually goes
>>> through. That's simpler, in a sense. It doesn't require us to set up
>>> the proxy configuration on B and C in a way that matches what
>>> pg_hba.conf allows on A. Instead, B and C can automatically deduce
>>> what connections they should refuse to proxy.
>>
>> Right. It's meant to take the "localhost/wraparound connection" out of a
>> class of special things we have to worry about, and make it completely
>> boring.
> 
> Again, trying to get at a more concrete example- the concern here is a
> user with CREATE SERVER ability could leverage that access to become a
> superuser if the system is configured with 'peer' access, right?

Or 'trust localhost', or 'ident [postgres user]', yes.

> A
> non-superuser is already prevented from being able to set
> "password_required=false", perhaps we shouldn't allow them to set
> "require_auth=none" (or have that effect) either?

I think that sounds reasonable.

> Perhaps the system
> should simply forcibly set require_auth based on the credentials
> provided in the USER MAPPING or on the connection and have require_auth
> otherwise restricted to superuser (who could override it if they'd
> really like to)?  Perhaps if password_required=false we implicitly
> un-set require_auth, to avoid having to make superusers change their
> existing configurations where they've clearly already accepted that
> credential-less connections are allowed.

Mm, I think I like the first idea better. If you've set a password,
wouldn't you like to know if the server ignored it? If password_required
is false, *and* you don't have a password, then we can drop require_auth
without issue.
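
As a rough sketch of that first idea, assuming the proxy knows which credentials it is forwarding: the function below is a hypothetical Python model, not postgres_fdw code. The `derive_require_auth` name and its parameters are made up for illustration, and raising an error when a password is required but absent is an assumption of the sketch.

```python
from typing import Optional


def derive_require_auth(gss_proxied: bool, has_password: bool,
                        password_required: bool) -> Optional[str]:
    """Pick a require_auth value from the forwarded credentials (sketch)."""
    if gss_proxied:
        # Proxied GSS credentials are checked first.
        return "gss"
    if has_password:
        # A password was supplied, so insist that the server actually used it,
        # even when password_required is false.
        return "scram-sha-256"
    if not password_required:
        # No credentials and none demanded: nothing to verify.
        return None
    # Assumption of this sketch: refuse rather than connect unchecked.
    raise ValueError("password_required is set but no credentials were given")


print(derive_require_auth(True, False, True))
print(derive_require_auth(False, True, False))
print(derive_require_auth(False, False, False))
```

The point of the middle case is that a supplied password always produces a check; require_auth is dropped only when there is no password and password_required is false.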

> Automatically setting require_auth and restricting the ability of it to
> be set on user mappings to superusers doesn't strike me as terribly
> difficult to do and seems like it'd prevent this concern.
> 
> Just to make sure I'm following- Robert's up-thread suggestion of an
> 'outbound pg_hba' would be an additional restriction when it comes to
> what a user who can use CREATE SERVER is allowed to do?

Yes. That can provide additional safety in the case where you really
need to take the require_auth checks away for whatever reason. I think
it's just a good in-depth measure, and if we don't extend the protocol
in some way to do a pre-auth check, it's also the way for the DBA to
bless known-good connection paths.

Thanks,
--Jacob



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Wed, Apr 12, 2023 at 5:50 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Apr 11, 2023 at 10:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Anyway, I don't think as such there is any problem
> > with restarting the worker even when the subscription owner is a
> > superuser, so adjusted the check accordingly.
>
> LGTM.
>

Thanks. I am away for a few days so can push it only next week.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Apr 13, 2023 at 8:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 12, 2023 at 5:50 PM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Tue, Apr 11, 2023 at 10:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Anyway, I don't think as such there is any problem
> > > with restarting the worker even when the subscription owner is a
> > > superuser, so adjusted the check accordingly.
> >
> > LGTM.
> >
>
> Thanks. I am away for a few days so can push it only next week.
>

Pushed. I noticed that we didn't display this new subscription option
'password_required' in \dRs+:

postgres=# \dRs+
      List of subscriptions
 Name |  Owner   | Enabled | Publication | Binary | Streaming | Two-phase commit | Disable on error | Origin | Run as Owner? | Synchronous commit |    Conninfo     | Skip LSN
Is that intentional? Sorry, if it was discussed previously because I
haven't followed this discussion in detail.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Thu, Apr 20, 2023 at 1:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Pushed. I noticed that we didn't display this new subscription option
> 'password_required' in \dRs+:
>
> postgres=# \dRs+
>
>       List of subscriptions
>  Name |  Owner   | Enabled | Publication | Binary | Streaming |
> Two-phase commit | Disable on error | Origin | Run as Owner? |
> Synchronous commit |    Conninfo     | Skip LSN
>
> Is that intentional? Sorry, if it was discussed previously because I
> haven't followed this discussion in detail.

No, I don't think that's intentional. I just didn't think about it.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
vignesh C
Date:
On Fri, 21 Apr 2023 at 01:49, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Apr 20, 2023 at 1:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Pushed. I noticed that we didn't display this new subscription option
> > 'password_required' in \dRs+:
> >
> > postgres=# \dRs+
> >
> >       List of subscriptions
> >  Name |  Owner   | Enabled | Publication | Binary | Streaming |
> > Two-phase commit | Disable on error | Origin | Run as Owner? |
> > Synchronous commit |    Conninfo     | Skip LSN
> >
> > Is that intentional? Sorry, if it was discussed previously because I
> > haven't followed this discussion in detail.
>
> No, I don't think that's intentional. I just didn't think about it.

Here is a patch to display Password required with \dRs+ command. Also
added one test to describe subscription when password_required is
false, as all the existing tests were there only for password_required
as true.

Regards,
Vignesh

Attachment

Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Fri, Apr 21, 2023 at 12:30 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 21 Apr 2023 at 01:49, Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Thu, Apr 20, 2023 at 1:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Pushed. I noticed that we didn't display this new subscription option
> > > 'password_required' in \dRs+:
> > >
> > > postgres=# \dRs+
> > >
> > >       List of subscriptions
> > >  Name |  Owner   | Enabled | Publication | Binary | Streaming |
> > > Two-phase commit | Disable on error | Origin | Run as Owner? |
> > > Synchronous commit |    Conninfo     | Skip LSN
> > >
> > > Is that intentional? Sorry, if it was discussed previously because I
> > > haven't followed this discussion in detail.
> >
> > No, I don't think that's intentional. I just didn't think about it.
>
> Here is a patch to display Password required with \dRs+ command. Also
> added one test to describe subscription when password_required is
> false, as all the existing tests were there only for password_required
> as true.
>

LGTM. Let's see if Robert or others have any comments, otherwise, I'll
push this early next week.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Robert Haas
Date:
On Fri, Apr 21, 2023 at 8:19 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> LGTM. Let's see if Robert or others have any comments, otherwise, I'll
> push this early next week.

LGTM too.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Fri, Apr 21, 2023 at 6:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Apr 21, 2023 at 8:19 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > LGTM. Let's see if Robert or others have any comments, otherwise, I'll
> > push this early next week.
>
> LGTM too.
>

Pushed.

--
With Regards,
Amit Kapila.



RE: Non-superuser subscription owners

From
"Zhijie Hou (Fujitsu)"
Date:
On Tuesday, April 4, 2023 1:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
> 
> On Sat, Apr 1, 2023 at 12:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> > I've managed to reproduce it using the following script:
> > for ((i=1;i<=10;i++)); do
> > echo "iteration $i"
> > echo "
> > CREATE ROLE sub_user;
> > CREATE SUBSCRIPTION testsub CONNECTION 'dbname=db'
> >   PUBLICATION testpub WITH (connect = false);
> > ALTER SUBSCRIPTION testsub ENABLE;
> > DROP SUBSCRIPTION testsub;
> > SELECT pg_sleep(0.001);
> > DROP ROLE sub_user;
> > " | psql
> > psql -c "ALTER SUBSCRIPTION testsub DISABLE;"
> > psql -c "ALTER SUBSCRIPTION testsub SET (slot_name = NONE);"
> > psql -c "DROP SUBSCRIPTION testsub;"
> > grep 'TRAP' server.log && break
> > done
> 
> After a bit of experimentation this repro worked for me -- I needed
> -DRELCACHE_FORCE_RELEASE as well, and a bigger iteration count. I verified
> that the patch fixed it, and committed the patch with the addition of a
> comment.

Thanks for pushing!

While testing this, I found a similar problem in table sync worker,
as we also invoke superuser_arg() in table sync worker which is not in a
transaction.

LogicalRepSyncTableStart
...
    /* Is the use of a password mandatory? */
    must_use_password = MySubscription->passwordrequired &&
        !superuser_arg(MySubscription->owner);

#0  0x00007f18bb55aaff in raise () from /lib64/libc.so.6
#1  0x00007f18bb52dea5 in abort () from /lib64/libc.so.6
#2  0x0000000000b69a22 in ExceptionalCondition (conditionName=0xda4338 "IsTransactionState()", fileName=0xda403e "catcache.c", lineNumber=1208) at assert.c:66
#3  0x0000000000b4842a in SearchCatCacheInternal (cache=0x27cab80, nkeys=1, v1=10, v2=0, v3=0, v4=0) at catcache.c:1208
#4  0x0000000000b48329 in SearchCatCache1 (cache=0x27cab80, v1=10) at catcache.c:1162
#5  0x0000000000b630c7 in SearchSysCache1 (cacheId=11, key1=10) at syscache.c:825
#6  0x0000000000b982e3 in superuser_arg (roleid=10) at superuser.c:70

I can reproduce this via gdb following similar steps in [1].

I think we need to move this call into a transaction as well and here is an attempt
to do that.

[1]
https://www.postgresql.org/message-id/OS0PR01MB5716E596E4FB83DE46F592FE948C9%40OS0PR01MB5716.jpnprd01.prod.outlook.com

Best Regards,
Hou zj

Attachment

Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Fri, May 12, 2023 at 3:28 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:
>
>
> I can reproduce this via gdb following similar steps in [1].
>
> I think we need to move this call into a transaction as well and here is an attempt
> to do that.
>

I am able to reproduce this issue following the steps mentioned by you
and the proposed patch to fix the issue looks good to me.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Tue, Jun 13, 2023 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, May 12, 2023 at 3:28 PM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> >
> > I can reproduce this via gdb following similar steps in [1].
> >
> > I think we need to move this call into a transaction as well and here is an attempt
> > to do that.
> >
>
> I am able to reproduce this issue following the steps mentioned by you
> and the proposed patch to fix the issue looks good to me.
>

I'll push this tomorrow unless there are any suggestions or comments.

--
With Regards,
Amit Kapila.



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Tue, Jun 13, 2023 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, May 12, 2023 at 3:28 PM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> >
> > I can reproduce this via gdb following similar steps in [1].
> >
> > I think we need to move this call into a transaction as well and here is an attempt
> > to do that.
> >
>
> I am able to reproduce this issue following the steps mentioned by you
> and the proposed patch to fix the issue looks good to me.
>

Today, again looking at the patch, it seems to me that it would be
better if we can fix this without starting a new transaction. Won't it
be better if we move this syscache lookup to the place where we fetch
relstate (GetSubscriptionRelState()) a few lines above? I understand
that by doing that, in some cases, such as when copy_data = false, we
may do this lookup unnecessarily, but OTOH, starting a new transaction
just for a syscache lookup (superuser_arg()) also doesn't seem like a
good idea to me.

--
With Regards,
Amit Kapila.



RE: Non-superuser subscription owners

From
"Zhijie Hou (Fujitsu)"
Date:
On Wednesday, June 14, 2023 10:11 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Tue, Jun 13, 2023 at 2:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, May 12, 2023 at 3:28 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > >
> > > I can reproduce this via gdb following similar steps in [1].
> > >
> > > I think we need to move this call into a transaction as well and
> > > here is an attempt to do that.
> > >
> >
> > I am able to reproduce this issue following the steps mentioned by you
> > and the proposed patch to fix the issue looks good to me.
> >
> 
> Today, again looking at the patch, it seems to me that it would be better if we
> can fix this without starting a new transaction. Won't it be better if we move this
> syscall to a place where we are fetching relstate (GetSubscriptionRelState()) a
> few lines above? I understand by doing that in some cases like when copy_data
> = false, we may do this syscall unnecessarily but OTOH, starting a new
> transaction just for a syscall (superuser_arg()) also doesn't seem like a good
> idea to me.

Makes sense to me, here is the updated patch which does the same.

Best Regards,
Hou zj

Attachment

Re: Non-superuser subscription owners

From
Alvaro Herrera
Date:
On 2023-Jun-13, Amit Kapila wrote:

> I'll push this tomorrow unless there are any suggestions or comments.

Note the proposed commit message is wrong about which commit is to blame
for the original problem -- it mentions e7e7da2f8d57 twice, but one of
them is actually c3afe8cf5a1e.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/



Re: Non-superuser subscription owners

From
Amit Kapila
Date:
On Thu, Jun 15, 2023 at 11:18 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2023-Jun-13, Amit Kapila wrote:
>
> > I'll push this tomorrow unless there are any suggestions or comments.
>
> Note the proposed commit message is wrong about which commit is to blame
> for the original problem -- it mentions e7e7da2f8d57 twice, but one of
> them is actually c3afe8cf5a1e.
>

Right, I also noticed this and changed it before pushing, See
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b5c517379a40fa1af84c0852aa3730a5875a6482

--
With Regards,
Amit Kapila.



sandboxing untrusted code

From
Robert Haas
Date:
On Mon, Feb 27, 2023 at 7:37 PM Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2023-02-27 at 16:13 -0500, Robert Haas wrote:
> > On Mon, Feb 27, 2023 at 1:25 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > > I think you are saying that we should still run Alice's code with
> > > the
> > > privileges of Bob, but somehow make that safe(r) for Bob. Is that
> > > right?
> >
> > Yeah. That's the idea I was floating, at least.
>
> Isn't that a hard problem; maybe impossible?

I want to flesh out the ideas I previously articulated in this area a bit more.

As a refresher, the scenario I'm talking about is any one in which one
user, who I'll call Bob, does something that results in executing code
provided by another user, who I'll call Alice. The most obvious way
that this can happen is if Bob performs some operation that targets a
table owned by Alice. That operation might be DML, like an INSERT or
UPDATE; or it might be some other kind of maintenance command that can
cause code execution, like REINDEX, which can evaluate index
expressions. The code being executed might be run either as Alice or
as Bob, depending on how it's been attached to the table and what
operation is being performed and maybe whether some function or
procedure that might contain it is SECURITY INVOKER or SECURITY
DEFINER. Regardless of the details, our concern is that Alice's code
might do something that Bob does not like. This is a particularly
lively concern if the code happens to be running with the privileges
of Bob, because then Alice might try to do something like access
objects for which Bob has permissions and Alice does not. But the
problems don't completely go away if the code is being run as Alice,
because even then, Alice could try to manipulate the session state in
some way that will cause Bob to hose himself later on. The existing
SECURITY_RESTRICTED_OPERATION flag defends against some scenarios of
this type, but at present we also rely heavily on Bob being *very*
careful, as Jeff has highlighted rather compellingly.

I think we can do better, both in the case where Bob is running code
provided by Alice using his own permissions, and also in the case
where Bob is running code provided by Alice using Alice's permissions.
To that end, I'd like to define a few terms. First, let's define the
provider of a piece of code as either (a) the owner of the function or
procedure that contains it or (b) the owner of the object to which
it's directly attached or (c) the session user, for code directly
entered at top level. For example, if Alice owns a table T1 and
applies a default expression which uses a function provided by
Charlie, and Bob then inserts into T1, then Bob provides the insert
statement, Alice provides the default expression, and Charlie provides
the code inside the function. I assert that in every context where
PostgreSQL evaluates expressions or runs SQL statements, there's a
well-defined provider for the expression or statement, and we can make
the system track it if we want to. Second, I'd like to define trust.
Users trust themselves, and they also trust users who have a superset
of their permissions, a category that most typically just includes
superusers but could include others if role grants are in use. A user
can also declare through some mechanism or other that they trust
another user even if that other user does not have a superset of their
permissions. Such a declaration carries the risk that the trusted user
could hijack the trusting user's permissions; we would document and
disclaim this risk.
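
To make the trust definition concrete, here is a minimal sketch in Python. It is only a model of the definition above, not a proposed implementation: the `Role` class, the use of a flat privilege set, and the `explicit_trust` field standing in for "some mechanism or other" are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    privileges: frozenset = frozenset()   # hypothetical flat privilege set
    superuser: bool = False
    explicit_trust: set = field(default_factory=set)  # names declared trusted


def trusts(truster: Role, other: Role) -> bool:
    """Does `truster` trust `other` under the definition sketched above?"""
    if truster.name == other.name or other.superuser:
        return True
    if other.name in truster.explicit_trust:
        return True
    # A superuser's effective privileges cannot be covered by a non-superuser.
    return (not truster.superuser) and truster.privileges <= other.privileges


alice = Role("alice", frozenset({"alice_tables"}))
bob = Role("bob", frozenset({"bob_tables"}))
carol = Role("carol", frozenset({"bob_tables", "carol_tables"}))
admin = Role("admin", superuser=True)

print(trusts(bob, alice))   # no overlap in privileges, no declaration
print(trusts(bob, carol))   # carol holds a superset of bob's privileges
print(trusts(bob, admin))   # superusers are trusted by everyone
```

Adding "alice" to `bob.explicit_trust` models the explicit declaration, and is the one case where Bob accepts the risk of Alice hijacking his permissions.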

Finally, let's define sandboxing. When code is sandboxed, the set of
operations that it is allowed to perform is restricted. Sandboxing
isn't necessarily all or nothing; there can be different categories of
operations and we can allow some and deny others, if we wish.
Obviously this is quite a bit of work to implement, but I don't think
it's unmanageable. YMMV. To keep things simple for purposes of
discussion, I'm going to just define two levels of sandboxing for the
moment; I think we might want more. If code is fully sandboxed, it can
only do the following things:

1. Compute stuff. There's no restriction on the permissible amount of
compute; if you call untrusted code, nothing prevents it from running
forever.
2. Call other code. This may be done by a function call or a command
such as CALL or DO, all subject to the usual permissions checks but no
further restrictions.
3. Access the current session state, without modifying it. For
example, executing SHOW or current_setting() is fine.
4. Transiently modify the current session state in ways that are
necessarily reversed before returning to the caller. For example, an
EXCEPTION block or a configuration change driven by proconfig is fine.
5. Produce messages at any log level. This includes any kind of ERROR.

Fully sandboxed code can't access or modify data beyond what gets
passed to it, with the exception of the session state mentioned above.
This includes data inside of PostgreSQL, like tables or statistics, as
well as data outside of PostgreSQL, like files that it might try to
read by calling pg_read_file(). If it tries, an ERROR occurs.

Partially sandboxed code is much less restricted. Partially sandboxed
code can do almost anything that unsandboxed code can do, but with one
important exception: it can't modify the session state. This means it
can't run commands like CLOSE, DEALLOCATE, DECLARE, DISCARD, EXECUTE,
FETCH, LISTEN, MOVE, PREPARE, or UNLISTEN. Nor can it try to COMMIT or
ROLLBACK the current transaction or set up a SAVEPOINT or ROLLBACK TO
SAVEPOINT. Nor can it use SET or set_config() to change a parameter
value.
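
One way to picture the partial sandbox is as a deny-list over command tags. The Python sketch below is illustrative only; the set simply collects the commands named above, and the helper names are made up. It does not cover function calls such as set_config(), which would need their own checks.

```python
# Commands listed above as modifying session or transaction state.
SESSION_STATE_COMMANDS = frozenset({
    "CLOSE", "DEALLOCATE", "DECLARE", "DISCARD", "EXECUTE", "FETCH",
    "LISTEN", "MOVE", "PREPARE", "UNLISTEN",
    "COMMIT", "ROLLBACK", "SAVEPOINT", "ROLLBACK TO SAVEPOINT",
    "SET",
})


def normalize(tag: str) -> str:
    """Upper-case a command tag and collapse runs of whitespace."""
    return " ".join(tag.upper().split())


def allowed_when_partially_sandboxed(tag: str) -> bool:
    """Return True if the command is not on the deny-list."""
    return normalize(tag) not in SESSION_STATE_COMMANDS


print(allowed_when_partially_sandboxed("INSERT"))
print(allowed_when_partially_sandboxed("listen"))
print(allowed_when_partially_sandboxed("rollback  to savepoint"))
```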

With those definitions in hand, I think it's possible to propose a
meaningful security model:

Rule #1: If the current user does not trust the provider, the code is
fully sandboxed.
Rule #2: If the session user does not trust the provider either of the
currently-running code or of any other code that's still on the call
stack, the code is partially sandboxed.
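
The two rules can be written down as a small decision function. This is a sketch under stated assumptions: users are plain strings, trust is a set of (truster, trusted) pairs plus self-trust, and the call stack is given as a list of providers with the currently-running code last. None of these names exist in PostgreSQL.

```python
from enum import Enum
from typing import List, Set, Tuple


class Sandbox(Enum):
    NONE = "none"
    PARTIAL = "partial"
    FULL = "full"


def sandbox_level(current_user: str, session_user: str,
                  providers: List[str],
                  trust: Set[Tuple[str, str]]) -> Sandbox:
    """Apply rule #1, then rule #2, to the code at the top of the stack."""
    def trusts(a: str, b: str) -> bool:
        return a == b or (a, b) in trust

    # Rule #1: the current user must trust the provider of the running code.
    if not trusts(current_user, providers[-1]):
        return Sandbox.FULL
    # Rule #2: the session user must trust every provider on the stack.
    if any(not trusts(session_user, p) for p in providers):
        return Sandbox.PARTIAL
    return Sandbox.NONE


no_trust: Set[Tuple[str, str]] = set()
# Bob inserts into Alice's table; her default expression runs as Bob.
print(sandbox_level("bob", "bob", ["bob", "alice"], no_trust))
# Same, but through a SECURITY DEFINER function, so it runs as Alice.
print(sandbox_level("alice", "bob", ["bob", "alice"], no_trust))
# Bob has declared that he trusts Alice.
print(sandbox_level("bob", "bob", ["bob", "alice"], {("bob", "alice")}))
```

Checking rule #1 first means the stricter restriction wins whenever both rules apply.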

Let's take a few examples. First, suppose Alice has a table and it has
some associated code for which the provider is always Alice. That is,
she may have default expressions or index expressions for which she is
necessarily the provider, and she may have triggers, but in this
example she owns the functions or procedures called by those triggers
and is thus the provider for those as well. Now, Bob, who does not
trust Alice, does something to Alice's table. The code might run as
Bob (by default) and then it will be fully sandboxed because of rule
#1. Or there might be a SECURITY DEFINER function or procedure
involved causing the code to run as Alice, in which case the code will
be partially sandboxed because of rule #2. I argue that Bob is pretty
safe here. Alice can't make any durable changes to Bob's session state
no matter what she does, and if she provides code that runs as Bob it
can only do innocuous things like calculating x+y or x || y or running
generate_series() or examining current_role. Yes, it could go into a
loop, but that doesn't compromise Bob's account: he can hit ^C or set
statement_timeout. If she provides code that runs as herself it can
make use of her privileges (but not Bob's) as long as it doesn't try
to touch the session state. So Bob is pretty safe.

Now, suppose instead that Bob has a table but some code that is
attached to it can call a function that is owned by Alice. In this
case, as long as everything on the call stack is provided by Bob,
there are no restrictions. But as soon as we enter Alice's function,
the code is fully sandboxed unless it arranges to switch to Alice's
permissions using SECURITY DEFINER, in which case it's still partially
sandboxed. Again, it's hard to see how Alice can get any leverage
here.

Finally, suppose Alice has a table and attaches a trigger to it that
calls a function provided by Charlie. Bob now does something to this
table that results in the execution of this trigger. If the current
user -- which will be either Alice or Bob depending on whether the
function is SECURITY DEFINER -- does not trust Charlie, the code
inside the trigger is going to run fully sandboxed because of rule #1.
But even if the current user does trust Charlie, the code inside the
trigger is still going to be partially sandboxed unless Bob trusts
BOTH Alice AND Charlie because of rule #2. This seems appropriate,
because in this situation, either Alice or Charlie could be trying to
fool Bob into taking some action he doesn't intend to take by
tinkering with his session.

In general if we have a great big call stack that involves calling a
whole bunch of functions either as SECURITY INVOKER or as SECURITY
DEFINER, changing the session state is blocked unless the session user
trusts the owners of all of those functions. And if we got to any of
those functions by means of code attached directly to tables, like an
index expression or default expression, changing the session state is
blocked unless the session user also trusts the owners of those
tables.
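
The trust rules in the examples above can be modelled in a few lines. This is only a sketch of one reading of rule #1 and rule #2 as they are described in this message, not code from any patch; `TrustMap`, `sandbox_level`, and the lowercase user names are hypothetical, and the assumption that every user trusts themselves is inferred from the statement that code provided entirely by Bob is unrestricted.

```python
from dataclasses import dataclass, field


@dataclass
class TrustMap:
    """Hypothetical record of which users each user has declared trust in."""
    trusted: dict = field(default_factory=dict)

    def trusts(self, user: str, other: str) -> bool:
        # Assumption: every user implicitly trusts themselves.
        return user == other or other in self.trusted.get(user, set())


def sandbox_level(trust, session_user, current_user, code_owner, providers):
    """Return "full", "partial", or "none" for code that is about to run.

    providers: owners of every function and table whose code is on the
    call stack, including code_owner.
    """
    # Rule #1 as read here: the current user must trust the code's owner.
    if not trust.trusts(current_user, code_owner):
        return "full"
    # Rule #2 as read here: the session user must trust every provider.
    if not all(trust.trusts(session_user, p) for p in providers):
        return "partial"
    return "none"


# Alice's table, trigger function owned by Charlie, Bob runs the DML.
stack = ["alice", "charlie"]
print(sandbox_level(TrustMap(), "bob", "bob", "charlie", stack))
print(sandbox_level(TrustMap({"bob": {"charlie"}}), "bob", "bob", "charlie", stack))
print(sandbox_level(TrustMap({"bob": {"alice", "charlie"}}), "bob", "bob", "charlie", stack))
```

The three calls print `full`, `partial`, and `none`, matching the Alice/Bob/Charlie trigger example: trusting Charlie alone lifts full sandboxing, but session state stays protected until Bob trusts both providers.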

I see a few obvious objections to this line of attack that someone
might raise, and I'd like to address them now. First, somebody might
argue that this is too hard to implement. I don't think so, because a
lot of things can be blocked from common locations. However, it will
be necessary to go through all the functions we ship and add checks in
a bunch of places to individual functions. That's a pain, but it's not
that different from what we've already done with PARALLEL { SAFE |
RESTRICTED | UNSAFE } or LEAKPROOF. Those weren't terribly enjoyable
exercises for me and I made some mistakes categorizing some things,
but the job got done and those mechanisms are accepted infrastructure
now. Second, somebody might argue that full sandboxing is such a
draconian set of restrictions that it will inconvenience users greatly
or that it's pointless to even allow anything to be executed or
something along those lines. I think that argument has some merit, but
I think the restrictions sound worse than they actually are in
context. For instance, have you ever written a default expression for
a column that would fail under full sandboxing? I wouldn't be
surprised if you have, but I also bet it's a fairly small percentage
of cases. I think a lot of things that people want to do as a
practical matter will be totally fine. I can think of exceptions, most
obviously reading from a random-number generator that has a
user-controllable seed, which technically qualifies as tinkering with
the session state. But a lot of things are going to work fine, and the
things that do fall afoul of a mechanism like this probably deserve
some study and examination. If you're writing index expressions that
do anything more than simple calculation, it's probably fine for the
system to raise an eyebrow about that. Even if they do something as
simple as reading from another table, that's not necessarily going to
dump and restore properly, even if it's secure, because the table
ordering dependencies won't be clear to pg_dump.

And that brings me to another point, which is that we might think of
sandboxing some operations, either by default or unconditionally, for
reasons other than trust or the lack of it. There are a lot of things
that you COULD do in an index expression that you really SHOULD NOT
do. As mentioned, even reading a table is pretty sketchy, but should a
function called from an index expression ever be allowed to execute
DDL? Is it reasonable if such a function wants to execute CREATE
TABLE? Even a temporary table is dubious, and a non-temporary table is
really dubious. What if such a function wants to ALTER ROLE ...
SUPERUSER? I think that's bonkers and should almost certainly be
categorically denied. Probably someone is trying to hack something,
and even if they aren't, it's still nuts. So I would argue that in a
context like an index expression, some amount of sandboxing -- not
necessarily corresponding to either of the levels described above --
is probably a good idea, not based on the relationship between
whatever users are involved, but based rather on the context. There's
room for a lot of bikeshedding here and I don't think this kind of
thing is necessarily the top priority, but I think it's worth thinking
about.

Finally, I'd like to note that partial sandboxing can be viewed as a
strengthening of restrictions that we already have in the form of
SECURITY_RESTRICTED_OPERATION. I can't claim to be an authority on the
evolution of that flag, but I think that up to this point the general
philosophy has been to insert the smallest possible plug in the dike.
When a definite security problem is discovered, somebody tries to
block just enough stuff to make it not demonstrably insecure. However,
I feel that the surface area for things to go wrong is rather large,
and we'd be better off with a more comprehensive series of
restrictions. We likely have some security issues that haven't been
found yet, and even something we wouldn't classify as a security
vulnerability can still be a pitfall for the unwary. I imagine that
SECURITY_RESTRICTED_OPERATION might end up getting subsumed into what
I'm here calling partial sandboxing, but I'm not quite sure about that
because right now this is just a theoretical description of a system,
not something for which I've written any code.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: sandboxing untrusted code

From
Jeff Davis
Date:
On Thu, 2023-08-31 at 11:25 -0400, Robert Haas wrote:
> As a refresher, the scenario I'm talking about is any one in which
> one
> user, who I'll call Bob, does something that results in executing
> code
> provided by another user, who I'll call Alice. The most obvious way
> that this can happen is if Bob performs some operation that targets a
> table owned by Alice. That operation might be DML, like an INSERT or
> UPDATE; or it might be some other kind of maintenance command that
> can
> cause code execution, like REINDEX, which can evaluate index
> expressions.

REINDEX executes index expressions as the table owner. (You are correct
that INSERT executes index expressions as the inserting user.)

>  The code being executed might be run either as Alice or
> as Bob, depending on how it's been attached to the table and what
> operation is being performed and maybe whether some function or
> procedure that might contain it is SECURITY INVOKER or SECURITY
> DEFINER. Regardless of the details, our concern is that Alice's code
> might do something that Bob does not like. This is a particularly
> lively concern if the code happens to be running with the privileges
> of Bob, because then Alice might try to do something like access
> objects for which Bob has permissions and Alice does not.

Agreed.


> 1. Compute stuff. There's no restriction on the permissible amount of
> compute; if you call untrusted code, nothing prevents it from running
> forever.
> 2. Call other code. This may be done by a function call or a command
> such as CALL or DO, all subject to the usual permissions checks but
> no
> further restrictions.
> 3. Access the current session state, without modifying it. For
> example, executing SHOW or current_setting() is fine.
> 4. Transiently modify the current session state in ways that are
> necessarily reversed before returning to the caller. For example, an
> EXCEPTION block or a configuration change driven by proconfig is
> fine.
> 5. Produce messages at any log level. This includes any kind of
> ERROR.

Nothing in that list really exercises privileges (except #2?). If those
are the allowed set of things a sandboxed function can do, is a
sandboxed function equivalent to a function running with no privileges
at all?

Please explain #2 in a bit more detail. Whose EXECUTE privileges would
be used (I assume it depends on SECURITY DEFINER/INVOKER)? Would the
called code also be sandboxed?

> In general if we have a great big call stack that involves calling a
> whole bunch of functions either as SECURITY INVOKER or as SECURITY
> DEFINER, changing the session state is blocked unless the session
> user
> trusts the owners of all of those functions.

That clarifies the earlier mechanics you described, thank you.

>  And if we got to any of
> those functions by means of code attached directly to tables, like an
> index expression or default expression, changing the session state is
> blocked unless the session user also trusts the owners of those
> tables.
>
> I see a few obvious objections to this line of attack that someone
> might raise, and I'd like to address them now. First, somebody might
> argue that this is too hard to implement.

That seems to be a response to my question above: "Isn't that a hard
problem; maybe impossible?".

Let me qualify that: if the function is written by Alice, and the code
is able to really exercise the privileges of the caller (Bob), then it
seems really hard to make it safe for the caller.

If the function is sandboxed such that it's not really using Bob's
privileges (it's just nominally running as Bob) that's a much more
tractable problem.

I believe there's some nuance to your proposal where some of Bob's
privileges could be used safely, but I'm not clear on exactly which
ones. The difficulty of the implementation would depend on these
details.

> Second, somebody might argue that full sandboxing is such a
> draconian set of restrictions that it will inconvenience users
> greatly
> or that it's pointless to even allow anything to be executed or
> something along those lines. I think that argument has some merit,
> but
> I think the restrictions sound worse than they actually are in
> context.

+100. We should make typical cases easy to secure.

> Even if they do something as
> simple as reading from another table, that's not necessarily going to
> dump and restore properly, even if it's secure, because the table
> ordering dependencies won't be clear to pg_dump.

A good point. A lot of these extraordinary cases are either incredibly
fragile or already broken.

> What if such a function wants to ALTER ROLE ...
> SUPERUSER? I think that's bonkers and should almost certainly be
> categorically denied.

...also agreed, a lot of these extraordinary cases are really just
surface area for attack with no legitimate use case.




One complaint (not an objection, because I don't think we have
the luxury of objecting to viable proposals when it comes to improving
our security model):

Although your proposal sounds like a good security backstop, it feels
like it's missing the point that there are different _kinds_ of
functions. We already have the IMMUTABLE marker and we already have
runtime checks to make sure that immutable functions can't CREATE
TABLE; why not build on that mechanism or create new markers?

Declarative markers are nice because they are easier to test: if Alice
writes a function and declares it as IMMUTABLE, she can test it before
even using it in an index expression and it will fail whatever runtime
protections IMMUTABLE offers. If we instead base it on the session user
and call stack, Alice wouldn't be able to test it effectively, only Bob
can test it.

In other words, there are some consistency aspects to how we run code
that go beyond pure security. A function author typically has
assumptions about the execution context of a function (the user, the
sandbox restrictions, the search_path, etc.) and guiding users towards
a consistent execution context in typical cases is a good thing.

Regards,
    Jeff Davis




Re: sandboxing untrusted code

From
Robert Haas
Date:
On Thu, Aug 31, 2023 at 8:57 PM Jeff Davis <pgsql@j-davis.com> wrote:
> > As a refresher, the scenario I'm talking about is any one in which
> > one
> > user, who I'll call Bob, does something that results in executing
> > code
> > provided by another user, who I'll call Alice. The most obvious way
> > that this can happen is if Bob performs some operation that targets a
> > table owned by Alice. That operation might be DML, like an INSERT or
> > UPDATE; or it might be some other kind of maintenance command that
> > can
> > cause code execution, like REINDEX, which can evaluate index
> > expressions.
>
> REINDEX executes index expressions as the table owner. (You are correct
> that INSERT executes index expressions as the inserting user.)

I was speaking here of who provided the code, rather than whose
credentials were used to execute it. The index expressions are
provided by the table owner no matter who evaluates them in a
particular case.

> > 1. Compute stuff. There's no restriction on the permissible amount of
> > compute; if you call untrusted code, nothing prevents it from running
> > forever.
> > 2. Call other code. This may be done by a function call or a command
> > such as CALL or DO, all subject to the usual permissions checks but
> > no
> > further restrictions.
> > 3. Access the current session state, without modifying it. For
> > example, executing SHOW or current_setting() is fine.
> > 4. Transiently modify the current session state in ways that are
> > necessarily reversed before returning to the caller. For example, an
> > EXCEPTION block or a configuration change driven by proconfig is
> > fine.
> > 5. Produce messages at any log level. This includes any kind of
> > ERROR.
>
> Nothing in that list really exercises privileges (except #2?). If those
> are the allowed set of things a sandboxed function can do, is a
> sandboxed function equivalent to a function running with no privileges
> at all?

Close but not quite. As you say, #2 does exercise privileges. Also,
even if no privileges are exercised, you could still refer to
CURRENT_ROLE, and I think you could also call a function like
has_table_privilege.  Your identity hasn't changed, but you're
restricted from exercising some of your privileges. Really, you still
have them, but they're just not available to you in that situation.

> Please explain #2 in a bit more detail. Whose EXECUTE privileges would
> be used (I assume it depends on SECURITY DEFINER/INVOKER)? Would the
> called code also be sandboxed?

Nothing in this proposed system has any impact on whose privileges are
used in any particular context, so any privilege checks conducted
pursuant to #2 are performed as the same user who would perform them
today. Whether the called code would be sandboxed depends on how the
rules I articulated in the previous email would apply to it. Since
those rules depend on the user IDs, if the called code is owned by the
same user as the calling code and is SECURITY INVOKER, then those
rules apply in the same way and the same level of sandboxing will
apply. But if the called function is owned by a different user or is
SECURITY DEFINER, then the rules might apply differently to the called
code than the calling code. It's possible this isn't quite good enough
and that some adjustments to the rules are necessary; I'm not sure.

> Let me qualify that: if the function is written by Alice, and the code
> is able to really exercise the privileges of the caller (Bob), then it
> seems really hard to make it safe for the caller.
>
> If the function is sandboxed such that it's not really using Bob's
> privileges (it's just nominally running as Bob) that's a much more
> tractable problem.

Agreed.

> One complaint (not an objection, because I don't think we have
> the luxury of objecting to viable proposals when it comes to improving
> our security model):
>
> Although your proposal sounds like a good security backstop, it feels
> like it's missing the point that there are different _kinds_ of
> functions. We already have the IMMUTABLE marker and we already have
> runtime checks to make sure that immutable functions can't CREATE
> TABLE; why not build on that mechanism or create new markers?

I haven't ruled that out completely, but there's some subtlety here
that doesn't exist in those other cases. If the owner of a function
marks it wrongly in terms of volatility or parallel safety, then they
might make queries run more slowly than they should, or they might
make queries return wrong answers, or error out, or even end up with
messed-up indexes. But none of that threatens the stability of the
system in any very deep way, or the security of the system. It's no
different than putting a CHECK (false) constraint on a table, or
something like that: it might make the system not work, and if that
happens, then you can fix it. Here, however, we can't trust the owners
of functions to label those functions accurately. It won't do for
Alice to create a function and then apply the NICE_AND_SAFE marker to
it. That defeats the whole point. We need to know the real behavior of
Alice's function, not the behavior that Alice says it has.

Now, in the case of a C function, things are a bit different. We can't
inspect the generated machine code and know what the function does,
because of that pesky halting problem. We could handle that either
through function labeling, since only superusers can create C
functions, or by putting checks directly in the C code. I was somewhat
inclined toward the latter approach, but I'm not completely sure yet
what makes sense. Thinking about your comments here made me realize
that there are other procedural languages to worry about, too, like
PL/python or PL/perl or PL/sh. Whatever we do for the C functions will
have to be extended to those cases somehow as well. If we label
functions, then we'll have to allow superusers only to label functions
in these languages as well and make the default label "this is
unsafe." If we put checks in the C code then I guess any given PL
needs to certify that it knows about sandboxing or have all of its
functions treated as unsafe. I think doing this at the C level would
be better, strictly speaking, because it's more granular. Imagine a
function that only conditionally does some prohibited action - it can
be allowed to work in the cases where it does not attempt the
prohibited operation, and blocked when it does. Labeling is
all-or-nothing.
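
To make the granularity argument concrete, here is a minimal sketch contrasting a check placed at the prohibited operation itself with an all-or-nothing label. Everything in it (`Session`, `write_file`, `clamp`, `call_with_label`, and the log path, which is only recorded in memory) is hypothetical and stands in for backend internals; it is not a PostgreSQL API.

```python
class SandboxViolation(Exception):
    """Raised when sandboxed code attempts a prohibited operation."""


class Session:
    """Hypothetical stand-in for backend state; not a PostgreSQL API."""

    def __init__(self, sandboxed: bool):
        self.sandboxed = sandboxed
        self.files_written = []  # records requests only, touches no real file

    def write_file(self, path: str, data: str) -> None:
        # Check placed at the operation itself, as with checks in the C code.
        if self.sandboxed:
            raise SandboxViolation("writing files is prohibited in a sandbox")
        self.files_written.append((path, data))


def clamp(session: Session, x: int) -> int:
    """Writes a file only for negative input; otherwise pure computation."""
    if x < 0:
        session.write_file("/tmp/clamp.log", f"negative input {x}")
        return 0
    return x


def call_with_label(session, func, labeled_safe, *args):
    # Labeling is all-or-nothing: an unlabeled function never runs sandboxed.
    if session.sandboxed and not labeled_safe:
        raise SandboxViolation(f"{func.__name__} is not labeled safe")
    return func(session, *args)
```

With `sandbox = Session(sandboxed=True)`, the call `clamp(sandbox, 5)` succeeds because the prohibited branch is never reached, while `clamp(sandbox, -1)` raises `SandboxViolation`. Under labeling, `call_with_label(sandbox, clamp, False, 5)` is refused even though that particular call would have been harmless.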

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: sandboxing untrusted code

From
Jeff Davis
Date:
On Fri, 2023-09-01 at 09:12 -0400, Robert Haas wrote:
> Close but not quite. As you say, #2 does exercise privileges. Also,
> even if no privileges are exercised, you could still refer to
> CURRENT_ROLE, and I think you could also call a function like
> has_table_privilege.  Your identity hasn't changed, but you're
> restricted from exercising some of your privileges. Really, you still
> have them, but they're just not available to you in that situation.

Which privileges are available in a sandboxed environment, exactly? Is
it kind of like masking away all privileges except EXECUTE, or are
other privileges available, like SELECT?

And the distinction that you are drawing between having the privileges
but them (mostly) not being available, versus not having the privileges
at all, is fairly subtle. Some examples showing why that distinction is
important would be helpful.

>
> > Although your proposal sounds like a good security backstop, it
> > feels
> > like it's missing the point that there are different _kinds_ of
> > functions. We already have the IMMUTABLE marker and we already have
> > runtime checks to make sure that immutable functions can't CREATE
> > TABLE; why not build on that mechanism or create new markers?

...

> Here, however, we can't trust the owners
> of functions to label those functions accurately.

Of course, but observe:

  =# CREATE FUNCTION f(i INT) RETURNS INT IMMUTABLE LANGUAGE plpgsql AS
  $$
  BEGIN
    CREATE TABLE x(t TEXT);
    RETURN 42 + i;
  END;
  $$;

  =# SELECT f(2);
  ERROR:  CREATE TABLE is not allowed in a non-volatile function
  CONTEXT:  SQL statement "CREATE TABLE x(t TEXT)"
  PL/pgSQL function f(integer) line 3 at SQL statement

The function f() is called at the top level, not as part of any index
expression or other special context. But it fails to CREATE TABLE
simply because that's not an allowed thing for an IMMUTABLE function to
do. That tells me right away that my function isn't going to work, and
I can rewrite it rather than waiting for some other user to say that it
failed when run in a sandbox.

>  It won't do for
> Alice to create a function and then apply the NICE_AND_SAFE marker to
> it.

You can if you always execute NICE_AND_SAFE functions in a sandbox. The
difference is that it's always executed in a sandbox, rather than
sometimes, so it will fail consistently.

> Now, in the case of a C function, things are a bit different. We
> can't
> inspect the generated machine code and know what the function does,
> because of that pesky halting problem. We could handle that either
> through function labeling, since only superusers can create C
> functions, or by putting checks directly in the C code. I was
> somewhat
> inclined toward the latter approach, but I'm not completely sure yet
> what makes sense. Thinking about your comments here made me realize
> that there are other procedural languages to worry about, too, like
> PL/python or PL/perl or PL/sh. Whatever we do for the C functions
> will
> have to be extended to those cases somehow as well. If we label
> functions, then we'll have to allow superusers only to label
> functions
> in these languages as well and make the default label "this is
> unsafe." If we put checks in the C code then I guess any given PL
> needs to certify that it knows about sandboxing or have all of its
> functions treated as unsafe. I think doing this at the C level would
> be better, strictly speaking, because it's more granular. Imagine a
> function that only conditionally does some prohibited action - it can
> be allowed to work in the cases where it does not attempt the
> prohibited operation, and blocked when it does. Labeling is
> all-or-nothing.

Here I'm getting a little lost in what you mean by "prohibited
operation". Most languages mostly use SPI, and whatever sandboxing
checks you do should work there, too. Are you talking about completely
separate side effects like writing files or opening sockets?

Regards,
    Jeff Davis




Re: sandboxing untrusted code

From
Robert Haas
Date:
On Fri, Sep 1, 2023 at 5:27 PM Jeff Davis <pgsql@j-davis.com> wrote:
> Which privileges are available in a sandboxed environment, exactly? Is
> it kind of like masking away all privileges except EXECUTE, or are
> other privileges available, like SELECT?

I think I've more or less answered this already -- fully sandboxed
code can't make reference to external data sources, from which it
follows that it can't exercise SELECT (and most other privileges).

> And the distinction that you are drawing between having the privileges
> but them (mostly) not being available, versus not having the privileges
> at all, is fairly subtle. Some examples showing why that distinction is
> important would be helpful.

I view it like this: when Bob tries to insert or update or delete
Alice's table, and Alice has some code attached to it, Alice is
effectively asking Bob to execute that code with his own privileges.
In general, I think we can reasonably expect that Bob WILL be willing
to do this: if he didn't want to modify Alice's table, he
wouldn't have executed a DML statement against it, and executing the
code that Alice has attached to that table is a precondition of being
allowed to perform that modification. It's Alice's table and she gets
to set the rules. However, Bob is also allowed to protect himself. If
he's running Alice's code and it wants to do something with which Bob
isn't comfortable, he can change his mind and refuse to execute it
after all.

I always find it helpful to consider real world examples with similar
characteristics. Let's say that Bob is renting a VRBO from Alice.
Alice leaves behind, in the VRBO, a set of rules which Bob must follow
as a condition of being allowed to rent the VRBO. Those rules include
things that Bob must do at checkout time, like washing all of his
dishes. As a matter of routine, Bob will follow Alice's checkout
instructions. But if Alice includes in the checkout instructions
"Leave your driver's license and social security card on the dining
room table after checkout, plus a record of all of your bank account
numbers," the security systems in Bob's brain should activate and
prevent those instructions from getting followed.

A major difference between that situation (a short term rental of
someone else's house) and the in-database case (a DML statement
against someone else's table) is that when Bob is following Alice's
VRBO checkout instructions, he knows exactly what actions he is
performing. When he executes a DML statement against Alice's table,
Bob the human being does not actually know what Alice's triggers or
index expressions or whatever are causing him to do. As I see it, the
purpose of this system is to prevent Bob from doing things that he
didn't intend to do. He's cool with adding 2 and 2 or concatenating
some strings or whatever, but probably not with reading data and
handing it over to Alice, and definitely not handing all of his
privileges over to Alice. Full sandboxing has to block that kind of
stuff, and it needs to do so precisely because *Bob would not allow
those operations if he knew about them*.

Now, it is not going to be possible to get that perfectly right.
PostgreSQL cannot know the state of Bob's human mind, and it cannot
be expected to judge with perfect accuracy what actions Bob would or
would not approve. However, it can make some conservative guesses. If
Bob wants to override those guesses by saying "I trust Alice, do
whatever she says" that's fine. This system attempts to prevent Bob
from accidentally giving away his permissions to an adversary who has
buried malicious code in some unexpected place. But, unlike the
regular permissions system, it is not there to prevent Bob from doing
things that he isn't allowed to do. It's there to prevent Bob from
doing things that he didn't intend to do.

And that's where I see the distinction between *having* permissions
and those permissions being *available* in a particular context. Bob
has permission to give Alice an extra $1000 or whatever if he has the
money and wishes to do so. But those permissions are probably not
*available* in the context where Bob is following a set of
instructions from Alice. If Bob's brain spontaneously generated the
idea "let's give Alice a $1000 tip because her vacation home was
absolutely amazing and I am quite rich," he would probably go right
ahead and act on that idea and that is completely fine. But when Bob
encounters that same idea *on a list of instructions provided by
Alice*, the same operation is blocked *because it came from Alice*. If
the list of instructions from Alice said to sweep the parlor, Bob
would just go ahead and do it. Alice has permission to induce Bob to
sweep the parlor, but does not have permission to induce Bob to give
her a bunch of extra money.

And in the database context, I think it's fine if Alice induces Bob to
compute some values or look at the value of work_mem, but I don't
think it's OK if Alice induces Bob to make her a superuser. Unless Bob
declares that he trusts Alice completely, in which case it's fine if
she does that.
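
The difference between *having* a privilege and that privilege being *available* can be sketched as two separate questions asked of the same role. This is an illustrative model only: `Role`, `ExecutionContext`, and the rule that full sandboxing leaves just EXECUTE usable are simplifying assumptions drawn from this discussion, not a description of any implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    held: set = field(default_factory=set)  # e.g. {("SELECT", "accounts")}


@dataclass
class ExecutionContext:
    role: Role
    fully_sandboxed: bool = False

    def has_privilege(self, priv: str, obj: str) -> bool:
        # Inspecting privileges reports what the role holds, sandboxed or not.
        return (priv, obj) in self.role.held

    def can_exercise(self, priv: str, obj: str) -> bool:
        # Assumption: full sandboxing leaves only EXECUTE available.
        if self.fully_sandboxed and priv != "EXECUTE":
            return False
        return self.has_privilege(priv, obj)


bob = Role("bob", {("SELECT", "accounts"), ("EXECUTE", "f")})
ctx = ExecutionContext(bob, fully_sandboxed=True)
print(ctx.has_privilege("SELECT", "accounts"))  # identity and grants unchanged
print(ctx.can_exercise("SELECT", "accounts"))   # but not usable in this context
```

In this model the sandboxed context still answers "does Bob hold SELECT?" truthfully, yet refuses to act on it; outside the sandbox the same role can exercise the privilege as usual.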

> Here I'm getting a little lost in what you mean by "prohibited
> operation". Most languages mostly use SPI, and whatever sandboxing
> checks you do should work there, too. Are you talking about completely
> separate side effects like writing files or opening sockets?

Yeah.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: sandboxing untrusted code

From
Jeff Davis
Date:
On Tue, 2023-09-05 at 12:25 -0400, Robert Haas wrote:
> I think I've more or less answered this already -- fully sandboxed
> code can't make reference to external data sources, from which it
> follows that it can't exercise SELECT (and most other privileges).

By what principle are we allowing EXECUTE but not SELECT? In theory, at
least, a function could hold secrets in the code, e.g.:

  CREATE FUNCTION answer_to_ultimate_question() RETURNS INT
    LANGUAGE plpgsql AS $$ BEGIN RETURN 42; END; $$;

Obviously that's a bad idea in plpgsql, because anyone can just read
pg_proc. And maybe C would be handled differently somehow, so maybe it
all works.

But it feels like something is wrong there: it's fine to execute the
answer_to_ultimate_question() not because Bob has an EXECUTE privilege,
but because the sandbox renders any security concerns with *anyone*
executing the function moot. So why bother checking the EXECUTE
privilege at all?

> And that's where I see the distinction between *having* permissions
> and those permissions being *available* in a particular context. Bob
> has permission to give Alice an extra $1000 or whatever if he has the
> money and wishes to do so. But those permissions are probably not
> *available* in the context where Bob is following a set of
> instructions from Alice. If Bob's brain spontaneously generated the
> idea "let's give Alice a $1000 tip because her vacation home was
> absolutely amazing and I am quite rich," he would probably go right
> ahead and act on that idea and that is completely fine. But when Bob
> encounters that same idea *on a list of instructions provided by
> Alice*, the same operation is blocked *because it came from Alice*.
> If
> the list of instructions from Alice said to sweep the parlor, Bob
> would just go ahead and do it. Alice has permission to induce Bob to
> sweep the parlor, but does not have permission to induce Bob to give
> her a bunch of extra money.

In the real world example, sweeping the parlor has a (slight) cost to
the person doing it and it (slightly) matters who does it. In Postgres,
we don't do any CPU accounting per user, and it's all executed under
the same PID, so it really doesn't matter.

So it raises the question: why would we not simply say that this list
of instructions should be executed by the person who wrote it, in which
case the existing privilege mechanism would work just fine?

> And in the database context, I think it's fine if Alice induces Bob
> to
> compute some values or look at the value of work_mem, but I don't
> think it's OK if Alice induces Bob to make her a superuser.

If all the code can do is compute some values or look at work_mem,
perhaps the function needs no privileges at all (or some minimal
privileges)?

You explained conceptually where you're coming from, but I still don't
see much of a practical difference between having privileges but being
in a context where they won't be used, and dropping the privileges
entirely during that time. I suppose the answer is that the EXECUTE
privilege will still be used, but as I said above, that doesn't
entirely make sense to me, either.

Regards,
    Jeff Davis