Thread: RLS makes COPY TO process child tables

RLS makes COPY TO process child tables

From
Antonin Houska
Date:
While working on [1] I noticed that if RLS gets enabled, the COPY TO command
includes the contents of child table into the result, although the
documentation says it should not:

    "COPY TO can be used only with plain tables, not views, and does not
    copy rows from child tables or child partitions. For example, COPY
    table TO copies the same rows as SELECT * FROM ONLY table. The syntax
    COPY (SELECT * FROM table) TO ... can be used to dump all of the rows
    in an inheritance hierarchy, partitioned table, or view."

A test case is attached (rls.sql) as well as fix proposal
(copy_rls_no_inh.diff).

[1] https://commitfest.postgresql.org/41/3641/

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

create table a(i int);
insert into a values (1);

create table a1() inherits(a);
insert into a1 values (1);

-- Only the parent table is copied, as the documentation claims.
copy a to stdout;

alter table a enable row level security;
create role u;
create policy pol_perm on a as permissive for select to u using (true);
grant select on table a to u;
set role u;

-- Both "a" and "a1" appears in the output.
copy a to stdout;
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index e34f583ea7..3b8c25dadd 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -249,6 +249,7 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
             from = makeRangeVar(get_namespace_name(RelationGetNamespace(rel)),
                                 pstrdup(RelationGetRelationName(rel)),
                                 -1);
+            from->inh = false;
 
             /* Build query */
             select = makeNode(SelectStmt);

Re: RLS makes COPY TO process child tables

From
Yugo NAGATA
Date:
On Wed, 01 Feb 2023 12:45:57 +0100
Antonin Houska <ah@cybertec.at> wrote:

> While working on [1] I noticed that if RLS gets enabled, the COPY TO command
> includes the contents of child table into the result, although the
> documentation says it should not:
> 
>     "COPY TO can be used only with plain tables, not views, and does not
>     copy rows from child tables or child partitions. For example, COPY
>     table TO copies the same rows as SELECT * FROM ONLY table. The syntax
>     COPY (SELECT * FROM table) TO ... can be used to dump all of the rows
>     in an inheritance hierarchy, partitioned table, or view."
> 
> A test case is attached (rls.sql) as well as fix proposal
> (copy_rls_no_inh.diff).

I think this is a bug because the current behaviour is different from
the documentation. 

When RLS is enabled on a table in `COPY ... TO ...`, the query is converted
to `COPY (SELECT * FROM ...) TO ...` to allow the rewriter to add in RLS
clauses. This causes to dump the rows of child tables.

The patch fixes this by setting "inh" of the table in the converted query
to false. This seems reasonable and actually fixes the problem.

However, I think we would want a comment on the added line. Also, the
attached test should be placed in the regression test.

Regards,
Yugo Nagata

> 
> [1] https://commitfest.postgresql.org/41/3641/
> 
> -- 
> Antonin Houska
> Web: https://www.cybertec-postgresql.com
> 


-- 
Yugo NAGATA <nagata@sraoss.co.jp>



Re: RLS makes COPY TO process child tables

From
Tom Lane
Date:
Yugo NAGATA <nagata@sraoss.co.jp> writes:
> Antonin Houska <ah@cybertec.at> wrote:
>> While working on [1] I noticed that if RLS gets enabled, the COPY TO command
>> includes the contents of child table into the result, although the
>> documentation says it should not:

> I think this is a bug because the current behaviour is different from
> the documentation.

I agree, it shouldn't do that.

> When RLS is enabled on a table in `COPY ... TO ...`, the query is converted
> to `COPY (SELECT * FROM ...) TO ...` to allow the rewriter to add in RLS
> clauses. This causes to dump the rows of child tables.

Do we actually say that in so many words, either in the code or docs?
If so, it ought to read `COPY (SELECT * FROM ONLY ...) TO ...`
instead.  (If we say that in the docs, then arguably the code *does*
conform to the docs.  But I don't see it in the COPY ref page at least.)

            regards, tom lane



Re: RLS makes COPY TO process child tables

From
Yugo NAGATA
Date:
On Wed, 01 Feb 2023 11:47:23 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Yugo NAGATA <nagata@sraoss.co.jp> writes:
> > Antonin Houska <ah@cybertec.at> wrote:
> >> While working on [1] I noticed that if RLS gets enabled, the COPY TO command
> >> includes the contents of child table into the result, although the
> >> documentation says it should not:
> 
> > I think this is a bug because the current behaviour is different from
> > the documentation. 
> 
> I agree, it shouldn't do that.
> 
> > When RLS is enabled on a table in `COPY ... TO ...`, the query is converted
> > to `COPY (SELECT * FROM ...) TO ...` to allow the rewriter to add in RLS
> > clauses. This causes to dump the rows of child tables.
> 
> Do we actually say that in so many words, either in the code or docs?
> If so, it ought to read `COPY (SELECT * FROM ONLY ...) TO ...`
> instead.  (If we say that in the docs, then arguably the code *does*
> conform to the docs.  But I don't see it in the COPY ref page at least.)

The documentation do not say that, but the current code actually do that.
Also, there is the following comment in BeginCopyTo().

         * With row-level security and a user using "COPY relation TO", we
         * have to convert the "COPY relation TO" to a query-based COPY (eg:
         * "COPY (SELECT * FROM relation) TO"), to allow the rewriter to add
         * in any RLS clauses.

Maybe, it is be better to change the description in the comment to
"COPY (SELECT * FROM ONLY relation) TO" when fixing the bug.

Regards,
Yugo Nagata

-- 
Yugo NAGATA <nagata@sraoss.co.jp>



Re: RLS makes COPY TO process child tables

From
Antonin Houska
Date:
Yugo NAGATA <nagata@sraoss.co.jp> wrote:

> On Wed, 01 Feb 2023 12:45:57 +0100
> Antonin Houska <ah@cybertec.at> wrote:
> 
> > While working on [1] I noticed that if RLS gets enabled, the COPY TO command
> > includes the contents of child table into the result, although the
> > documentation says it should not:
> > 
> >     "COPY TO can be used only with plain tables, not views, and does not
> >     copy rows from child tables or child partitions. For example, COPY
> >     table TO copies the same rows as SELECT * FROM ONLY table. The syntax
> >     COPY (SELECT * FROM table) TO ... can be used to dump all of the rows
> >     in an inheritance hierarchy, partitioned table, or view."
> > 
> > A test case is attached (rls.sql) as well as fix proposal
> > (copy_rls_no_inh.diff).
> 
> I think this is a bug because the current behaviour is different from
> the documentation. 
> 
> When RLS is enabled on a table in `COPY ... TO ...`, the query is converted
> to `COPY (SELECT * FROM ...) TO ...` to allow the rewriter to add in RLS
> clauses. This causes to dump the rows of child tables.
> 
> The patch fixes this by setting "inh" of the table in the converted query
> to false. This seems reasonable and actually fixes the problem.
> 
> However, I think we would want a comment on the added line.

A short comment added, see the new patch version.

> Also, the attached test should be placed in the regression test.

Hm, I'm not sure it's necessary. It would effectively test whether the 'inh'
field works, but if it didn't, many other tests would fail. I discovered the
bug by reading the code, so I wanted to demonstrate (also to myself) that it
causes incorrect behavior from user perspective. That was the purpose of the
test.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

From 3a4acb29901a7b53eb32767e30bdfce74b80df8b Mon Sep 17 00:00:00 2001
From: Antonin Houska <ah@cybertec.at>
Date: Thu, 2 Feb 2023 07:31:32 +0100
Subject: [PATCH] Make sure that COPY TO does not process child tables.

It's expected (and documented) that the child tables are not copied, however
the query generated for RLS didn't meet this expectation so far.
---
 src/backend/commands/copy.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index e34f583ea7..539fb682c2 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -250,6 +250,9 @@ DoCopy(ParseState *pstate, const CopyStmt *stmt,
                                 pstrdup(RelationGetRelationName(rel)),
                                 -1);
 
+            /* COPY TO should not process child tables. */
+            from->inh = false;
+
             /* Build query */
             select = makeNode(SelectStmt);
             select->targetList = targetList;
-- 
2.31.1


Re: RLS makes COPY TO process child tables

From
Stephen Frost
Date:
Greetings,

* Yugo NAGATA (nagata@sraoss.co.jp) wrote:
> On Wed, 01 Feb 2023 11:47:23 -0500
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> > Yugo NAGATA <nagata@sraoss.co.jp> writes:
> > > Antonin Houska <ah@cybertec.at> wrote:
> > >> While working on [1] I noticed that if RLS gets enabled, the COPY TO command
> > >> includes the contents of child table into the result, although the
> > >> documentation says it should not:
> >
> > > I think this is a bug because the current behaviour is different from
> > > the documentation.
> >
> > I agree, it shouldn't do that.

Yeah, I agree based on what the COPY table TO docs say should be
happening.

> > > When RLS is enabled on a table in `COPY ... TO ...`, the query is converted
> > > to `COPY (SELECT * FROM ...) TO ...` to allow the rewriter to add in RLS
> > > clauses. This causes to dump the rows of child tables.
> >
> > Do we actually say that in so many words, either in the code or docs?
> > If so, it ought to read `COPY (SELECT * FROM ONLY ...) TO ...`
> > instead.  (If we say that in the docs, then arguably the code *does*
> > conform to the docs.  But I don't see it in the COPY ref page at least.)
>
> The documentation do not say that, but the current code actually do that.
> Also, there is the following comment in BeginCopyTo().
>
>          * With row-level security and a user using "COPY relation TO", we
>          * have to convert the "COPY relation TO" to a query-based COPY (eg:
>          * "COPY (SELECT * FROM relation) TO"), to allow the rewriter to add
>          * in any RLS clauses.
>
> Maybe, it is be better to change the description in the comment to
> "COPY (SELECT * FROM ONLY relation) TO" when fixing the bug.

Yeah, that should also be updated.  Perhaps you'd send an updated patch
which includes fixing that too and maybe adds clarifying documentation
to COPY which mentions what happens when RLS is enabled on the relation?

I'm not sure if this makes good sense to back-patch.

Thanks,

Stephen

Attachment

Re: RLS makes COPY TO process child tables

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
>> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Yugo NAGATA <nagata@sraoss.co.jp> writes:
>>>> I think this is a bug because the current behaviour is different from
>>>> the documentation. 

>>> I agree, it shouldn't do that.

> Yeah, I agree based on what the COPY table TO docs say should be
> happening.

Yeah, the documentation is quite clear that child data is not included.

> I'm not sure if this makes good sense to back-patch.

I think we have to.  The alternative is to back-patch some very confusing
documentation changes saying "never mind all that if RLS is on".

            regards, tom lane



Re: RLS makes COPY TO process child tables

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> Yeah, that should also be updated.  Perhaps you'd send an updated patch
> which includes fixing that too and maybe adds clarifying documentation
> to COPY which mentions what happens when RLS is enabled on the relation?

I couldn't find anything in copy.sgml that seemed to need adjustment.
It already says

    If row-level security is enabled for the table, the relevant SELECT
    policies will apply to COPY table TO statements.

Together with the already-mentioned

    COPY table TO copies the same rows as SELECT * FROM ONLY table.

I'd say that the behavior is very clearly specified already.  We just
need to make the code do what the docs say.  So I pushed the patch
without any docs changes.  I did add a test case, because I don't
like back-patching without something that proves that the issue is
relevant to, and corrected in, each branch.

            regards, tom lane