Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: speedup COPY TO for partitioned table. |
Date | |
Msg-id | CAD21AoCUZ07wDreg97Nw-9U7kGmtYu9=S7+AY_GbnUP61jLEkw@mail.gmail.com Whole thread Raw |
In response to | Re: speedup COPY TO for partitioned table. (jian he <jian.universality@gmail.com>) |
List | pgsql-hackers |
On Fri, Oct 10, 2025 at 12:10 AM jian he <jian.universality@gmail.com> wrote: > > On Fri, Oct 10, 2025 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > --- > > > > + relation_name = get_rel_name(childreloid); > > > > + ereport(ERROR, > > > > + errcode(ERRCODE_WRONG_OBJECT_TYPE), > > > > + errmsg("cannot copy from foreign table > > > > \"%s\"", relation_name), > > > > + errdetail("Partition \"%s\" is a foreign > > > > table in the partitioned table \"%s\"", > > > > + relation_name, > > > > RelationGetRelationName(rel)), > > > > + errhint("Try the COPY (SELECT ...) TO variant.")); > > > > > > > > I think we don't need "the" in the error message. > > > > > > > > It's conventional to put all err*() macros in parentheses (i.e., > > > > "(errcode(), ...)", it's technically omittable though. > > > > > > > > > > https://www.postgresql.org/docs/current/error-message-reporting.html > > > QUOTE: > > > <<<<>>>>> > > > The extra parentheses were required before PostgreSQL version 12, but > > > are now optional. > > > Here is a more complex example: > > > ..... > > > <<<<>>>>> > > > > > > related commit: > > > https://git.postgresql.org/cgit/postgresql.git/commit/?id=e3a87b4991cc2d00b7a3082abb54c5f12baedfd1 > > > Less parenthesis is generally more readable, I think. > > > > Yes, but I think it's more consistent given that we use the > > parentheses in all other places in copyto.c. > > > > If you look at tablecmds.c, like ATExecSetNotNull, there are > parentheses and no parentheses cases. > Overall, I think less parentheses improves readability and makes the > code more future-proof. > Understood. > > > --- > > +-- Test COPY TO with a foreign table or when the foreign table is a partition > > +COPY async_p3 TO stdout; --error > > +ERROR: cannot copy from foreign table "async_p3" > > +HINT: Try the COPY (SELECT ...) TO variant. > > > > async_p3 is a foreign table so it seems not related to this patch. > > > I replied in > https://postgr.es/m/CACJufxGkkMtRUJEbLczRnWp7x2YWqu4r1gEJEv9Po1UPxS6kGQ%40mail.gmail.com > I kind of doubt anyone would submit a patch just to rewrite a coverage test for > the sake of coverage itself. While we're here, adding nearby coverage tests > should be fine? For me, it's perfectly fine to have patches just for improving the test coverage and I think we have had such patches ever. Given this patch expands the supported relation kind, I guess it makes sense to cover other cases as well in this patch (i.e., foreign tables and sequences) or to have a separate patch to increase the overall test coverage of copyto.c. > > i just found out I ignored the case when partitioned tables have RLS. > when exporting a partitioned table, > find_all_inheritors will sort the returned partition by oid. > in DoCopy, we can do the same: > make a SortBy node for SelectStmt->sortClause also mark the > RangeVar->inh as true. > OR > ereport(ERRCODE_FEATURE_NOT_SUPPORTED...) for partitioned tables with RLS. > > please see the change I made in DoCopy. Good catch. However, I guess adding a SortBy node with "tableoid" doesn't necessarily work in the same way as the 'COPY TO' using find_all_inheritors(): + /* + * To COPY data from multiple partitions, we rely on the order of + * the partitions' tableoids, which matches the order produced by + * find_all_inheritors. + */ The table list returned by find_all_inheritors() is deterministic, but it doesn't sort the whole list by their OIDs. If I understand correctly, it retrieves all descendant tables in a BFS order. For example, if I create the tables in the following sequence: create table p (i int) partition by list (i); create table c12 partition of p for values in (1, 2) partition by list (i); create table c12_1 partition of c12 for values in (1); create table c12_2 partition of c12 for values in (2); create table c3 partition of p for values in (3); insert into p values (1), (2), (3); alter table p enable row level security; create policy policy_p on p using (i > 0); create user test_user; grant select on table p to test_user; I got the result without RLS: copy p to stdout; 3 1 2 whereas I got the results with RLS: copy p to stdout; 1 2 3 I think that adding SortBy doesn't help more than making the results deterministic. Or we can re-sort the OID list returned by find_all_inheritors() to match it. However, I'm not sure that we need to make COPY TO results deterministic in the first place. It's not guaranteed that the order of tuples returned from 'COPY TO rel' where rel is not a partitioned table is sorted nor even deterministic (e.g., due to sync scans). If 'rel' is a partitioned table without RLS, the order of tables to scan is deterministic but returned tuples within a single partition is not. Given that sorting the whole results is not cost free, I'm not sure that guaranteeing this ordering also for partitioned tables with RLS would be useful for users. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: