Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: speedup COPY TO for partitioned table. |
Date | |
Msg-id | CAD21AoCW4pmwk3f0UJ_F5XJR_FDHrmsWNaFr43dYqgR0SU1BvQ@mail.gmail.com Whole thread Raw |
In response to | Re: speedup COPY TO for partitioned table. (jian he <jian.universality@gmail.com>) |
Responses |
Re: speedup COPY TO for partitioned table.
|
List | pgsql-hackers |
On Tue, Oct 14, 2025 at 6:53 PM jian he <jian.universality@gmail.com> wrote: > > On Tue, Oct 14, 2025 at 4:08 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > +-- Test COPY TO with a foreign table or when the foreign table is a partition > > > > +COPY async_p3 TO stdout; --error > > > > +ERROR: cannot copy from foreign table "async_p3" > > > > +HINT: Try the COPY (SELECT ...) TO variant. > > > > > > > > async_p3 is a foreign table so it seems not related to this patch. > > > > > > > I replied in > > > https://postgr.es/m/CACJufxGkkMtRUJEbLczRnWp7x2YWqu4r1gEJEv9Po1UPxS6kGQ%40mail.gmail.com > > > I kind of doubt anyone would submit a patch just to rewrite a coverage test for > > > the sake of coverage itself. While we're here, adding nearby coverage tests > > > should be fine? > > > > For me, it's perfectly fine to have patches just for improving the > > test coverage and I think we have had such patches ever. Given this > > patch expands the supported relation kind, I guess it makes sense to > > cover other cases as well in this patch (i.e., foreign tables and > > sequences) or to have a separate patch to increase the overall test > > coverage of copyto.c. > > > > Let's have a seperate patch to handle COPY test coverage. > > > > > > > i just found out I ignored the case when partitioned tables have RLS. > > > when exporting a partitioned table, > > > find_all_inheritors will sort the returned partition by oid. > > > in DoCopy, we can do the same: > > > make a SortBy node for SelectStmt->sortClause also mark the > > > RangeVar->inh as true. > > > OR > > > ereport(ERRCODE_FEATURE_NOT_SUPPORTED...) for partitioned tables with RLS. > > > > > > please see the change I made in DoCopy. > > > > Good catch. However, I guess adding a SortBy node with "tableoid" > > doesn't necessarily work in the same way as the 'COPY TO' using > > find_all_inheritors(): > > > > + /* > > + * To COPY data from multiple partitions, we rely on the order of > > + * the partitions' tableoids, which matches the order produced by > > + * find_all_inheritors. > > + */ > > > > The table list returned by find_all_inheritors() is deterministic, but > > it doesn't sort the whole list by their OIDs. If I understand > > correctly, it retrieves all descendant tables in a BFS order. For > > example, if I create the tables in the following sequence: > > > > create table p (i int) partition by list (i); > > create table c12 partition of p for values in (1, 2) partition by list (i); > > create table c12_1 partition of c12 for values in (1); > > create table c12_2 partition of c12 for values in (2); > > create table c3 partition of p for values in (3); > > insert into p values (1), (2), (3); > > alter table p enable row level security; > > create policy policy_p on p using (i > 0); > > create user test_user; > > grant select on table p to test_user; > > > > I got the result without RLS: > > > > copy p to stdout; > > 3 > > 1 > > 2 > > > > whereas I got the results with RLS: > > > > copy p to stdout; > > 1 > > 2 > > 3 > > > > I think that adding SortBy doesn't help more than making the results > > deterministic. Or we can re-sort the OID list returned by > > find_all_inheritors() to match it. However, I'm not sure that we need > > to make COPY TO results deterministic in the first place. It's not > > guaranteed that the order of tuples returned from 'COPY TO rel' where > > rel is not a partitioned table is sorted nor even deterministic (e.g., > > due to sync scans). If 'rel' is a partitioned table without RLS, the > > order of tables to scan is deterministic but returned tuples within a > > single partition is not. Given that sorting the whole results is not > > cost free, I'm not sure that guaranteeing this ordering also for > > partitioned tables with RLS would be useful for users. > > > > I removed the "SortBy node", and also double checked the patch again. > Please check the attached v18. Thank you for updating the patch! I've reviewed the patch and here is one review comment: from->inh = false; /* apply ONLY */ + if (get_rel_relkind(relid) == RELKIND_PARTITIONED_TABLE) + from->inh = true; It's better to check rel->rd_rel->relkind instead of calling get_rel_relkind() as it checks syscache. I've attached a patch to fix the above and includes some cosmetic changes. Please review it. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-hackers by date: