Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers

From torikoshia
Subject Re: speedup COPY TO for partitioned table.
Date
Msg-id a9ee6d25fc6837b68082cec923b2c081@oss.nttdata.com
Whole thread Raw
In response to Re: speedup COPY TO for partitioned table.  (jian he <jian.universality@gmail.com>)
List pgsql-hackers
On 2025-07-02 13:10, jian he wrote:
Thanks for updating the patch.

> now it is:
> /*
>  * Scan a single table (which may be a partition) and export its rows
> to the
>  * COPY destination.

Based on the explanations in the glossary, should 'parition' be
partitioned table/relation?

|  -- https://www.postgresql.org/docs/devel/glossary.html
|  partition: One of several disjoint (not overlapping) subsets of a
larger set
|  Partitioned table(relation): A relation that is in semantic terms the
same as a table, but whose storage is distributed across several
partitions

Also, the terms "table" and "relation" seem to be used somewhat
interchangeably in this patch.
For consistency, perhaps it's better to pick one term and use it
consistently throughout the comments.

249 + * root_rel: if not NULL, it indicates that we are copying
partitioned relation
270 +    * exporting partitioned table data here, we must convert it
back to the

>  *
>  * rel: the relation from which the actual data will be copied.
>  * root_rel: if not NULL, it indicates that we are copying partitioned
> relation
>  * data to the destination, and "rel" is the partition of "root_rel".
>  * processed: number of tuples processed.
> */
> static void
> CopyRelTo(CopyToState cstate, Relation rel, Relation root_rel,
>           uint64 *processed)
>
>
> > Also, regarding the function name CopyThisRelTo() — I wonder if the
> > "This" is really necessary?
> > Maybe something simpler like CopyRelTo() is enough.
> >
> > What do you think?
> >
> sure. CopyRelTo looks good to me.
>
> while at it.
> I found that in BeginCopyTo:
>             ereport(ERROR,
>                     (errcode(ERRCODE_WRONG_OBJECT_TYPE),
>                      errmsg("cannot copy from foreign table \"%s\"",
>                             RelationGetRelationName(rel)),
>                      errhint("Try the COPY (SELECT ...) TO
> variant.")));
>
>                     ereport(ERROR,
>                             errcode(ERRCODE_WRONG_OBJECT_TYPE),
>                             errmsg("cannot copy from foreign table
> \"%s\"", relation_name),
>                             errdetail("Partition \"%s\" is a foreign
> table in the partitioned table \"%s\"",
>                                       relation_name,
> RelationGetRelationName(rel)),
>                             errhint("Try the COPY (SELECT ...) TO
> variant."));
>
> don't have any regress tests on it.

Hmm, I agree there are no regression tests for this, but is it about
copying foreign table, isn't it?

Since this patch is primarily about supporting COPY on partitioned
tables, I’m not sure adding regression tests for foreign tables is in
scope here.
It might be better handled in a follow-up patch focused on improving
test coverage for such unsupported cases, if we decide that's
worthwhile.



--https://coverage.postgresql.org/src/backend/commands/copyto.c.gcov.html
   670           0 :         else if (rel->rd_rel->relkind ==
RELKIND_SEQUENCE)
   671           0 :             ereport(ERROR,
   672             :
(errcode(ERRCODE_WRONG_OBJECT_TYPE),
   673             :                      errmsg("cannot copy from
sequence \"%s\"",
   674             :
RelationGetRelationName(rel))));

Also, I’m not entirely sure how much value such tests would bring,
especially if the error paths are straightforward and unlikely to
regress.


Regarding performance: it's already confirmed that COPY
partitioned_table performs better than COPY (SELECT * FROM
partitioned_table) as expected [1].
I was a bit curious, though, whether this patch might introduce any
performance regression when copying a regular (non-partitioned) table.
To check this, I ran a simple benchmark and did not observe any
degradation.
To minimize I/O overhead, I used a tmpfs mount:

   % mkdir /tmp/mem
   % sudo mount_tmpfs -s 500M /tmp/mem
   % pgbench -i

Then I ran the following command several times on both patched and
unpatched builds:
   =# COPY pgbench_accounts TO '/tmp/mem/accounts';


[1]
https://www.postgresql.org/message-id/174219852967.294107.6195385625494034792.pgcf%40coridan.postgresql.org


--
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA Japan Corporation to SRA OSS K.K.



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Make wal_receiver_timeout configurable per subscription
Next
From: Álvaro Herrera
Date:
Subject: Re: Missing NULL check after calling ecpg_strdup