Re: speedup COPY TO for partitioned table. - Mailing list pgsql-hackers

From David Rowley
Subject Re: speedup COPY TO for partitioned table.
Date
Msg-id CAApHDvqPGoqvofphWmTp8EFMYa2KVg9ZGCsQAF6qp7ZHAsSd5g@mail.gmail.com
Whole thread Raw
In response to Re: speedup COPY TO for partitioned table.  (Melih Mutlu <m.melihmutlu@gmail.com>)
List pgsql-hackers
On Tue, 11 Feb 2025 at 08:10, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
> jian he <jian.universality@gmail.com>, 27 Oca 2025 Pzt, 04:47 tarihinde şunu yazdı:
>> so the find_all_inheritors output order is deterministic?
>
> You're right that order in find_all_inheritors is deterministic. But it's not always the same with the order of
SELECToutput. You can quickly see what I mean by running a slightly modified version of the example that you shared in
yourfirst email: 

I think it's fine to raise the question as to whether the order
changing matters, however, I don't personally think there should be
any concerns with this. The main reason I think this is that the
command isn't the same, so the user shouldn't expect the same
behaviour. They'll need to adjust their commands to get the new
behaviour and possibly a different order.

Another few reasons are:

1) In the subquery version, the Append children are sorted by cost, so
the order isn't that predictable in the first place. (See
create_append_path() -> list_sort(subpaths,
append_total_cost_compare))
2) The order tuples are copied with COPY TO on non-partitioned tables
isn't that well defined in the first place. Two reasons for this, a)
the heap is a heap and has no defined order; and b) sync scans might
be used and the scan might start at any point in the heap and circle
back around again to the page prior to the page where the scan
started. See (table_beginscan() adds SO_ALLOW_SYNC to the flags).

I think the main thing to be concerned about regarding order is to
ensure that all rows from the same partition are copied consecutively,
and that does not seem to be at risk of changing here. This is
important as 3592e0ff9 added caching of the last found partition when
partition lookups continually find the same partition.

David



pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: describe special values in GUC descriptions more consistently
Next
From: Erik Rijkers
Date:
Subject: Re: 2025-02-13 release announcement draft