Re: enhance the efficiency of migrating particularly large tables - Mailing list pgsql-hackers

From David Rowley
Subject Re: enhance the efficiency of migrating particularly large tables
Date
Msg-id CAApHDvoxzRjqau9JzeqqyJK5C2bpmL3HVE-dxTXLApuiAVQ8gA@mail.gmail.com
In response to Re: enhance the efficiency of migrating particularly large tables  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Tue, 9 Apr 2024 at 11:02, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Rowley <dgrowleyml@gmail.com> writes:
> > Unsure if such a feature is worthwhile. I think maybe not for just
> > min(ctid)/max(ctid). However, there could be other reasons, such as
> > the transform OR to UNION stuff that Tom worked on a few years ago.
> > That needed to eliminate duplicate rows that matched both OR branches
> > and that was done using ctid.
>
> I'm kind of allergic to adding features that fundamentally depend on
> ctid, seeing that there's so much activity around non-heap table AMs
> that may not have any such concept, or may have a row ID that looks
> totally different.  (That's one reason why I stopped working on that
> OR-to-UNION patch.)

I understand that point of view. However, I think that if we were to
maintain it as a policy, we'd likely miss out on various optimisations
that future AMs could provide.

When I pushed TID Range Scans a few years ago, I added "amflags", and
we have AMFLAG_HAS_TID_RANGE so the planner can check that the AM
supports TID range scans before adding the Path.
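
As a rough illustration of that capability-flag pattern, here is a minimal, self-contained C sketch. It is not the PostgreSQL source: every name prefixed with Sketch or sketch_, the two example AMs, and the single scan_tid_range callback are hypothetical stand-ins. The sketch assumes the flag is derived once from whether the AM supplies the relevant callback, and that the planner-side code consults only the flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bit, modelled on the idea of AMFLAG_HAS_TID_RANGE. */
#define SKETCH_AMFLAG_HAS_TID_RANGE (1u << 0)

/* Hypothetical, cut-down stand-in for a table AM's callback table. */
typedef struct SketchTableAm
{
    const char *name;
    /* NULL when the AM has no notion of scanning a range of TIDs. */
    bool (*scan_tid_range)(uint32_t first_block, uint32_t last_block);
} SketchTableAm;

/* Hypothetical stand-in for the planner's per-relation information. */
typedef struct SketchRelInfo
{
    const SketchTableAm *am;
    uint32_t amflags;       /* capability bits derived from the AM */
} SketchRelInfo;

/* Placeholder callback: only reports whether the block range is valid. */
static bool
heap_like_scan_tid_range(uint32_t first_block, uint32_t last_block)
{
    return first_block <= last_block;
}

/* One AM that supports TID range scans, and one that does not. */
static const SketchTableAm heap_like_am = {"heap_like", heap_like_scan_tid_range};
static const SketchTableAm no_tid_am = {"no_tid", NULL};

/* Derive the capability flags once, when relation info is collected. */
static void
sketch_fill_rel_info(SketchRelInfo *rel, const SketchTableAm *am)
{
    rel->am = am;
    rel->amflags = 0;
    if (am->scan_tid_range != NULL)
        rel->amflags |= SKETCH_AMFLAG_HAS_TID_RANGE;
}

/* Planner-side gate: only consider the path if the AM advertises support. */
static bool
sketch_can_add_tid_range_path(const SketchRelInfo *rel)
{
    return (rel->amflags & SKETCH_AMFLAG_HAS_TID_RANGE) != 0;
}
```

With this shape, an AM that has no ctid-like concept simply leaves the callback unset and never sees the optimisation, while AMs that do support it are not held back.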

Anyway, I'm not saying let's do the non-sync scan SeqScanPath thing.
I'm just saying that blocking optimisations because some future AM
might not support them could mean we miss out on some great speedups.

David


