On Thu, 9 Jan 2025 at 20:39, Tom Lane <
tgl@sss.pgh.pa.us> wrote:
> Yeah. That prevents pushing down the join condition, since in a FULL join all
> rows of both sides will contribute to the result; none can be removed ahead of
> the join.
> Not sure if this pattern is common enough to be worth trying to implement such
> an optimization.
I'm not going to pursue this much further, however for those reading along/
future reference, what we're trying to do is:
- for a given target table
- merge a subset of the data using a source table (only update any significant
changes). The subset filter `WHERE device_id=$1` is applied to both the JOIN
as well as the NOT MATCHED BY SOURCE part of the merge.
- when using MERGE ... NOT MATCHED BY SOURCE THEN DELETE currently (pg17) reads
the whole of the target table before applying the subset filter. As we plan to
merge only very small subsets (1/10,000 or so), this means that for now, this
isn't usable for that use case.
For example, for a monitoring system, we have 10's of thousands of remote
systems, for which we want to merge a snapshot of their current state inside a
central target table. That use case is currently not well supported with MERGE.
(The workaround is a DELETE inside a CTE).
I would expect this use case to be quite common, however, as this feature is
only available in PG17, it may not be used yet with MERGE, so I don't expect
any others to voice the same concern.