On Mon, Apr 20, 2026 at 2:28 PM Lakshmi N <lakshmin.jhs@gmail.com> wrote:
>
> Hi Hackers,
>
> The PG19-optimized slot catchup query uses a CTE that filters on
> invalidation_reason IS NULL, then cross-joins it with the main slot
> query. When ALL logical slots in a database are invalid, the CTE
> returns zero rows, and the cross join produces an empty result set.
> This causes pg_upgrade to silently skip those slots entirely --
> neither detecting them as invalid (which should block the upgrade)
> nor attempting to migrate them.
>
> The pre-PG19 query path does not have this problem because it queries
> pg_replication_slots directly without a cross join. This may not impact
> upgrade to PG19 but will change the behavior for PG20 upgrade.
>
> Fix by changing the cross join to a LEFT JOIN,
> so that invalid slots still appear in the result set with NULL
> caught_up values.
>
I agree with the problem here.
Another way to solve this would be using a scalar subquery(see [1]),
but that would reduce readability. Thus, I prefer a LEFT OUTER JOIN on
TRUE here. There should also be no performance impact, since the
right-hand side query will always return at most one row due to the
LIMIT 1 clause. So IMO, the proposed solution is good. Copying
Sawada-san, as he was the author of the original patch.
[1]:
SELECT slot_name, plugin, two_phase, failover,
CASE
WHEN invalidation_reason IS NOT NULL THEN FALSE
ELSE (
(SELECT last_pending_wal FROM check_caught_up) IS NULL
OR confirmed_flush_lsn > (SELECT last_pending_wal FROM check_caught_up)
)
END as caught_up,
invalidation_reason IS NOT NULL as invalid
FROM pg_catalog.pg_replication_slots
WHERE slot_type = 'logical'
AND database = current_database()
AND temporary IS FALSE;
thanks
Shveta